System and pre-defined variables

Pre-defined Variables

Pre-defined variables are assigned a value at runtime (for example, the job ID) and can be used in conjunction with other variables within the flow; see the usage example after the table below.

_CLUSTER_ID_S3_ESCAPED: The S3 escaped ID of the cluster on which the job is running.
_CLUSTER_ID: The ID of the cluster on which the job is running.
_CLUSTER_NODES_COUNT: The number of nodes in the cluster that executes the job.
_JOB_ID: Xplenty job identifier. For tasks executed from within a workflow, the _JOB_ID format is NNNNNN_component_abcdef.
_JOB_ID_S3_ESCAPED: S3 escaped Xplenty job identifier.
_JOB_SUBMISSION_TIMESTAMP: ISO-8601 date-time value of the time the job was submitted, in UTC. For example: 2013-04-22T14:18:17Z
_JOB_SUBMISSION_TIMESTAMP_S3_ESCAPED: S3 escaped date-time value of the time the job was submitted, in UTC. For example: 2013-01-09T14-52-21Z
_JOB_SUBMITTER_EMAIL: The email address of the user who submitted the job. For example: helpdesk@xplenty.com
_JOB_SUBMITTER_EMAIL_S3_ESCAPED: The S3 escaped email address of the user who submitted the job. For example: helpdesk-xplenty-com
_PACKAGE_OWNER_EMAIL: The email address of the user who owns the package.
_PACKAGE_OWNER_EMAIL_S3_ESCAPED: The S3 escaped email address of the user who owns the package.
_ACCOUNT_ID: The internal ID of the account under which the package and job were created.
_ACCOUNT_ID_S3_ESCAPED: The S3 escaped internal ID of the account under which the package and job were created.
_ACCOUNT_NAME: The name of the account under which the package and job were created.
_ACCOUNT_NAME_S3_ESCAPED: The S3 escaped name of the account under which the package and job were created.
_PACKAGE_LAST_SUCCESSFUL_JOB_SUBMISSION_TIMESTAMP: The timestamp (in ISO-8601 date-time format) of the last successful job for the same package, in UTC. In incremental loads, this variable can be used to read only data that is newer than the previously executed job.
_PARENT_JOB_ID: Xplenty job identifier of the parent job (the workflow job executing the current task).
_PARENT_JOB_ID_S3_ESCAPED: S3 escaped Xplenty job identifier of the parent job.
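
For instance, an incremental load can use _PACKAGE_LAST_SUCCESSFUL_JOB_SUBMISSION_TIMESTAMP in a database source query, and _JOB_ID_S3_ESCAPED in a file storage destination path. The snippet below is a minimal sketch: the table, column, and bucket names are placeholders, and it assumes variables are referenced with a leading $ sign as described in Using and setting variables in your packages.

    -- database source query: read only rows changed since the last successful job
    -- (orders and updated_at are placeholder names)
    SELECT id, status, updated_at
    FROM orders
    WHERE updated_at > '$_PACKAGE_LAST_SUCCESSFUL_JOB_SUBMISSION_TIMESTAMP'

    -- file storage destination path: keep each job's output under its own S3-safe prefix
    -- (my-bucket is a placeholder)
    s3://my-bucket/exports/$_JOB_ID_S3_ESCAPED/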

System Variables

System variables change the behaviour of the package; see the example overrides after the table below.

_ADWORDS_API_MAX_INPUT_SPLITS: Maximum number of concurrent Google Adwords requests.
_ADWORDS_API_REQUEST_READ_TIMEOUT: Request timeout (in milliseconds) for Google Adwords source components.
_ADWORDS_API_SKIP_BAD_ACCOUNTS: Set to true to have a package complete successfully when an Adwords customer ID is inaccessible (with Google Adwords source components).
_BQ_READER_MAX_SHARDS: Set to 1 if Google BigQuery returns the error: Exporting to multiple wildcard URIs...
_BQ_READER_POLL_INTERVAL: Sets the interval in milliseconds between retries when polling data export from Google BigQuery.
_BQ_READER_POLL_RETRIES: Controls the number of retries when polling data export from Google BigQuery.
_BYTES_PER_REDUCER: The amount of data, in bytes, to be processed by a single reduce task. Used to calculate the number of reducers when _DEFAULT_PARALLELISM is 0.
_CACHED_BAG_MEMORY_PERCENT: Percentage of the heap allocated for all bags in a map or reduce task. When this amount is filled, data is spilled to disk. A higher value reduces spills to disk but increases the likelihood of running out of heap memory.
_COPY_TARGET_PARTITIONS: Controls how many partitions the data is divided into by the copy pre-process action. Setting this variable's value to 0 forces the process not to merge files.
_COPY_TARGET_SIZE: Controls the maximum size per file in a partition for files that are concatenated by the copy pre-process action.
_COPY_PARALLELISM: Controls how many processes are used in the copy pre-process action.
_DEFAULT_TIMEZONE: Default time zone for date-time datatype fields.
_DEFAULT_PARALLELISM: Sets the default number of parallel reduce tasks to use in the package. Generally speaking, the number of reducers depends on the size of your data and its distribution. If your data is relatively big but skewed (for example, when you aggregate by a field and most records fall into one group), adding more reducers will not improve performance. The default value is 0, which means that the number of reducers is calculated from _BYTES_PER_REDUCER.
_FB_ASYNC_REPORT_TIMEOUT: Request timeout in seconds for a Facebook Ads Insights source async report request (per attempt). If this value is exceeded, the attempt fails (default: no timeout).
_FACEBOOK_ADS_INSIGHTS_SLEEP: Interval (ms) between retry attempts when trying to get a Facebook Ads Insights report (default: 0).
_FS_IGNORE_MISSING_INPUT_EXCEPTIONS: Set to true to have the package complete successfully when no input is found in the source path (with file storage source components).
_FS_SFTP_BLOCK_SIZE: Determines the size of the block read by a task from SFTP. Change to a value greater than the default when reading large files and the SFTP server doesn't allow reading a file starting at an offset greater than 0.
_FS_SFTP_MAX_RETRIES: Number of retry attempts when trying to find files or directories in SFTP (default: 5).
_FS_SFTP_RETRY_SLEEP: Interval (ms) between retry attempts when trying to find files or directories in SFTP (default: 500).
_GA_API_MAX_INPUT_SPLITS: Maximum number of concurrent Google Analytics requests.
_GA_API_SKIP_BAD_PROFILES: Set to true to have a package complete successfully when a Google Analytics profile ID is inaccessible.
_GA_API_REQUEST_MAX_RESULTS: Maximum results per page for Google Analytics source components.
_GA_API_REQUEST_READ_TIMEOUT: Request timeout (in milliseconds) for Google Analytics source components.
_HTTP_FOLLOW_REDIRECTS: Set to false if you would like the REST API source or Curl functions not to follow redirect status codes.
_HTTP_REQUEST_MAX_RETRIES: Number of retries the REST API source or Curl functions attempt when receiving response code 429 or 5xx before throwing an exception.
_JDBC_SPLIT_QUERY_RETRIES: Number of retry attempts to get the min and max values for the key in database source parallel queries.
_JDBC_SPLIT_QUERY_RETRIES_INTERVAL_IN_SEC: Interval (in seconds) to wait between retry attempts to get the min and max values for the key in database source parallel queries.
_INTERMEDIATE_COMPRESSION: Compression for intermediate results. Defaults to false.
_LINE_RECORD_READER_MAX_LENGTH: Maximum length, in bytes, for lines read from files. Lines longer than this value will be discarded.
_MAP_MAX_ATTEMPTS: Number of times to try to execute a map task before failing the job.
_MAP_MAX_FAILURES_PERCENT: Controls the maximum percentage of map tasks that are allowed to fail without triggering job failure. The value range is 0-100.
_MAP_TASK_TIMEOUT: Number of milliseconds before a task is killed if it doesn't update its status.
_MAX_COMBINED_SPLIT_SIZE: Amount of data, in bytes, to be processed by a single task. Smaller files are combined until this size is reached. Larger files are split if they are uncompressed or compressed using Bzip2.
_PARQUET_BLOCK_SIZE: Size of a row group being buffered in memory for Apache Parquet.
_PARQUET_COMPRESSION: Compression type for Apache Parquet. Available values are: UNCOMPRESSED, GZIP, SNAPPY.
_PARQUET_PAGE_SIZE: Page size for Apache Parquet compression.
_REDUCER_MAX_ATTEMPTS: Number of times to try to execute a reduce task before failing the job.
_REDUCER_MAX_FAILURES_PERCENT: Maximum percentage of reduce tasks that are allowed to fail without triggering job failure. The value range is 0-100.
_SHUFFLE_INPUT_BUFFER_PERCENT: The percentage of memory, relative to the maximum heap size, allocated to storing map outputs during the shuffle.
_SPANNER_BATCH_SIZE: Determines the batch size in the Google Spanner destination. The default batch size is 100.
_SQL_COMMAND_TIMEOUT_IN_SEC: Command timeout (in seconds) for SQL statements before failing/retrying. By default, SQL commands do not time out.
_SYNC_WAIT_TIME: Time in seconds to wait between staging data for an Amazon Redshift destination and executing COPY on the Redshift cluster.
_TIMEOUT_IN_SECONDS: Number of seconds after which the job will be stopped. Allows you to automatically stop the job if it's taking too long. By default, no timeout is set (the job will run indefinitely). Note that the timeout is defined at the job level, not at the package level.
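
As an illustration, a package that reads many small files and writes Parquet output might override a few of these defaults in the package variables dialog. The values below are illustrative assumptions, not recommendations; string values are shown single-quoted, as variable values are expressions.

    _MAX_COMBINED_SPLIT_SIZE   268435456            -- combine small input files up to 256 MB per task
    _PARQUET_COMPRESSION       'SNAPPY'             -- compress Parquet row groups with Snappy
    _DEFAULT_TIMEZONE          'America/New_York'   -- interpret date-time fields in this time zone
    _TIMEOUT_IN_SECONDS        7200                 -- stop the job automatically after 2 hours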
