Integrate.io ETL's packages

Packages define the data flow using components that specify the data to process, the data manipulation to perform, and the output destinations. Each package requires at least one source and one destination.
Once you define a package, you can verify it, and, as in any development lifecycle, fix any errors and re-verify until the package is ready to run as a job on a cluster.
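The source-and-destination rule and the verify, fix, re-verify loop can be pictured with a minimal sketch. This is not Integrate.io ETL's API or its actual validation logic: `Component`, `verify_package`, and the `kind` labels are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Component:
    # Hypothetical model of a package component (not a product type).
    name: str
    kind: str  # "source", "transformation", or "destination"


def verify_package(components):
    """Return a list of error messages; an empty list means the check passed."""
    errors = []
    kinds = [c.kind for c in components]
    if "source" not in kinds:
        errors.append("package needs at least one source")
    if "destination" not in kinds:
        errors.append("package needs at least one destination")
    return errors


# First draft: no destination yet, so verification reports an error.
draft = [Component("orders", "source"), Component("dedupe", "transformation")]
draft_errors = verify_package(draft)

# Fix the error and re-verify before running the package as a job.
fixed = draft + [Component("warehouse", "destination")]
fixed_errors = verify_package(fixed)
```

Here `draft_errors` holds one message about the missing destination, and `fixed_errors` is empty once a destination is added.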

Click the following links for information on using packages:

Creating a new package
Creating a new package from a template
Working in the package designer
Using and setting variables in your packages
Validating a package
Using pattern matching in source component paths
Using ISO 8601 date/time functions
Using functions in components

Components:

Component	Description
Amazon Redshift Source	Read data from an Amazon Redshift table or view, or by using a query.
Bing Ads Source	Read Bing Ads report data.
Database Source	Read data from a database table or view, or by using a query.
Facebook Ads Insights Source	Read Facebook Ads Insights reports data.
File Storage Source	Read data stored in one or more files in object stores such as Amazon S3, Google Cloud Storage or Azure Blob Storage, or on file servers such as SFTP.
Google Ads Source	Read Google Ads report data.
Google Analytics Source	Read Google Analytics report data.
Google Analytics (GA4) Source	Read Google Analytics 4 (GA4) report data.
Google BigQuery Source	Read data from a Google BigQuery table or by using a query.
Google Cloud Spanner Source	Read data from a Google Cloud Spanner table or by using a query.
MongoDB Source	Read data stored in a MongoDB collection.
NetSuite Source	Read NetSuite standard and custom records (tables) using the NetSuite JDBC drivers (SuiteAnalytics Connect).
Salesforce Source	Read Salesforce Sales Cloud standard and custom objects using the Bulk API.
Rest API Source	Read data from HTTP endpoints such as REST web services. Use the Rest API source component to define the authentication method, request parameters and response fields to use in the package.
Aggregate Transformation	Use the Aggregate transformation to group the input dataset by one or more fields and apply aggregate functions such as Count, Average, Minimum and Maximum.
Assert Transformation	Use the Assert transformation to make sure that all data in the source complies with the conditions you specify in the component. If a record does not comply, the job fails and a message is added to the error log.
Clone Transformation	Use the Clone component to split a dataflow into two dataflows in order to apply multiple transformations to the same data.
Cross Join Transformation	Use the Cross Join transformation to combine records from two different inputs. The cross join returns the Cartesian product of records from the two inputs. That is, it will produce records that combine each record from the left input with each record from the right input.
Distinct Transformation	Use the Distinct transformation to filter out duplicate records that have the same values in all fields, leaving only unique records. For example, you might need to filter out users' double-clicks in events.
Filter Transformation	Use the Filter transformation to filter input data by defining conditions that must be met by records in the input.
Join Transformation	Use the Join transformation to combine records from two different inputs. The join component can be used to add information from one data source to another data source or to filter data that exists in both data sources or exists in only one of them.
Limit Transformation	Use the Limit transformation to limit the number of records in the output for the entire dataset or per partition or group within the data set.
Rank Transformation	Use the Rank component to sort input data by one or more fields, in an ascending or descending order and add a rank field that reflects the sort order.
Select Transformation	Use the Select transformation to choose which fields from the input will be available in the next component and transform them using expressions in order to parse input data, enrich it, extract information from it or manipulate it.
Sort Transformation	Use the Sort component to sort input data by one or more fields, in an ascending or descending order.
Union Transformation	Use the Union transformation to combine records from two inputs with the same schema (same fields and data types).
Window Transformation	Use the Window component to apply window functions to incoming data, similar to window functions in SQL. These functions let you rank or distribute data, provide moving averages, running totals and other useful data. The output of the Window component contains all records and fields from the input data flow with the addition of the calculated window functions.
Sample Transformation	Use the Sample component to return a percentage of random records from the input.
Cube Transformation	Use the Cube and Rollup component to group the input dataset by combinations of fields and apply aggregate functions such as Count, Average, Minimum and Maximum.
Amazon Redshift Destination	Use the Amazon Redshift destination component to store the output of a data flow in an Amazon Redshift table.
Database Destination	Use the database destination component to store the output of a data flow in a relational database table.
File Storage Destination	Use the File storage destination component to store the output of a data flow into files in a designated directory on a file server (SFTP, HDFS) or object store (Amazon S3, Google Cloud Storage, Azure Blob Storage).
Google BigQuery Destination	Use the Google BigQuery destination component to store the output of a data flow in a BigQuery table.
Google Spanner Destination	Use the Google Spanner destination component to store the output of a data flow in a Google Spanner table.
MongoDB Destination	Use the MongoDB destination component to store the output of a data flow in a MongoDB collection.
Salesforce Destination	Use the Salesforce destination component to store the output of a data flow in a Salesforce Sales Cloud object.
Snowflake Destination	Use the Snowflake destination component to store the output of a data flow in a Snowflake table.
Salesforce SOAP Destination	Use the Salesforce SOAP destination component to store the output of a data flow in a Salesforce Sales Cloud object using a Salesforce SOAP connection.
NetSuite SOAP Destination	Use the NetSuite SOAP destination component to store the output of a data flow in a NetSuite cloud object using a NetSuite SOAP connection.
Facebook Ads Destination	Use the Facebook Ads destination component to store the output of a data flow in a Facebook Ads cloud object.
Google Ads Destination	Use the Google Ads destination component to store the output of a data flow in a Google Ads cloud object.
TikTok Ads Destination	Use the TikTok Ads destination component to store the output of a data flow in a TikTok Ads cloud object.
HubSpot Destination	Use the HubSpot destination component to store the output of a data flow in a HubSpot cloud object.
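
To make a few of the transformation descriptions in the table concrete, the following is a minimal sketch in plain Python of what the Cross Join, Distinct and Rank transformations do to a set of records. It illustrates the semantics only and is not how Integrate.io ETL implements them. The function names and sample records are hypothetical, records are assumed to be dictionaries with hashable values, and the tie handling in `rank` (tied records get consecutive ranks) is an assumption the table does not specify.

```python
from itertools import product


def cross_join(left, right):
    """Cartesian product: each left record combined with each right record."""
    return [{**l, **r} for l, r in product(left, right)]


def distinct(records):
    """Keep only the first of any records with the same values in all fields."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(sorted(rec.items()))  # assumes hashable field values
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def rank(records, field, descending=False):
    """Sort by one field and add a 'rank' field that reflects the sort order."""
    ordered = sorted(records, key=lambda r: r[field], reverse=descending)
    return [{**rec, "rank": i} for i, rec in enumerate(ordered, start=1)]


users = [{"user": "ann"}, {"user": "bob"}]
plans = [{"plan": "free"}, {"plan": "pro"}]
pairs = cross_join(users, plans)  # 2 x 2 = 4 combined records

# A double-click shows up as two identical event records.
clicks = [
    {"user": "ann", "page": "/home"},
    {"user": "ann", "page": "/home"},
    {"user": "bob", "page": "/home"},
]
unique_clicks = distinct(clicks)

scores = [{"user": "ann", "score": 7}, {"user": "bob", "score": 9}]
ranked = rank(scores, "score", descending=True)
```

With this sample data, `pairs` contains every user and plan combination, `unique_clicks` drops the duplicated click, and `ranked` lists the highest score first with `rank` set to 1.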
