Using components: Amazon Redshift Destination

Use the Amazon Redshift Destination component to store the output of your package in Amazon Redshift. This requires that you have an existing Redshift cluster in your AWS account.

The Redshift Destination component is always the last component in a package.

To define the Redshift destination:

  1. Add a Redshift Destination component at the end of your dataflow.
  2. Open the component and name it.

Destination settings

  1. Define the parameters for connecting to your Redshift as follows (for more information, see Allow Xplenty access to my Redshift cluster):
    • redshift connection - either click the drop-down arrow and select an existing connection, or click create new to create a new connection.
    • target schema - the target table's schema. If empty, the default schema is used.
    • target table - the name of the target table in your Redshift cluster.
    • create table - automatically create the table defined in the component if it doesn't exist. See below for more information.
    • add columns - automatically add columns that are mapped in the destination and do not exist in the table. See below for more information. 
    • date and time formats - the date and time formats used when loading string fields in Xplenty into date and time columns in a Redshift table. For more information on the formats, see Amazon's documentation about the COPY statement and date and time formats.
    • replacement character - When set to a character, enables loading of data into VARCHAR columns even if the data contains invalid UTF-8 characters. Each invalid UTF-8 character in the input string is replaced with the replacement character (see ACCEPTINVCHARS in Redshift's COPY documentation). Note that 0x00(NUL) characters are automatically removed by Xplenty.
    • max errors - if loading data into Redshift returns fewer errors than this value, it continues without failing.
    • null string - string fields that match this value will be replaced with NULL.
    • compression - By default (automatic), data inserted into an empty target table will be compressed only if the table columns have RAW encoding or not encoding. If you select On, data inserted into an empty target table will be compressed regardless of existing column encoding. If you select Off, automatic compression is disabled. Refer here for more information.
    • compression rows - compression sample rate. default sample rate is 100,000 records.
    • intermediate compression - Data may be stored in Amazon S3 prior to loading it into Redshift. Select whether to compress the data before storing it to Amazon S3 or not. You may gain performance by compressing the data if it's relatively large and your process is not CPU intensive, or otherwise.
    • empty as null - loads empty string fields as NULL.
    • blank as null - loads blank fields (consisting only of white space) as NULL.
    • explicit ids - use this if your target table has an IDENTITY column and you wish to override the auto-generated values with values from the dataflow.
    • truncate columns - truncates string values in order for them to fit in the target column specification.
    • round decimals - rounds up numeric values whose scale exceeds the scale of the target column.
    • Operation type - the method of data insertion
      • insert - default behaviour. Data will only be appended to the target table.
      • truncate and insert - truncate the target table before the data flow executes.
      • delete and insert - deletes all of the target table before the data flow executes. If a truncate statement can't be executed on the target table due to permissions or other constraints, you can use this instead.
      • merge - data will be inserted and/or updated in the target table according to the defined key(s). See details of the merge process below.
        Note - It's important to select the proper key columns to use in order to merge data correctly.
    • pre action sql - sql code to execute before inserting the data into the target table. If a merge operation is selected, the sql code is executed before the staging table is created.
    • post action sql - sql code to execute after inserting the data into the target table. If a merge operation is selected, the sql code is executed after the staging table is merged into the target table.
    • Then click the Test Connection button to help check that the connection is good and that the target table exists.

      Note: If you did not select 'empty as null' , 'blank as null' or use the 'null string' option - string fields in the target table will contain empty strings instead of null values.

  2. Map fields from your dataflow to columns in the target table. You can click the green Auto-fill fields icon to automatically populate the list of fields and column names.
  3. If merge operation type is used, select the proper key columns. If append operation type is used, key selection can be ignored.

Automatically creating and altering destination table

Xplenty can automatically create the destination table for you if it doesn't exist and it can also append columns to an existing table. If you define columns as key (regardless of the operation type), Xplenty defines them as the sort key in a new table. The data types in Xplenty are automatically mapped as follows. Note that since Xplenty doesn't have a notion of maximum string length, the string columns are created with the maximum length allowed in Redshift.

Xplenty Redshift
String VARCHAR(65535)
Integer INT
Float REAL

Merging data into an existing table

The merge operation will be done in theses steps:
  • The dataflow's output will be bulk copied into a temporary table with the same schema as the target table 
  • Rows with keys that exist in the bulk copied data will be deleted from the target table 
  • All rows in the temporary table will be inserted into the target table and the temporary table will be deleted. 
The entire operation is executed in a single transaction.

Xplenty recommends that Key fields should be unique.

Creating packages

  1. Creating a new package
  2. Create a package from a template
  3. Working in the package designer
  4. Using Components: Facebook Ads Insights Source (Beta)
  5. Using components: File Storage Source
  6. Using components: Database Source
  7. Using components: Google AdWords Source
  8. Using components: NetSuite Source
  9. Using Components: Google Analytics Source
  10. Using Components: Google BigQuery Source
  11. Using components: Google Cloud Spanner Source
  12. Using Components: Bing Ads Source
  13. Using components: MongoDB Source
  14. Using components: Amazon Redshift Source
  15. Using Components: Rest API Source
  16. Using Components: Salesforce Source
  17. Using components: Select
  18. Using components: Sort
  19. Using components: Rank
  20. Using components: Limit
  21. Using components: Sample
  22. Using components: Join
  23. Using components: Cross Join
  24. Using components: Clone
  25. Using components: Cube and Rollup
  26. Using components: Union
  27. Using components: Filter
  28. Using Components: Window
  29. Using components: Assert
  30. Using components: Aggregate
  31. Using components: Distinct
  32. Using components: File Storage Destination
  33. Using components: Amazon Redshift Destination
  34. Using Components: Salesforce Destination (Beta)
  35. Using components: Google BigQuery Destination
  36. Using components: Google Cloud Spanner Destination
  37. Using components: Database Destination
  38. Using components: MongoDB Destination
  39. Using and setting variables in your packages
  40. Validating a package
  41. Using pattern-matching in source component paths
  42. Using ISO 8601 string functions
  43. Using Expressions in Xplenty
  44. Xplenty Functions

Feedback and Knowledge Base