Using components: Database Destination

Use the Database Destination component to store the output of your package in a relational database table. 

The following database platforms are supported: PostgreSQL, MySQL, Microsoft SQL Server and Oracle.

To define the Database Destination:

  1. Add a Database Destination component at the end of your dataflow.
  2. Open the component and name it.

Destination settings

  1. Define the parameters for connecting to your database as follows (for more information, see Allow Xplenty access to my Database):
    • database connection - either click the drop-down arrow and select an existing connection, or click (create new) to create a new connection.
    • target schema - the target table's schema. If empty, the default schema is used.
    • target table - the name of the target table in your database.
    • create table - automatically create the table defined in the component if it doesn't exist. See below for more information.
    • add columns - automatically add columns that are mapped in the destination and do not exist in the table. See below for more information. 
    • max connections - maximum number of concurrent connections to open when writing to the database.
    • split by field - when using more than one connection, this field is used to split the data across the connections. Pick a field with many distinct, evenly distributed values (a unique key is best) to make sure the data split isn't skewed.
    • batch size - number of records inserted into the database in each batch (default 100).
    • transaction per batch - if checked, each batch is committed on its own, which may leave partial data in the target table if the job fails. Otherwise, each connection uses a single transaction.
    • operation type - the method of data insertion
      • insert - default behavior. Data will only be appended to the target table.
      • truncate and insert - truncate the target table before the data flow executes.
      • delete and insert - delete all data in the target table before the data flow executes. Use this instead of truncate and insert if a truncate statement can't be executed on the target table due to permissions or other constraints.
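        • For illustration, the two options issue roughly the following statements (the orders table name is hypothetical):
          TRUNCATE TABLE orders;   -- truncate and insert: fast, but typically requires more than DELETE permission
          DELETE FROM orders;      -- delete and insert: slower, but only requires DELETE permission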
      • merge (delete and insert) - incoming data is merged with existing data in the table by deleting rows in the target table whose key values appear in the incoming data, and then inserting all the incoming data into the target table. Requires setting the merge keys correctly in field mapping. Merge is done in the following manner:
        1. First transaction - A staging table is created in the database's default schema, with a primary key according to your key mapping.
        2. Second transaction - The dataflow's output is bulk copied into the staging table.
        3. Third transaction - Rows with keys that exist in the staging table are deleted from the target table. All rows in the staging table are inserted into the target table. The staging table is deleted.
        • Note: The incoming data must have unique values in the key fields you selected. A possible workaround is to use a Limit component, add the key field(s) as a partition, and set the limit to 1 to remove duplicates.
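        • Example - a rough sketch of the three transactions in PostgreSQL syntax; the target table orders, its key order_id and the staging table name are hypothetical:
          -- Transaction 1: create a staging table keyed on the merge key
          CREATE TABLE xplenty_staging_orders (
              "order_id" BIGINT,
              "status"   VARCHAR(65535),
              "amount"   DOUBLE PRECISION,
              PRIMARY KEY ("order_id")
          );

          -- Transaction 2: the dataflow's output is bulk copied into xplenty_staging_orders

          -- Transaction 3: delete matching keys from the target, then insert all staging rows
          DELETE FROM orders
          USING xplenty_staging_orders
          WHERE orders."order_id" = xplenty_staging_orders."order_id";

          INSERT INTO orders ("order_id", "status", "amount")
          SELECT "order_id", "status", "amount" FROM xplenty_staging_orders;

          DROP TABLE xplenty_staging_orders;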

      • merge (update and insert) - incoming data is merged with existing data in the table by updating existing data and inserting new data. Requires setting the merge keys correctly in field mapping. Merge is done in the following manner:
        1. First transaction - A staging table is created in the database's default schema, with a primary key according to your key mapping.
        2. Second transaction - The dataflow's output is bulk copied into the staging table.
        3. Third transaction - A database-specific command is issued to update existing records and insert new ones. The staging table is then deleted.
          The SQL command issued depends on the target platform:
          PostgreSQL (9.5+)
          INSERT INTO <target> ("k1","c2",...,"cn") 
          SELECT "k1","c2",...,"cn" FROM <staging>
          ON CONFLICT ("k1") DO UPDATE SET 
              "c2" = excluded."c2", ...
              "cn" = excluded."cn";
          
          MySQL
          INSERT INTO <target>  (`k1`, `c2`, ..., `cn`)
          SELECT `k1`, `c2`, ..., `cn` FROM <staging> AS stg
          ON DUPLICATE KEY UPDATE
              `c2` = stg.`c2`, ...
              `cn` = stg.`cn`;
          
          Microsoft SQL Server
          MERGE INTO  <target> as tgt
          USING <staging> as stg
          ON 
              tgt."k1" = stg."k1"
          WHEN MATCHED THEN
              UPDATE SET 
                  tgt."c2" = stg."c2", ...
                  tgt."cn" = stg."cn"
          WHEN NOT MATCHED BY TARGET THEN
              INSERT ("k1","c2",...,"c4")
              VALUES (stg."k1",stg."c2",...,stg."c4");
          
          Oracle
          MERGE INTO <target> tgt
          USING <staging> stg
          ON 
              (tgt."k1" = stg."k1")
          WHEN MATCHED THEN
              UPDATE SET 
                  tgt."c2" = stg."c2", ...
                  tgt."cn" = stg."cn"
          WHEN NOT MATCHED THEN
              INSERT ("k1",c2",...,"cn")
              VALUES (stg."k1",stg."c2",...,stg."cn");
          
          Amazon Redshift
          START TRANSACTION;
          
          UPDATE <target>
          SET
              c2 =  <staging>.c2, ...
              cn =  <staging>.cn
          FROM <staging> 
          WHERE 
              <target>."k1" =  <staging>."k1";
          
          DELETE FROM  <staging>
          USING <target>
          WHERE  <staging>."k1" = <target>."k1";
          
          INSERT INTO <target> ("k1","k2","c2",...,"cn")
          SELECT "k1","c2",...,"cn"
          FROM  <staging>;
          
          COMMIT;
          
        • Note: The incoming data must have unique values in the key fields you selected. A possible workaround is to use a Limit component, add the key field(s) as a partition, and set the limit to 1 to remove duplicates.

    • pre action sql - SQL code to execute before inserting the data into the target table. If a merge operation is selected, the SQL code is executed before the staging table is created.
    • post action sql - SQL code to execute after inserting the data into the target table. If a merge operation is selected, the SQL code is executed after the staging table is merged into the target table.
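      • For example (the table and column names are hypothetical; the post action uses PostgreSQL syntax):
        -- pre action sql: clear out today's rows before reloading them
        DELETE FROM orders WHERE load_date = CURRENT_DATE;

        -- post action sql: refresh planner statistics after the load
        ANALYZE orders;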
    • Then click the Test Connection button to verify that the connection works and that the target table exists.

    • Map fields from your dataflow to columns in the target table. You can click the green Auto-fill fields icon to automatically populate the list of fields and column names. When mapping fields to table columns, take the data type mapping into consideration. If you map mismatched data types, the job will fail. The following table shows the matching data types:
Xplenty | PostgreSQL | MySQL | Microsoft SQL Server
String | varchar, char, text, interval | varchar, nvarchar, text | varchar, nvarchar, text, ntext
Integer | smallint, int | bit, bool, tinyint, smallint, mediumint, int, integer | tinyint, smallint, int
Long | bigint | bigint | bigint
Float | decimal, real | decimal, float | decimal, numeric, float
Double | double precision | double | real
DateTime | timestamp, date, time | date, datetime, timestamp, time | datetime, datetime2, smalldatetime, date, time, datetimeoffset
Boolean | boolean | tinyint(1) | bit

    • Note: Use the ToDate function to cast a datetime string expression to the datetime data type. Note that a datetime value with a timezone offset will be adjusted to UTC when inserted into the database.
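      For example, assuming the dataflow has a string field named created_at with values like 2015-06-01 14:30:00, and assuming ToDate accepts a source string and a format pattern, an expression along these lines casts the value before it is mapped to a datetime column:
        ToDate(created_at, 'yyyy-MM-dd HH:mm:ss')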

Automatically creating and altering the destination table

Xplenty can automatically create the destination table for you if it doesn't exist, and it can also append columns to an existing table. If you define columns as keys (regardless of the operation type), Xplenty defines them as the primary key in a new table. The data types in Xplenty are automatically mapped as follows. Note that since Xplenty doesn't have a notion of maximum string length, string columns are created with the maximum length allowed in the database.

Xplenty | PostgreSQL | MySQL | Microsoft SQL Server | Oracle
String | VARCHAR(65535) | TEXT | NVARCHAR(MAX) | NCLOB
Integer | INT | INT | INT | NUMBER
Long | BIGINT | BIGINT | BIGINT | NUMBER
Float | REAL | FLOAT | REAL | NUMBER
Double | DOUBLE PRECISION | DOUBLE | FLOAT | NUMBER
DateTime | TIMESTAMP | DATETIME | DATETIME | TIMESTAMP
Boolean | BOOLEAN | TINYINT(1) | BIT | NUMBER(1)
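
For example, if a dataflow maps the fields id (Long, defined as a key), name (String) and created_at (DateTime) to a new PostgreSQL table, the automatically created table would look roughly like this (table and column names are hypothetical):

  CREATE TABLE "my_table" (
      "id"         BIGINT,
      "name"       VARCHAR(65535),
      "created_at" TIMESTAMP,
      PRIMARY KEY ("id")
  );

  -- With add columns enabled, a newly mapped String field would be appended with roughly:
  -- ALTER TABLE "my_table" ADD COLUMN "status" VARCHAR(65535);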
