Using Components: Window

Use the Window component to apply window functions to incoming data, similar to window functions in SQL. These functions let you rank or distribute data, provide moving averages, running totals and other useful data. The output of the Window component contains all records and fields from the input data flow with the addition of the calculated window functions.

Partition by fields
Add fields to determine the partitioning of the incoming dataset to windows. Each partition contains of all the records that have the same values in the partitioning fields. If you don't add partitioning fields, the entire dataset will be considered a single window.
  1. Click + add new to add a partitioning field.
  2. Select a field in the dropdown list.
  3. Click the + button to add another partitioning field.
  4. Click the X button to remove a field from the partition by list.
Sort by fields
Add fields to determine the sorting of the data within a window. This is optional but recommended for many windowing functions (what would "first" mean without order?).
  1. Click + add new to add a sorting field.
  2. Select a field in the dropdown list.
  3. Specify sort order - ascending or descending. 
  4. Click the + button to add another sorting field.
  5. Click the X button to remove a field from the sort by list.
Window functions
Select the functions to apply to the window and fill in their arguments.
  1. Select function from the drop down (see list below)
  2. Fill in the field and arguments.
  3. Type an alias for the field containing the resulting value.
  4. Click the + button to add another function.
  5. Click the X button to remove a function. 
Functions list
  • Countreturns the number of non-null values in a field within a window range (see range arguments below).
  • Sum - returns the sum of values in a numeric field within a window range. For example, use frame between -∞ and 0 for a running total.
  • Avg - returns the average of values in a numeric field within a window range. For example, use frame between -2 and 2 for a moving average on 5 values.
  • Min - returns the minimum value in a field within a window range.
  • Max - returns the maximum value in a field within a window range.
  • Lead - returns the value from a field in a subsequent record within the window range. The location of the subsequent record (relative to the current record) is defined by the offset argument. The default argument is returned if the lead record can't be found (for example, when the current record is at the end of the window range). Example: use offset of 1 to get the value from a field in the next record.
  • Lag - returns the value from a field in a preceding record within the window range. The location of the preceding record (relative to the current record) is defined by the offset argument. The default argument is returned if the lag record can't be found (for example when the current record is at the beginning of the window range). Example: use offset of 1 to get the value from a field in the previous record.
  • First Value - returns the first value in a field within a window range. Make sure the window is sorted.
  • Last Value - returns the last value in a field within a window range. Make sure the window is sorted.
  • Row number - returns a sequential number for each record within a window range according to the window's ordering.
  • NTILE - distributes the records within a window range into a specified number of buckets set by the argument number of ntiles. For example, NTILE can be used on data partitioned by class and sorted by grades to divide each class to 4 groups of students according to their grades.
Range arguments
Per function, the range arguments further limit the records within the window. The range is relative to the current record and is defined by the number of records preceding the current record and the number of records following the current record. The default values are -∞ ("unbounded preceding" in SQL lingo) and +∞ ("unbounded following"). 0 stands for current record. 
Other examples:
  • "-2" in range before means 2 records preceding the current record are the lower boundary for the function.
  • "5" in range after means 5 records after the current record are the upper boundary for the function.

Creating packages

  1. Creating a new package
  2. Create a package from a template
  3. Working in the package designer
  4. Using Components: Facebook Ads Insights Source (Beta)
  5. Using components: File Storage Source
  6. Using components: Database Source
  7. Using components: Google AdWords Source
  8. Using components: NetSuite Source
  9. Using Components: Google Analytics Source
  10. Using Components: Google BigQuery Source
  11. Using components: Google Cloud Spanner Source
  12. Using Components: Bing Ads Source
  13. Using components: MongoDB Source
  14. Using components: Amazon Redshift Source
  15. Using Components: Rest API Source
  16. Using Components: Salesforce Source
  17. Using components: Select
  18. Using components: Sort
  19. Using components: Rank
  20. Using components: Limit
  21. Using components: Sample
  22. Using components: Join
  23. Using components: Cross Join
  24. Using components: Clone
  25. Using components: Cube and Rollup
  26. Using components: Union
  27. Using components: Filter
  28. Using Components: Window
  29. Using components: Assert
  30. Using components: Aggregate
  31. Using components: Distinct
  32. Using components: File Storage Destination
  33. Using components: Amazon Redshift Destination
  34. Using Components: Salesforce Destination (Beta)
  35. Using components: Google BigQuery Destination
  36. Using components: Google Cloud Spanner Destination
  37. Using components: Database Destination
  38. Using components: MongoDB Destination
  39. Using and setting variables in your packages
  40. Validating a package
  41. Using pattern-matching in source component paths
  42. Using ISO 8601 string functions
  43. Using Expressions in Xplenty
  44. Xplenty Functions

Feedback and Knowledge Base