Using components: Aggregate

Use the Aggregate component to group the input dataset by one or more fields and apply aggregate functions such as Count, Average, Min, and Max to each group. For example, you may want to count the number of users in each country that downloaded a file.
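
The grouping idea in this example can be sketched outside the package designer. The Python snippet below is only an illustration, not how the component is implemented: the `downloads` records, their field names, and the `count_by` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical input: one record per user who downloaded a file.
downloads = [
    {"user": "ana", "country": "BR"},
    {"user": "bob", "country": "US"},
    {"user": "cy", "country": "US"},
    {"user": "dee", "country": "BR"},
    {"user": "eli", "country": "US"},
]


def count_by(records, group_field):
    """Group records by one field and count the records in each group."""
    return dict(Counter(record[group_field] for record in records))


users_per_country = count_by(downloads, "country")
print(users_per_country)  # {'BR': 2, 'US': 3}
```

Here `country` plays the role of the group by field, and the count is the aggregate function applied to each group.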

To aggregate records:

  1. Add an Aggregate component where required in your package.
  2. Open the component and name it.
  3. Under group by, select the fields to group by. The output contains one record for each unique combination of values in these fields, and the aggregate functions are calculated separately for each group.
  4. Under function, field and projected field (projected field is used only for Min By and Max By), select the aggregate function to apply to each group you defined under group by:
    • Count - returns the number of non-null values in the field you specify in the field column, according to the groupings. Return value data type is long.
    • Count Distinct - returns the number of unique values in the field you specify in the field column, according to the groupings. Return value data type is long.
    • Count All - returns the number of records, according to the groupings. Return value data type is long.
    • HLL - uses the HyperLogLog++ algorithm to return a cardinality estimate or an approximate number of distinct values in the field you specify, according to the groupings. Return value data type is long.
    • Average - returns the average for numeric fields you specify in the field column, according to the groupings. See the following table for return value data types:
      Argument field data type    Return value data type
      int, long                   long
      float, double               double
    • Sum - returns the sum for numeric fields you specify in the field column, according to the groupings. See the following table for return value data types:
      Argument field data type    Return value data type
      int, long                   long
      float, double               double
    • Min - returns the minimum value for the field you specify in the field column, according to the groupings. Return value data type is the same as the input argument's data type.
    • Min By - for the minimum value in the field you specify in the field column, and according to the groupings, returns the value defined by projected field. Return value data type is the same as the projected field's data type.
    • Max - returns the maximum value for the field you specify in the field column, according to the groupings. Return value data type is the same as the input argument's data type.
    • Max By - for the maximum value in the field you specify in the field column, and according to the groupings, returns the value defined by projected field. Return value data type is the same as the projected field's data type.
    • VAR - returns the statistical variance for all values in the field you specify in the field column and according to the groupings. Return value data type is double.
    • VARP - returns the statistical variance for the population of all values in the field you specify in the field column and according to the groupings. Return value data type is double.
    • STDEV - returns the statistical standard deviation for all values in the field you specify in the field column and according to the groupings. Return value data type is double.
    • STDEVP - returns the statistical standard deviation for the population of all values in the field you specify in the field column and according to the groupings. Return value data type is double.
    • Collect - returns a collection (bag) of the values in the field you specify in the field column, according to the groupings. The bag can be manipulated further in a Select component using bag functions. Returned data type is bag.
  5. Pick the field(s) to apply the function to.
  6. Type an alias for the field that contains the resulting value for the function.
  7. Add another function if required.
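
To make the difference between Count, Count Distinct, and Count All, and between Min/Max and Min By/Max By, more concrete, here is a rough Python sketch of the semantics described in step 4. It is an illustration under stated assumptions, not the component's implementation: the `records` data, the `aggregate` helper, and its parameter names are hypothetical, `None` stands in for a null value, and the handling of nulls in Count Distinct, Min By, and Max By is assumed rather than taken from the descriptions above.

```python
from collections import defaultdict

# Hypothetical input records; None stands in for a null value.
records = [
    {"country": "US", "user": "bob", "size": 10},
    {"country": "US", "user": "cy", "size": 30},
    {"country": "US", "user": "bob", "size": None},
    {"country": "BR", "user": "ana", "size": 20},
]


def aggregate(rows, group_by, field, projected):
    """Return one summary dict per unique value of the group_by field.

    Assumes every group has at least one non-null value in `field`.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(row)

    result = {}
    for key, members in groups.items():
        # Rows whose aggregated field is non-null.
        non_null = [m for m in members if m[field] is not None]
        values = [m[field] for m in non_null]
        result[key] = {
            "count": len(values),                # non-null values only
            "count_distinct": len(set(values)),  # unique values (nulls skipped: assumption)
            "count_all": len(members),           # every record in the group
            "min": min(values),
            "max": max(values),
            # Min By / Max By: value of the projected field taken from the
            # row that holds the minimum / maximum of `field`.
            "min_by": min(non_null, key=lambda m: m[field])[projected],
            "max_by": max(non_null, key=lambda m: m[field])[projected],
        }
    return result


summary = aggregate(records, "country", "size", "user")
print(summary["US"]["count"], summary["US"]["count_all"])  # 2 3
print(summary["US"]["max_by"])  # cy
```

In the US group, Count is 2 because one record has a null `size`, while Count All is 3 because it counts every record. Max returns the largest `size` (30), whereas Max By with `user` as the projected field returns the user on that record.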
