Use the Limit transformation to limit the number of records in the output, either for the entire dataset or per partition (group) within the dataset. This makes it easy to eliminate duplicates (by keeping one record per key value), get the top X items per category, or limit the input to a subset of the data for testing purposes.
Select "treat the entire input as a single partition" to limit the output of the entire input.
Select "partition data by fields" to limit the output per group of records in the input with identical values in the selected fields.
Select "don't sort input data" if you'd like to output arbitrarily selected records per partition.
Select "sort partition by fields" to choose the fields to sort each partition by and the direction (ascending or descending).
Finally, select the number of output records per partition.
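
The behavior described above can be sketched in plain Python. This is only an illustration of the partition, sort, and limit steps, not the transformation's actual implementation: the function name limit_records, its parameters, and the sample data are all hypothetical. When no sort fields are given, this sketch keeps records in input order, whereas the transformation may return arbitrarily selected records.

```python
from collections import defaultdict
from operator import itemgetter


def limit_records(records, limit, partition_by=None, sort_by=None, descending=False):
    """Keep at most `limit` records per partition (hypothetical helper)."""
    partitions = defaultdict(list)
    for record in records:
        # With no partition fields, every record gets the same key,
        # so the entire input is treated as a single partition.
        key = tuple(record[f] for f in partition_by) if partition_by else ()
        partitions[key].append(record)

    output = []
    for group in partitions.values():
        if sort_by:
            # Sort each partition by the chosen fields and direction.
            group = sorted(group, key=itemgetter(*sort_by), reverse=descending)
        # Keep only the first `limit` records of the partition.
        output.extend(group[:limit])
    return output


# Made-up sample data for illustration.
sales = [
    {"category": "books", "item": "atlas", "revenue": 120},
    {"category": "books", "item": "novel", "revenue": 300},
    {"category": "toys", "item": "kite", "revenue": 80},
    {"category": "books", "item": "guide", "revenue": 210},
    {"category": "toys", "item": "robot", "revenue": 150},
]

# Top 2 items per category by revenue, highest first.
top2 = limit_records(sales, 2, partition_by=["category"],
                     sort_by=["revenue"], descending=True)
print([r["item"] for r in top2])    # ['novel', 'guide', 'robot', 'kite']

# One record per key value (duplicate elimination), no sorting.
unique = limit_records(sales, 1, partition_by=["category"])
print([r["item"] for r in unique])  # ['atlas', 'kite']

# Subset of the entire input, e.g. for testing.
sample = limit_records(sales, 3)
print(len(sample))                  # 3
```

In this sketch, a limit of 1 with partition fields set to the key fields corresponds to the duplicate-elimination use case, and a limit of X with a descending sort corresponds to "top X items per category".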