Aggregation Functions
Aggregate functions return a single result for a group of rows.
Aggregate functions return a single result for a group of rows. The pages below provide detailed signatures, usage examples, and notes for each function.
Basic Aggregations
Count of rows
Most frequent value
Histogram of values
High-precision sum using BigDecimal
Any arbitrary non-null value from a group
Array and String Aggregations
Collect values into an array
Concatenate values into a delimited string
Element-wise sum of long arrays
Element-wise sum of double arrays
Statistical Functions
Skewness of a distribution
Kurtosis of a distribution
Sum/average of distinct values
Project column at row with min/max measure
Set and Sketch Functions
Build an IdSet for efficient filtering
For sketch-based functions (FrequentItems, CPC, HLL+, ULL, Tuple), see Sketch Functions. For funnel analysis functions, see Funnel Functions.
Distinct Counting
Exact distinct values
Exact distinct count
Distinct count using bitmap
Approximate distinct count using HLL
Distinct count using Theta sketch
Hybrid exact/HLL distinct count
For sketch-based distinct count functions (CPC, HLL+, ULL, Tuple), see Sketch Functions.
Percentile Functions
Exact percentile
Estimated percentile
Percentile using KLL sketch
Percentile using T-Digest
Window-Related Functions
First/last value in window
First/last value by time
Multi-Value Column Functions
The following aggregation functions can be used for multi-value columns:
Returns the count of a multi-value column
Returns the minimum value of a numeric multi-value column
Returns the maximum value of a numeric multi-value column
Returns the sum of the values for a numeric multi-value column
Returns the average of the values for a numeric multi-value column
Returns the max - min value for a numeric multi-value column
Returns the Nth percentile of a numeric multi-value column
Returns the Nth percentile using Quantile Digest for a multi-value column
Returns the Nth percentile using T-Digest for a multi-value column
Returns the count of distinct values for a multi-value column
Returns the count of distinct values using bitmap for a multi-value column
Returns an approximate distinct count using HLL for a multi-value column
Returns HyperLogLog response serialized as string for a multi-value column
Returns the sum of distinct values of a numeric multi-value column
Returns the average of distinct values of a numeric multi-value column
For sketch-based multi-value functions (HLL+), see Sketch Functions.
FILTER Clause in Aggregation
Pinot supports the FILTER clause in aggregation queries:
In the query above, COL1 is aggregated only for rows where COL2 > 300 and COL3 > 50. Similarly, COL2 is aggregated where COL2 < 50 and COL3 > 50.
With NULL Value Support enabled, this allows filtering out null values while performing aggregation:
Deprecated Functions
FASTHLL
Stores serialized HyperLogLog in String format. Performs worse than DISTINCTCOUNTHLL, which supports serialized HyperLogLog in BYTES format.
FASTHLLMV
Multi-value version of FASTHLL. Also deprecated in favor of DISTINCTCOUNTHLL.
Last updated
Was this helpful?

