Supported Aggregations
Pinot provides support for aggregations using GROUP BY. You can use the following functions to get the aggregated value.
Function  Description  Example 
COUNT  Get the count of rows in a group 

MIN  Get the minimum value in a group 

MAX  Get the maximum value in a group 

SUM  Get the sum of values in a group 

AVG  Get the average of the values in a group 

MODE  Get the most frequent value in a group. When multiple modes are present it gives the minimum of all the modes. This behavior can be overridden to get the maximum or the average mode. 

MINMAXRANGE  Returns the 

PERCENTILE(column, N)  Returns the Nth percentile of the group where N is a decimal number between 0 and 100 inclusive 

PERCENTILEEST(column, N)  Returns the Nth percentile of the group using Quantile Digest algorithm 

PERCENTILETDigest(column, N)  Returns the Nth percentile of the group using Tdigest algorithm 

DISTINCT  Returns the distinct row values in a group 

DISTINCTCOUNT  Returns the count of distinct row values in a group 

DISTINCTCOUNTBITMAP  Returns the count of distinct row values in a group. This function is accurate for INT column, but approximate for other cases where hash codes are used in distinct counting and there may be hash collisions. 

DISTINCTCOUNTHLL  Returns an approximate distinct count using HyperLogLog. It also takes an optional second argument to configure the log2m for the HyperLogLog. 

DISTINCTCOUNTRAWHLL  Returns HLL response serialized as string. The serialized HLL can be converted back into an HLL and then aggregated with other HLLs. A common use case may be to merge HLL responses from different Pinot tables, or to allow aggregation after clientside batching. 

FASTHLL (Deprecated)  WARN: will be deprecated soon. FASTHLL stores serialized HyperLogLog in String format, which performs worse than DISTINCTCOUNTHLL, which supports serialized HyperLogLog in BYTES (byte array) format 

DISTINCTCOUNTTHETASKETCH  
DISTINCTCOUNTRAWTHETASKETCH  
SEGMENTPARTITIONEDDISTINCTCOUNT  Returns the count of distinct values of a column when the column is prepartitioned for each segment, where there is no common value within different segments. This function calculates the exact count of distinct values within the segment, then simply sums up the results from different segments to get the final result. 

Multivalue column functions
The following aggregation functions can be used for multivalue columns
Function  Description  Example 
COUNTMV  Get the count of rows in a group 

MINMV  Get the minimum value in a group 

MAXMV  Get the maximum value in a group 

SUMMV  Get the sum of values in a group 

AVGMV  Get the avg of values in a group 

MINMAXRANGEMV  Returns the 

PERCENTILEMV(column, N)  Returns the Nth percentile of the group where N is a decimal number between 0 and 100 inclusive 

PERCENTILEESTMV(column, N)  Returns the Nth percentile of the group using Quantile Digest algorithm 

PERCENTILETDIGESTMV(column, N)  Returns the Nth percentile of the group using Tdigest algorithm 

DISTINCTCOUNTMV  Returns the count of distinct row values in a group 

DISTINCTCOUNTBITMAPMV  Returns the count of distinct row values in a group. This function is accurate for INT or dictionary encoded column, but approximate for other cases where hash codes are used in distinct counting and there may be hash collision. 

DISTINCTCOUNTHLLMV  Returns an approximate distinct count using HyperLogLog in a group 

DISTINCTCOUNTRAWHLLMV  Returns HLL response serialized as string. The serialized HLL can be converted back into an HLL and then aggregated with other HLLs. A common use case may be to merge HLL responses from different Pinot tables, or to allow aggregation after clientside batching. 

FASTHLLMV (Deprecated)  stores serialized HyperLogLog in String format, which performs worse than DISTINCTCOUNTHLL, which supports serialized HyperLogLog in BYTES (byte array) format 

Last updated