Supported Aggregations

Pinot provides support for aggregations using GROUP BY. You can use the following functions to get the aggregated value.

Function

Description

Example

COUNT

Get the count of rows in a group

COUNT(*)

MIN

Get the minimum value in a group

MIN(playerScore)

MAX

Get the maximum value in a group

MAX(playerScore)

SUM

Get the sum of values in a group

SUM(playerScore)

AVG

Get the average of the values in a group

AVG(playerScore)

MINMAXRANGE

Returns the min and max value in a group

MINMAXRANGE(playerScore)

PERCENTILE[0-100]

Returns the Nth percentile of the group where N is between 0-100

PERCENTILE50(playerScore), PERCENTILE99(playerScore)

PERCENTILEEST[0-100]

Returns the Nth percentile of the group using Quantile Digest algorithm

PERCENTILEEST50(playerScore), PERCENTILEEST99(playerScore)

PercentileTDigest[0-100]

Returns the Nth percentile of the group using T-digest algorithm

PERCENTILETDIGEST50(playerScore), PERCENTILETDIGEST99(playerScore)

DISTINCT

Returns the distinct row values in a group

DISTINCT(playerName)

DISTINCTCOUNT

Returns the count of distinct row values in a group

DISTINCTCOUNT(playerName)

DISTINCTCOUNTHLL

Returns an approximate distinct count using HyperLogLog in a group

DISTINCTCOUNTHLL(playerName)

DISTINCTCOUNTRAWHLL

Returns HLL response serialized as string. The serialized HLL can be converted back into an HLL and then aggregated with other HLLs. A common use case may be to merge HLL responses from different Pinot tables, or to allow aggregation after client-side batching.

DISTINCTCOUNTRAWHLL(playerName)

FASTHLL (Deprecated)

WARN: will be deprecated soon. FASTHLL stores serialized HyperLogLog in String format, which performs worse than DISTINCTCOUNTHLL, which supports serialized HyperLogLog in BYTES (byte array) format

FASTHLL(playerName)

DistinctCountThetaSketch

See Cardinality Estimation

DistinctCountRawThetaSketch

See Cardinality Estimation

Multi-value column functions

The following aggregation functions can be used for multi-value columns

Function

Description

Example

COUNTMV

Get the count of rows in a group

COUNTMV(playerName)

MINMV

Get the minimum value in a group

MINMV(playerScores)

MAXMV

Get the maximum value in a group

MAXMV(playerScores)

SUMMV

Get the sum of values in a group

SUMMV(playerScores)

AVGMV

Get the avg of values in a group

AVGMV(playerScores)

MINMAXRANGEMV

Returns the min and max value in a group

MINMAXRANGEMV(playerScores)

PERCENTILE[0-100]MV

Returns the Nth percentile of the group where N is between 0-100

PERCENTILE50MV(playerScores),

PERCENTILE99MV(playerScores)

PERCENTILEEST[0-100]MV

Returns the Nth percentile of the group using Quantile Digest algorithm

PERCENTILEEST50MV(playerScores),

PERCENTILEEST99MV(playerScores)

PercentileTDigest[0-100]MV

Returns the Nth percentile of the group using T-digest algorithm

PERCENTILETDIGEST50MV(playerScores),

PERCENTILETDIGEST99MV(playerScores),

DISTINCTCOUNTMV

Returns the count of distinct row values in a group

DISTINCTCOUNTMV(playerNames)

DISTINCTCOUNTHLLMV

Returns an approximate distinct count using HyperLogLog in a group

DISTINCTCOUNTHLLMV(playerNames)

DISTINCTCOUNTRAWHLLMV

Returns HLL response serialized as string. The serialized HLL can be converted back into an HLL and then aggregated with other HLLs. A common use case may be to merge HLL responses from different Pinot tables, or to allow aggregation after client-side batching.

DISTINCTCOUNTRAWHLLMV(playerNames)

FASTHLLMV (Deprecated)

stores serialized HyperLogLog in String format, which performs worse than DISTINCTCOUNTHLL, which supports serialized HyperLogLog in BYTES (byte array) format

FASTHLLMV(playerNames)