Aggregation Functions
Function | Description | Example | Default Value When No Record Selected |
---|---|---|---|
Returns the count of the records as |
|
| |
Calculate the histogram of a numeric column as |
|
| |
Returns the minimum value of a numeric column as |
|
| |
Returns the maximum value of a numeric column as |
|
| |
Returns the sum of the values for a numeric column as |
|
| |
Returns the sum of the values for a numeric column with optional precision and scale as |
|
| |
Returns the average of the values for a numeric column as |
|
| |
Returns the most frequent value of a numeric column as |
|
| |
Returns the |
|
| |
Returns the Nth percentile of the values for a numeric column as |
|
| |
Returns the Nth percentile of the values for a numeric column using Quantile Digest as |
|
| |
Returns the Nth percentile of the values for a numeric column using T-digest as |
|
| |
PERCENTILESMARTTDIGEST | Returns the Nth percentile of the values for a numeric column as |
|
|
Returns the count of distinct values of a column as |
|
| |
Returns the count of distinct values of a column as |
|
| |
Returns an approximate distinct count using HyperLogLog as |
|
| |
Returns HyperLogLog response serialized as |
|
| |
DISTINCTCOUNTSMARTHLL | Returns the count of distinct values of a column as |
|
|
| |||
| |||
Returns the count of distinct values of a column as |
|
| |
LASTWITHTIME(dataColumn, timeColumn, 'dataType') | Get the last value of dataColumn where the timeColumn is used to define the time of dataColumn and the dataType specifies the type of dataColumn, which can be |
|
|
FIRSTWITHTIME(dataColumn, timeColumn, 'dataType') | Get the first value of dataColumn where the timeColumn is used to define the time of dataColumn and the dataType specifies the type of dataColumn, which can be |
|
|
Deprecated functions:
Function | Description | Example |
---|---|---|
FASTHLL | FASTHLL stores serialized HyperLogLog in String format, which performs worse than DISTINCTCOUNTHLL, which supports serialized HyperLogLog in BYTES (byte array) format |
|
Multi-value column functions
The following aggregation functions can be used for multi-value columns
Function |
---|
COUNTMV
Returns the count of a multi-value column as |
MINMV
Returns the minimum value of a numeric multi-value column as |
MAXMV
Returns the maximum value of a numeric multi-value column as |
SUMMV
Returns the sum of the values for a numeric multi-value column as |
AVGMV
Returns the average of the values for a numeric multi-value column as |
MINMAXRANGEMV
Returns the |
PERCENTILEMV(column, N)
Returns the Nth percentile of the values for a numeric multi-value column as |
PERCENTILEESTMV(column, N)
Returns the Nth percentile using Quantile Digest as |
PERCENTILETDIGESTMV(column, N)
Returns the Nth percentile using T-digest as |
DISTINCTCOUNTMV
Returns the count of distinct values for a multi-value column as |
DISTINCTCOUNTBITMAPMV
Returns the count of distinct values for a multi-value column as |
DISTINCTCOUNTHLLMV
Returns an approximate distinct count using HyperLogLog as |
DISTINCTCOUNTRAWHLLMV Returns HyperLogLog response serialized as string. The serialized HLL can be converted back into an HLL and then aggregated with other HLLs. A common use case may be to merge HLL responses from different Pinot tables, or to allow aggregation after client-side batching. |
Deprecated functions:
Function | Description | Example |
---|---|---|
FASTHLLMV (Deprecated) | stores serialized HyperLogLog in String format, which performs worse than DISTINCTCOUNTHLL, which supports serialized HyperLogLog in BYTES (byte array) format |
|
Last updated