DISTINCTCOUNTSMARTHLL
This section contains reference documentation for the DISTINCT_COUNT_SMART_HLL function.
DISTINCT_COUNT_SMART_HLL(col[, params])
col (required): Name of the column to aggregate on.
params (optional): Semicolon-separated parameter key-value pairs:
  threshold: The threshold to convert the value set into a HyperLogLog (default 100_000).
  log2m: log2m for the HyperLogLog (default 12).
Example: DISTINCT_COUNT_SMART_HLL(col, 'threshold=10000;log2m=8')
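To make the parameter string concrete, here is a minimal Python sketch of how such a string could be read and what the log2m value implies. It is not Pinot's parser: the helper names (parse_params, hll_register_count, hll_relative_error) are hypothetical, and the error figure uses the commonly cited HyperLogLog standard error of roughly 1.04 / sqrt(m) with m = 2^log2m registers, which may differ from what a given implementation achieves.

```python
import math

# Defaults taken from the parameter list above.
DEFAULTS = {"threshold": 100_000, "log2m": 12}


def parse_params(params=""):
    """Parse 'key=value;key=value' into a dict, filling in the defaults.

    Hypothetical helper for illustration only; Pinot's own parsing and
    validation rules are not reproduced here.
    """
    result = dict(DEFAULTS)
    for pair in params.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        key, _, value = pair.partition("=")
        result[key.strip()] = int(value)
    return result


def hll_register_count(log2m):
    """Number of HyperLogLog registers: m = 2^log2m."""
    return 1 << log2m


def hll_relative_error(log2m):
    """Approximate standard error of a HyperLogLog estimate."""
    return 1.04 / math.sqrt(hll_register_count(log2m))


settings = parse_params("threshold=10000;log2m=8")
print(settings)                                 # threshold 10000, log2m 8
print(hll_register_count(settings["log2m"]))    # 256 registers
print(hll_relative_error(settings["log2m"]))    # about 6.5% standard error
print(hll_relative_error(DEFAULTS["log2m"]))    # about 1.6% with the default
```

Under these assumptions, lowering log2m from 12 to 8 shrinks the sketch from 4096 to 256 registers and roughly quadruples the expected relative error.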
These examples are based on the .
DISTINCTCOUNTHLL() is faster than DISTINCTCOUNT() if data is pre-aggregated at ingestion or aggregated at a server with enough records. The performance gain grows with the size of the dataset.
If very few records are pre-aggregated, DISTINCTCOUNTHLL() will not be as fast as DISTINCTCOUNT(), because a serialized HLL is larger than the individual values that would otherwise be sent.
DISTINCTCOUNTHLLPLUS() provides more precise results than DISTINCTCOUNTHLL() with the same performance.
DISTINCTCOUNTSMARTHLL() keeps an exact value set and automatically converts it to an HLL once the set reaches the threshold. This switching comes with some overhead.
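The switching behaviour can be sketched in Python as follows. This is an illustration of the idea only, not Pinot's implementation: the class name SmartDistinctCounter, the SHA-1 based hash, the exact conversion condition (set size >= threshold), and the textbook HyperLogLog estimator with small-range correction are all assumptions made for the example.

```python
import hashlib
import math


class SmartDistinctCounter:
    """Exact set below the threshold, HyperLogLog registers above it."""

    def __init__(self, threshold=100_000, log2m=12):
        self.threshold = threshold
        self.log2m = log2m           # the alpha constant below assumes log2m >= 7
        self.m = 1 << log2m          # number of registers
        self.exact = set()           # exact values while still small
        self.registers = None        # list of ints once converted to HLL

    @staticmethod
    def _hash64(value):
        # 64-bit hash of the value's string form (assumption: any
        # well-mixed hash works; 1 and "1" hash identically here).
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def _add_to_hll(self, value):
        h = self._hash64(value)
        width = 64 - self.log2m
        idx = h >> width                      # top log2m bits pick a register
        rest = h & ((1 << width) - 1)         # remaining bits
        rank = width - rest.bit_length() + 1  # position of the leftmost 1-bit
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def is_hll(self):
        return self.registers is not None

    def add(self, value):
        if self.is_hll():
            self._add_to_hll(value)
            return
        self.exact.add(value)
        if len(self.exact) >= self.threshold:
            # Conversion step: replay the exact values into HLL registers.
            self.registers = [0] * self.m
            for v in self.exact:
                self._add_to_hll(v)
            self.exact = None

    def count(self):
        if not self.is_hll():
            return len(self.exact)            # exact answer
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction (linear counting).
            return round(m * math.log(m / zeros))
        return round(raw)


counter = SmartDistinctCounter(threshold=1000, log2m=12)
for i in range(500):
    counter.add(i)
print(counter.is_hll(), counter.count())   # still exact: False 500
for i in range(20000):
    counter.add(i)
print(counter.is_hll(), counter.count())   # converted: True and an estimate near 20000
```

The overhead mentioned above shows up in this sketch as the one-time replay of the exact set into the registers, plus the memory held by the set until the conversion happens.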