it.unimi.dsi:fastutil:8.2.3
to hold all the unique values.
com.clearspring.analytics:stream:2.7.0
as the data structure to hold intermediate results.
org.apache.pinot.core.common.ObjectSerDeUtils.HYPER_LOG_LOG_SER_DE.serialize(hyperLogLog)
.
org.apache.datasketches:datasketches-java:1.2.0-incubating
to perform distinct counting as well as evaluating set operations.
nominalEntries
.
lhs <op> rhs
which are applied to rows selected by the where
clause. During intermediate sketch aggregation, sketches from the thetaSketchColumn
that satisfy these predicates are unioned individually. For example, all filtered rows that match country=USA
are unioned into a single sketch. Complex predicates created by combining individual predicates with AND/OR are supported.
SET_DIFF, SET_UNION, SET_INTERSECT
, where SET_DIFF requires two arguments and SET_UNION/SET_INTERSECT allow more than two arguments.
where
clause is responsible for identifying the matching rows. Note that the where clause can be completely independent of the postAggregationExpression
. Once matching rows are identified, each server unions all the sketches that match the individual predicates, i.e. country='USA'
, device='mobile'
in this case. Once the broker receives the intermediate sketches for each of these individual predicates from all servers, it performs the final aggregation by evaluating the postAggregationExpression
and returns the final cardinality of the resulting sketch.
org.apache.commons.codec.binary
as Hex.decodeHex(stringValue.toCharArray())
.
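
The hex-decoding step mentioned above can be sketched with the Java standard library alone. This is a minimal illustration, assuming the serialized value arrives as a plain hexadecimal string. The class name HexDecodeSketch and its decodeHex helper are hypothetical stand-ins written for this example. They are not the Commons Codec implementation, and turning the resulting bytes back into a sketch object is not shown.

```java
import java.util.Arrays;

public class HexDecodeSketch {

    // Hypothetical helper (an assumption for illustration): converts a hex
    // string into the bytes it encodes, two hex characters per byte.
    static byte[] decodeHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("Odd number of hex characters");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Invalid hex character near index " + (2 * i));
            }
            // Combine the high and low nibbles into one byte.
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample value, not a real serialized sketch.
        byte[] bytes = decodeHex("01ff7a");
        System.out.println(Arrays.toString(bytes)); // prints [1, -1, 122]
    }
}
```

The decoded byte array is what a deserializer would then consume. In a project that already depends on Commons Codec, the Hex.decodeHex call quoted above plays the role of this helper.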