0.11.0

Summary

Apache Pinot 0.11.0 introduces many new features that extend its query capabilities: the multi-stage query engine enables Pinot to do distributed joins, and the release adds more SQL syntax (DML support), query functions, and indexes (text index, timestamp index) for new use cases. As always, it also brings more integrations with other systems (e.g. Spark 3, Flink).

Note: this release includes a major upgrade of Apache Helix to 1.0.4, so please upgrade the system in the following order:

Helix Controller -> Pinot Controller -> Pinot Broker -> Pinot Server

Multi-Stage Query Engine

The new multi-stage query engine (a.k.a. the V2 query engine) is designed to support more complex SQL semantics such as JOIN, OVER window, and MATCH_RECOGNIZE, and eventually to bring Pinot closer to full ANSI SQL semantics. More to read: https://docs.pinot.apache.org/developers/advanced/v2-multi-stage-query-engine

Pause Stream Consumption on Apache Pinot

Pinot operators can pause realtime consumption of events while queries are being executed, and then resume consumption when ready to do so again.

More to read: https://medium.com/apache-pinot-developer-blog/pause-stream-consumption-on-apache-pinot-772a971ef403

Gap-filling function

The gap-filling functions allow users to interpolate data and perform powerful aggregations and data processing over time series data. More to read: https://www.startree.ai/blog/gapfill-function-for-time-series-datasets-in-pinot
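To make the idea of gap filling concrete, here is a minimal Python sketch of one simple strategy: carrying the last observed value forward into empty time buckets. It is an illustration only and not Pinot's implementation; the function name `gapfill_previous` and its parameters are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def gapfill_previous(
    points: Dict[int, float],
    start: int,
    end: int,
    step: int,
) -> List[Tuple[int, Optional[float]]]:
    """Return one (bucket, value) pair per time bucket in [start, end).

    Buckets with no observation reuse the most recent earlier value.
    Buckets before the first observation stay None.
    """
    filled: List[Tuple[int, Optional[float]]] = []
    last: Optional[float] = None
    for bucket in range(start, end, step):
        if bucket in points:
            last = points[bucket]  # a real observation resets the carried value
        filled.append((bucket, last))
    return filled


if __name__ == "__main__":
    # Hypothetical series: observations at t=0 and t=20, nothing at t=10 or t=30.
    observed = {0: 1.0, 20: 3.0}
    for bucket, value in gapfill_previous(observed, start=0, end=40, step=10):
        print(bucket, value)
```

Once every bucket has a value, aggregations over the series no longer need to special-case missing intervals.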

Add support for Spark 3.x (#8560)

A long-awaited feature: segment generation on Spark 3.x.

Similar to the Spark Pinot connector, the new Flink connector allows Flink users to write data from a Flink application to Pinot.

Show running queries and cancel query by id (#9171)

This feature gives finer-grained control over Pinot queries.

Timestamp Index (#8343)

This gives users better query performance on timestamp columns queried at a lower granularity. See: https://docs.pinot.apache.org/basics/indexing/timestamp-index

Native Text Indices (#8384)

Want to search text in real time? The new text indexing engine in Pinot supports the following capabilities:

  1. New operator: LIKE

  2. New operator: CONTAINS

  3. Native text index, built from the ground up, focusing on Pinot’s time series use cases and utilizing existing Pinot indices and structures (inverted index, bitmap storage).

  4. Real-time text index

Read more: https://medium.com/@atri.jiit/text-search-time-series-style-681af37ba42e

Adding DML definition and parse SQL InsertFile (#8557)

You can now load data into Pinot from a file using Minion with a statement of the form INSERT INTO [database.]table FROM FILE dataDirURI OPTION ( k=v ) [, OPTION (k=v)]*. See: https://docs.pinot.apache.org/basics/data-import/from-query-console
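As a rough illustration of that grammar, the following Python sketch assembles such a statement from its parts. The helper `build_insert_from_file` is hypothetical, the table name, URI, and option keys in the usage are placeholders, and wrapping the URI in single quotes is an assumption that the grammar above does not confirm; check the linked documentation for the exact form your version accepts.

```python
from typing import Mapping


def build_insert_from_file(
    table: str,
    data_dir_uri: str,
    options: Mapping[str, str],
) -> str:
    """Assemble an INSERT INTO ... FROM FILE statement as a string.

    Follows the grammar quoted in the release note: one mandatory
    OPTION clause, with any further OPTION clauses comma-separated.
    """
    if not options:
        raise ValueError("the grammar requires at least one OPTION clause")
    option_clauses = ", ".join(
        f"OPTION({key}={value})" for key, value in options.items()
    )
    # Assumption: the data directory URI is written as a single-quoted literal.
    return f"INSERT INTO {table} FROM FILE '{data_dir_uri}' {option_clauses}"


if __name__ == "__main__":
    # Placeholder values, not real table, bucket, or option names.
    print(build_insert_from_file("myTable", "s3://my-bucket/data", {"k1": "v1", "k2": "v2"}))
```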

Deduplication (#8708)

This feature enables deduplication for realtime tables via a top-level table config. At a high level, hashes of the primaryKey (as defined in the table schema) are stored in in-memory data structures, and each incoming row is checked against them. Duplicate rows are dropped.

This feature expects the stream to be partitioned by the primary key, strictReplicaGroup routing to be enabled, and the configured stream consumer type to be low level. These requirements are enforced by the table config API's input validation.
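The mechanism described above can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not Pinot's code: the class `PrimaryKeyDeduplicator`, the choice of SHA-256, and the use of a plain set are all hypothetical stand-ins for the in-memory structures the release note mentions.

```python
import hashlib
from typing import Dict, Sequence, Set


class PrimaryKeyDeduplicator:
    """Keeps hashes of primary keys seen so far and rejects repeats."""

    def __init__(self, primary_key_columns: Sequence[str]) -> None:
        self._columns = tuple(primary_key_columns)
        self._seen: Set[bytes] = set()  # in-memory store of key hashes

    def _hash_key(self, row: Dict[str, object]) -> bytes:
        # Hash the primary-key values only; other columns do not affect identity.
        key = repr(tuple(row[column] for column in self._columns))
        return hashlib.sha256(key.encode("utf-8")).digest()

    def accept(self, row: Dict[str, object]) -> bool:
        """Return True for the first row with a given key, False for duplicates."""
        digest = self._hash_key(row)
        if digest in self._seen:
            return False
        self._seen.add(digest)
        return True


if __name__ == "__main__":
    dedup = PrimaryKeyDeduplicator(["id"])
    incoming = [
        {"id": 1, "value": "a"},
        {"id": 2, "value": "b"},
        {"id": 1, "value": "c"},  # same primary key as the first row
    ]
    kept = [row for row in incoming if dedup.accept(row)]
    print(kept)
```

In a sketch like this, each consumer only knows the keys it has seen itself, which suggests why the stream needs to be partitioned by primary key: rows sharing a key must arrive at the same place to be compared.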

Functions support and changes:

The full list of features introduced in this release

Vulnerability fixes

Pinot has resolved all high-level vulnerability issues.

Bug fixes
