Querying Pinot
Learn how to query Pinot using SQL
SQL Interface
Pinot provides SQL interface for querying. It uses the Calcite SQL parser to parse queries and uses MYSQL_ANSI dialect. You can see the grammar in the Calcite documentation.
Limitations
Pinot does not support joins or nested subqueries. We recommend using Presto for queries that span multiple tables. For more information, see Engineering Full SQL support for Pinot at Uber.
There is no DDL support. Tables can be created via the REST API.
Identifier vs Literal
In Pinot SQL:
Double quotes(") are used to force string identifiers, e.g. column names
Single quotes(') are used to enclose string literals. If the string literal also contains a single quote, escape this with a single quote e.g
'''Pinot'''
to match the string literal'Pinot'
Mis-using those might cause unexpected query results:
e.g.
WHERE a='b'
means the predicate on the columna
equals to a string literal value'b'
WHERE a="b"
means the predicate on the columna
equals to the value of the columnb
If your column names use reserved keywords (e.g. timestamp
or date
) or special charactesr, you will need to use double quotes when referring to them in queries.
Note: Defining decimal literals within quotes preserves precision.
Example Queries
Selection
Aggregation
Grouping on Aggregation
Ordering on Aggregation
Filtering
For performant filtering of ids in a list, see Filtering with IdSet.
Filtering with NULL predicate
Selection (Projection)
Ordering on Selection
Pagination on Selection
Results might not be consistent if the order by column has the same value in multiple rows.
Wild-card match (in WHERE clause only)
To count rows where the column airlineName
starts with U
Case-When Statement
Pinot supports the CASE-WHEN-ELSE statement.
Example 1:
Example 2:
UDF
Functions have to be implemented within Pinot. Injecting functions is not yet supported. The example below demonstrate the use of UDFs.
For more examples, see Transform Function in Aggregation Grouping.
BYTES column
Pinot supports queries on BYTES column using HEX string. The query response also uses HEX string to represent bytes values.
e.g. the query below fetches all the rows for a given UID.
Last updated