Query execution within Pinot is modeled as a sequence of operators that are executed in a pipelined manner to produce the final result. The EXPLAIN PLAN FOR syntax can be used to obtain the execution plan of a query, which can be useful when optimizing it further.
The explain plan is a feature that is still under development and may change in future releases. Pinot explain plans are human-readable and are intended to be used for debugging and optimization purposes. Explain plans, even the ones returned as tables or JSON, are not guaranteed to be stable across releases. This is especially important to keep in mind when using the explain plan in automated scripts or tools.
Pinot supports different types of explain plans depending on the query engine and the level of detail required.
Segments are the basic unit of data storage and processing in Pinot. When a query is executed, it is executed on each segment and the results are merged together. Not all segments have the same data distribution, indexes, etc. Therefore the query engine may decide to execute the query differently on different segments. Reasons for this include:
Segments that were not refreshed since indexes were added or removed in the table config.
Realtime segments that are being ingested, where some indexes (like range indexes) cannot be used.
Data distribution, especially min and max values for columns, which can affect the query plan.
Given that a Pinot query can touch thousands of segments, Pinot tries to minimize the number of plans shown when explaining a query. By default, Pinot analyzes the plan for each segment and returns a simplified plan. How this simplification is done depends on the query engine; you can read more about that below.
There is a verbose mode that can be used to show the plan for each segment. This mode is activated by setting the explainPlanVerbose query option to true, that is, by prefixing SET explainPlanVerbose=true; to the explain plan statement.
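As a minimal sketch of the prefixing described above, the following Python helper prepends one SET clause per query option to an explain statement. The function name with_query_options is hypothetical and is not part of any Pinot client; only the explainPlanVerbose option and the SET syntax come from the text above.

```python
def with_query_options(explain_statement: str, **options: object) -> str:
    """Prefix a statement with one `SET name=value;` clause per query option.

    Hypothetical helper for illustration only; it just builds a string.
    """
    def render(value: object) -> str:
        # Booleans are written as lowercase true/false, matching the text above.
        if isinstance(value, bool):
            return "true" if value else "false"
        return str(value)

    prefix = "".join(
        f"SET {name}={render(value)}; " for name, value in options.items()
    )
    return prefix + explain_statement


query = "EXPLAIN PLAN FOR SELECT playerID, playerName FROM baseballStats"
verbose_query = with_query_options(query, explainPlanVerbose=True)
print(verbose_query)
```

The resulting string can then be submitted through whatever client is normally used to send queries to the broker.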
Given the more complex nature of the multi-stage query engine, its explain plan can be customized to focus on different aspects of the query execution.
There are 3 different types of explain plans for the multi-stage query engine:
The plan with segments is a detailed representation of the query execution plan that includes segment-specific information, like data distribution, indexes, etc.
This mode was introduced in Pinot 1.3.0 and it is planned to be the default in future releases. Meanwhile, it can be used by setting the explainAskingServers query option to true, that is, by prefixing SET explainAskingServers=true; to the explain plan statement. Alternatively, this mode can be activated by default by changing the broker configuration pinot.query.multistage.explain.include.segment.plan to true.
Independently of how it is activated, once this mode is enabled, the EXPLAIN PLAN FOR syntax will include segment information.
As explained in Different plans for different segments, by default Pinot tries to minimize the number of plans shown when explaining a query. In multi-stage, the brief mode includes every distinct plan, but equivalent plans are aggregated. For example, if the same plan is executed on 100 segments, the brief mode will show it only once and stats like the number of docs will be summed.
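The aggregation performed by the brief mode can be pictured with a small sketch. This is an illustration of the idea only, not Pinot's implementation: how Pinot decides that two plans are equivalent and which stats it merges are not described here, so the sketch assumes plans are compared as plain strings and that only a document count is summed. The segment names, plan texts, and numbers are made up.

```python
def brief_plans(per_segment):
    """Collapse per-segment plans into one entry per distinct plan.

    per_segment: iterable of (segment_name, plan_text, num_docs) tuples.
    Returns a list of (plan_text, segment_count, total_docs) tuples,
    in the order each distinct plan was first seen.
    """
    totals = {}  # plan_text -> (segment_count, total_docs)
    for _segment, plan, docs in per_segment:
        count, total = totals.get(plan, (0, 0))
        totals[plan] = (count + 1, total + docs)
    return [(plan, count, total) for plan, (count, total) in totals.items()]


# Hypothetical per-segment plans, as the verbose mode would list them.
segments = [
    ("seg_0", "plan using index", 1000),
    ("seg_1", "plan using index", 2500),
    ("seg_2", "plan with full scan", 400),
]
summary = brief_plans(segments)
for plan, count, total in summary:
    print(f"{plan}: {count} segment(s), {total} docs")
```

In this toy input the two segments that share a plan collapse into a single entry with their document counts added together, while the third segment keeps its own entry.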
In the verbose mode, one plan is shown per segment, including the segment name and all the segment-specific information. This can help identify which segments are not using indexes or which segments have a different data distribution.
Returns
The logical plan is a high-level representation of the query execution plan. This plan is calculated on the broker without asking the servers for their segment-specific plans. This means that the logical plan does not include segment-specific information, like data distribution, indexes, etc.
In Pinot 1.3.0, the logical plan is enabled by default and can be obtained by using the EXPLAIN PLAN FOR syntax. Optionally, the segment plan can be enabled by default, in which case the logical plan can be obtained by using the EXPLAIN PLAN WITHOUT IMPLEMENTATION FOR syntax.
Returns:
The workers plan is a detailed representation of the query execution plan that includes information on how the query is distributed among different servers and the workers inside them. This plan does not include segment-specific information, like data distribution, indexes, etc., and it is probably the least useful of the plans for normal use cases.
Its main use case is to try to reduce data shuffling between workers by verifying that, for example, a join is executed in a colocated fashion.
Returns:
The explain plan for the single-stage query engine is simpler and less customizable, but returns the information in a tabular format. For example, the query EXPLAIN PLAN FOR SELECT playerID, playerName FROM baseballStats returns the following table:
Here, the Operator column describes the operator that Pinot will run, whereas the Operator_Id and Parent_Id columns show the parent-child relationship between operators, which forms the execution tree. For example, the plan above should be understood as:
The explain plan for the single-stage query engine is described in depth in
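To make the role of the Operator_Id and Parent_Id columns concrete, the following sketch rebuilds an indented execution tree from rows of a tabular plan. It is a minimal illustration, assuming each row is available as an (Operator, Operator_Id, Parent_Id) tuple. The operator names and ids below are placeholders, not real Pinot output, and a row is treated as a root whenever its Parent_Id matches no Operator_Id in the table.

```python
def render_tree(rows):
    """Render (operator, operator_id, parent_id) rows as an indented tree."""
    known_ids = {op_id for _, op_id, _ in rows}
    children = {}  # parent operator id -> list of (operator, operator_id)
    roots = []
    for operator, op_id, parent_id in rows:
        if parent_id in known_ids:
            children.setdefault(parent_id, []).append((operator, op_id))
        else:
            # No row has this parent id, so this operator is a root.
            roots.append((operator, op_id))

    lines = []

    def walk(node, depth):
        operator, op_id = node
        lines.append("  " * depth + operator)
        for child in children.get(op_id, []):
            walk(child, depth + 1)

    for root in roots:
        walk(root, 0)
    return "\n".join(lines)


# Placeholder rows for illustration only.
rows = [
    ("OPERATOR_A", 0, -1),
    ("OPERATOR_B", 1, 0),
    ("OPERATOR_C", 2, 1),
    ("OPERATOR_D", 3, 1),
]
tree = render_tree(rows)
print(tree)
```

Each level of indentation corresponds to one parent-child edge, so operators that share a Parent_Id appear as siblings under the same parent.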
| Mode | Syntax by default | Syntax if segment plan is enabled | Description |
| --- | --- | --- | --- |
| Segment plan | SET explainAskingServers=true; EXPLAIN PLAN FOR | EXPLAIN PLAN FOR | Includes the segment-specific information (like indexes). |
| Logical plan | EXPLAIN PLAN FOR or EXPLAIN PLAN WITHOUT IMPLEMENTATION FOR | EXPLAIN PLAN WITHOUT IMPLEMENTATION FOR | Simplest multi-stage plan. No index or data shuffle information. |
| Workers plan | EXPLAIN IMPLEMENTATION PLAN FOR | EXPLAIN IMPLEMENTATION PLAN FOR | Used to understand data shuffle between servers. Note: The name of this mode is open to discussion and may change in the future. |
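The modes listed above can be encoded in a small lookup that returns the statement prefix for a given plan type. This is a sketch only: the function name explain_prefix, the mode keys, and the segment_plan_default flag (standing in for the broker configuration that enables the segment plan by default) are hypothetical; the returned syntax strings are the ones listed above.

```python
def explain_prefix(mode: str, segment_plan_default: bool = False) -> str:
    """Return the explain prefix for a multi-stage plan mode.

    mode: "segment", "logical" or "workers" (hypothetical keys).
    segment_plan_default: True if the broker includes the segment plan by default.
    """
    if mode == "segment":
        if segment_plan_default:
            return "EXPLAIN PLAN FOR"
        return "SET explainAskingServers=true; EXPLAIN PLAN FOR"
    if mode == "logical":
        # Listed for both configurations, so it is safe regardless of the default.
        return "EXPLAIN PLAN WITHOUT IMPLEMENTATION FOR"
    if mode == "workers":
        return "EXPLAIN IMPLEMENTATION PLAN FOR"
    raise ValueError(f"unknown explain mode: {mode!r}")


select = "SELECT playerID, playerName FROM baseballStats"
segment_query = f"{explain_prefix('segment')} {select}"
print(segment_query)
```

Choosing EXPLAIN PLAN WITHOUT IMPLEMENTATION FOR for the logical plan keeps the helper independent of the broker default, since plain EXPLAIN PLAN FOR changes meaning once the segment plan is enabled.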