Operator Types

Describes the multi-stage operators in general

The multi-stage query engine uses a set of operators to process the query. These operators are based on relational algebra, with some modifications to better fit the distributed nature of the engine.

These operators are the execution units that Pinot uses to execute a query. The operators are executed in a pipeline with tree structure, where each operator consumes the output of the previous operators (also known as upstreams).

Users do not directly specify these operators. Instead they write SQL queries that are translated into a logical plan, which is then transformed into different operators. The logical plan can be obtained using explaining a query, while there is no way to get the operators directly. The closest thing to the operators that users can get is the multi-stage stats output.

Operators vs SQL clauses

These operators are generated from the SQL query that you write, but even they are similar, there is not a one-to-one mapping between the SQL clauses and the operators. Some SQL clauses generate multiple operators, while some operators are generated by multiple SQL clauses.

Operators vs explain plan nodes

Operators and explain plan nodes are closer than SQL clauses and operators. Although most explain plan nodes can be directly mapped to an operator, there are some exceptions:

  • Each PinotLogicalExchange and each PinotLogicalSortExchange explain node is materialized into a pair of mailbox send and mailbox receive operators.

  • All plan nodes that belong to the same leaf stage are executed in the leaf operator.

In general terms, the operators are the execution units that Pinot uses to execute a query and are also known as the multi-stage physical plan, while the explain plan nodes are logical plans. The difference between the two is that the operators can be actually executed, while the explain plan nodes are the logical representation of the query plan.

List of operators

The following is a list of operators that are used by the multi-stage query engine: