Explain

Learn more about multi-stage explain plans and how to interpret them.

Multi-stage plans are a bit more complex than single-stage plans. This page explains how to interpret multi-stage explain plans.

As explained in Explaining multi-stage queries, you can use the EXPLAIN PLAN syntax to obtain the logical plan of a query. There are different formats for the output of the EXPLAIN PLAN command, but all of them represent the logical plan of the query.

The query

explain plan for
select customer.c_address, orders.o_shippriority
from customer
join orders
    on customer.c_custkey = orders.o_custkey
limit 10

Can produce the following output:

LogicalSort(offset=[0], fetch=[10])
  PinotLogicalSortExchange(distribution=[hash], collation=[[]], isSortOnSender=[false], isSortOnReceiver=[false])
    LogicalSort(fetch=[10])
      LogicalProject(c_address=[$0], o_shippriority=[$3])
        LogicalJoin(condition=[=($1, $2)], joinType=[inner])
          PinotLogicalExchange(distribution=[hash[1]])
            LogicalProject(c_address=[$4], c_custkey=[$6])
              LogicalTableScan(table=[[default, customer]])
          PinotLogicalExchange(distribution=[hash[0]])
            LogicalProject(o_custkey=[$5], o_shippriority=[$10])
              LogicalTableScan(table=[[default, orders]])

We can see that each node in the tree represents an operation that is executed in the query and each operator has some attributes. For example the LogicalJoin operator has a condition attribute that specifies the join condition and a joinType. Although some of the attributes shown are easy to understand, some of them may require a bit more explanation.

In our example we can see that the LogicalTableScan operator has a table attribute that indicates the table being scanned. The table is represented as a list with two elements: the first one is the schema name (default by default) and the second one is the table name. Attributes like offset and fetch in the LogicalSort operator are also easy to understand. But once we start to see expressions and references like $2 things start to be more complex.

These indexed references are used to reference the positions into the input row for each operator. In order to understand these references we need to look at the operator's children and see which attributes are being referenced. That usually requires going to the leaf operators and seeing which attributes are being generated.

For example, the LogicalTableScan always returns the whole row of the table, so the attributes are the columns of the table. In our example:

         PinotLogicalExchange(distribution=[hash[0]])
            LogicalProject(o_custkey=[$5], o_shippriority=[$10])
              LogicalTableScan(table=[[default, orders]])

We can see that the result of the LogicalTableScan operator is processed by a LogicalProject operator that is selecting the columns o_custkey and o_shippriority. This LogicalProject operator is generating a row with two columns. $5 and $10 are the indexes of the column o_custkey and o_shippriority in the row generated by the LogicalTableScan. Then we can see a PinotLogicalExchange operator that is sending the result to the LogicalJoin operator in the stage downstream. That PinotLogicalExcange is distributing the rows using hash[0], which means to use the hash of the first column returned by LogicalProject. As we saw before, that first column is the o_custkey column, so the rows are distributed by the o_custkey column.

The LogicalJoin operator is receiving the rows from the two stages upstream. It is not clearly said anywhere, but the virtual row seen by the join operator is the concatenation of the rows sent by the first stage (aka left hand size) plus the rows sent by the second stage (aka right hand side).

The first stage is sending the c_address and c_custkey columns and the second stage is sending the o_custkey and o_shippriority columns. Therefore the join operator is consuming a row with the columns [c_address, c_custkey, o_custkey, o_shippriority]. The LogicalJoin operator is joining the rows using the condition =($1, $2), which means that it is joining the rows using the c_custkey and o_custkey columns and comparing them by equality. LogicalJoin can generate new rows, but does not modify the virtual columns. Therefore this join is sending rows with the columns [c_address, c_custkey, o_custkey, o_shippriority] to its downstream.

This downstream is the LogicalProject operator that is selecting the columns $0 and $3 from the rows sent by the join operator. Therefore the resulting row contains the columns c_address and o_shippriority.

The rest of the operators are easier to read. Something that can be surprising is the LogicalSort operator. In the SQL query used as example there was no order by, but the LogicalSort operator is present in the plan. This is because in relational algebra a sort is always needed to limit the rows. In this case the LogicalSort operator is limiting the rows to 10 without specifying a sort condition, so it is not really sorting the rows (which may be expensive). The corollary is that a LogicalSort operator does not imply that an actual sort is being executed.

Last updated