
Explain Plan (Multi-Stage)

This document describes the EXPLAIN PLAN syntax for the multi-stage query engine (v2).


This page is deprecated. Please read Explain plan to get a better understanding of the different explain plans supported by Pinot.

This page explains how to use the EXPLAIN PLAN FOR syntax to obtain different plans for a query in the multi-stage engine. You can read more about how to interpret the plans in Understanding multi-stage explain plans.

Also remember that plans are logical representations of the query execution. Sometimes it is more useful to study the actual stats of the query execution, which are included with each query result. You can read more about how to interpret the stats in Understanding multi-stage stats.

In the single-stage engine (v1), we do not differentiate between logical and physical plans because the structure of the query is fixed; by default, EXPLAIN PLAN explains the physical plan.

In the multi-stage engine we support the EXPLAIN PLAN syntax, mostly following Apache Calcite's EXPLAIN PLAN syntax. Here are several examples:

Explain Logical Plan

Using a standard SSB (Star Schema Benchmark) query as an example:

EXPLAIN PLAN FOR 
select 
  P_BRAND1, sum(LO_REVENUE) 
from ssb_lineorder_1, ssb_part_1
where LO_PARTKEY = P_PARTKEY 
  and P_CATEGORY = 'MFGR#12' 
group by P_BRAND1

The result contains two columns (SQL and PLAN) and one row:

+-----------------------------------|-------------------------------------------------------------|
| SQL                               |PLAN                                                         |
+-----------------------------------|-------------------------------------------------------------|
|"EXPLAIN PLAN FOR                  |"Execution Plan                                              | 
|select                             |LogicalAggregate(group=[{0}], agg#0=[$SUM0($1)])             | 
|  P_BRAND1, sum(LO_REVENUE)        |  PinotLogicalExchange(distribution=[hash[0]])               | 
|from ssb_lineorder_1, ssb_part_1   |    LogicalAggregate(group=[{2}], agg#0=[$SUM0($1)])         | 
|where LO_PARTKEY = P_PARTKEY       |      LogicalJoin(condition=[=($0, $3)], joinType=[inner])   | 
|  and P_CATEGORY = 'MFGR#12'       |        PinotLogicalExchange(distribution=[hash[0]])         | 
|group by P_BRAND1                  |          LogicalProject(LO_PARTKEY=[$12], LO_REVENUE=[$14]) | 
|"                                  |            LogicalTableScan(table=[[ssb_lineorder_1]])      |
|                                   |        PinotLogicalExchange(distribution=[hash[1]])         |
|                                   |          LogicalProject(P_BRAND1=[$3], P_PARTKEY=[$9])      | 
|                                   |            LogicalFilter(condition=[=($4, 'MFGR#12')])      | 
|                                   |              LogicalTableScan(table=[[ssb_part_1]])         |
|                                   |"                                                            |
+-----------------------------------|-------------------------------------------------------------|

Note that all the normal options for EXPLAIN PLAN in Apache Calcite also work in Pinot, exposing extra information such as attributes and types.
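
For example, attribute information can be requested with Calcite's INCLUDING ALL ATTRIBUTES clause. This is a sketch based on Calcite's EXPLAIN grammar, which Pinot mostly follows; the exact output shape is not shown here:

-- Calcite-style option: include all attributes of each plan node
EXPLAIN PLAN INCLUDING ALL ATTRIBUTES FOR
select
  P_BRAND1, sum(LO_REVENUE)
from ssb_lineorder_1, ssb_part_1
where LO_PARTKEY = P_PARTKEY
  and P_CATEGORY = 'MFGR#12'
group by P_BRAND1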

One of the most useful options is AS <format>, which supports the following formats (an example follows the list):

  • JSON, which returns the plan in JSON format. This format is useful for parsing the plan programmatically, and it also provides some extra information that is not present in the default format.

  • XML, which is similar to JSON but in XML format.

  • DOT, which returns the plan in DOT format, which can be visualized with tools like Graphviz. This format is understood by many tools, including stateless online viewers.
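
For instance, a JSON plan for the SSB query above can be requested like this (a sketch of the AS <format> option; the JSON output itself is not reproduced here):

-- Request the logical plan in JSON format (Calcite-style AS <format> option)
EXPLAIN PLAN AS JSON FOR
select
  P_BRAND1, sum(LO_REVENUE)
from ssb_lineorder_1, ssb_part_1
where LO_PARTKEY = P_PARTKEY
  and P_CATEGORY = 'MFGR#12'
group by P_BRAND1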

Explain Implementation Plan

To gather the implementation plan specific to Pinot's internal multi-stage engine operator chain, use the EXPLAIN IMPLEMENTATION PLAN syntax:
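
For example, prefix the same SSB query with EXPLAIN IMPLEMENTATION PLAN (this is the statement shown in the SQL column of the result below):

EXPLAIN IMPLEMENTATION PLAN FOR
select
  P_BRAND1, sum(LO_REVENUE)
from ssb_lineorder_1, ssb_part_1
where LO_PARTKEY = P_PARTKEY
  and P_CATEGORY = 'MFGR#12'
group by P_BRAND1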

+-----------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------|
| SQL                               |PLAN                                                                                                                                                             |
+-----------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------|
|"EXPLAIN IMPLEMENTATION PLAN FOR   |[0]@local:8843 MAIL_RECEIVE(BROADCAST_DISTRIBUTED)                                                                                                               | 
|select                             |├── [1]@local:8432 MAIL_SEND(BROADCAST_DISTRIBUTED)->{[0]@local@{8843,8843}|[0]} (Subtree Omitted)                                                               | 
|  P_BRAND1, sum(LO_REVENUE)        |├── [1]@local:8432 MAIL_SEND(BROADCAST_DISTRIBUTED)->{[0]@local@{8843,8843}|[0]} (Subtree Omitted)                                                               | 
|from ssb_lineorder_1, ssb_part_1   |└── [1]@local:8432 MAIL_SEND(BROADCAST_DISTRIBUTED)->{[0]@local@{8843,8843}|[0]}                                                                                 | 
|where LO_PARTKEY = P_PARTKEY       |    └── [1]@local:8432 AGGREGATE_FINAL                                                                                                                           | 
|  and P_CATEGORY = 'MFGR#12'       |        └── [1]@local:8432 MAIL_RECEIVE(HASH_DISTRIBUTED)                                                                                                        | 
|group by P_BRAND1                  |            ├── [2]@local:8432 MAIL_SEND(HASH_DISTRIBUTED)->{[1]@local@{8432,8843}|[1],[1]@local@{8432,8843}|[2],[1]@local@{8432,8843}|[0]} (Subtree Omitted)    | 
|"                                  |            ├── [2]@local:8432 MAIL_SEND(HASH_DISTRIBUTED)->{[1]@local@{8432,8843}|[1],[1]@local@{8432,8843}|[2],[1]@local@{8432,8843}|[0]} (Subtree Omitted)    |
|                                   |            └── [2]@local:8432 MAIL_SEND(HASH_DISTRIBUTED)->{[1]@local@{8432,8843}|[1],[1]@local@{8432,8843}|[2],[1]@local@{8432,8843}|[0]}                      |
|                                   |                └── [2]@local:8432 AGGREGATE_LEAF                                                                                                                | 
|                                   |                    └── [2]@local:8432 JOIN                                                                                                                      | 
|                                   |                        ├── [2]@local:8432 MAIL_RECEIVE(HASH_DISTRIBUTED)                                                                                        | 
|                                   |                        │   ├── [3]@local:8432 MAIL_SEND(HASH_DISTRIBUTED)->{[2]@local@{8432,8843}|[1],[2]@local@{8432,8843}|[2],[2]@local@{8432,8843}|[0]}      | 
|                                   |                        │   │   └── [3]@local:8432 PROJECT                                                                                                       | 
|                                   |                        │   │       └── [3]@local:8432 TABLE SCAN (ssb_lineorder_1) null                                                                         | 
|                                   |                        │   ├── [3]@local:8432 MAIL_SEND(HASH_DISTRIBUTED)->{[2]@local@{8432,8843}|[1],[2]@local@{8432,8843}|[2],[2]@local@{8432,8843}|[0]}      | 
|                                   |                        │   │   └── [3]@local:8432 PROJECT                                                                                                       | 
|                                   |                        │   │       └── [3]@local:8432 TABLE SCAN (ssb_lineorder_1) null                                                                         | 
|                                   |                        │   └── [3]@local:8432 MAIL_SEND(HASH_DISTRIBUTED)->{[2]@local@{8432,8843}|[1],[2]@local@{8432,8843}|[2],[2]@local@{8432,8843}|[0]}      | 
|                                   |                        │       └── [3]@local:8432 PROJECT                                                                                                       | 
|                                   |                        │           └── [3]@local:8432 TABLE SCAN (ssb_lineorder_1) null                                                                         | 
|                                   |                        └── [2]@local:8432 MAIL_RECEIVE(HASH_DISTRIBUTED)                                                                                        | 
|                                   |                            ├── [4]@local:8432 MAIL_SEND(HASH_DISTRIBUTED)->{[2]@local@{8432,8843}|[1],[2]@local@{8432,8843}|[2],[2]@local@{8432,8843}|[0]}      | 
|                                   |                            │   └── [4]@local:8432 PROJECT                                                                                                       | 
|                                   |                            │       └── [4]@local:8432 FILTER                                                                                                    | 
|                                   |                            │           └── [4]@local:8432 TABLE SCAN (ssb_part_1) null                                                                          | 
|                                   |                            ├── [4]@local:8432 MAIL_SEND(HASH_DISTRIBUTED)->{[2]@local@{8432,8843}|[1],[2]@local@{8432,8843}|[2],[2]@local@{8432,8843}|[0]}      | 
|                                   |                            │   └── [4]@local:8432 PROJECT                                                                                                       | 
|                                   |                            │       └── [4]@local:8432 FILTER                                                                                                    | 
|                                   |                            │           └── [4]@local:8432 TABLE SCAN (ssb_part_1) null                                                                          | 
|                                   |                            └── [4]@local:8432 MAIL_SEND(HASH_DISTRIBUTED)->{[2]@local@{8432,8843}|[1],[2]@local@{8432,8843}|[2],[2]@local@{8432,8843}|[0]}      | 
|                                   |                                └── [4]@local:8432 PROJECT                                                                                                       | 
|                                   |                                    └── [4]@local:8432 FILTER                                                                                                    | 
|                                   |                                        └── [4]@local:8432 TABLE SCAN (ssb_part_1) null                                                                          | 
+-----------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------|

Note that the implementation plan includes information about how many servers were used and how data is shuffled between nodes.

