Performance Tuning

Reduce query latency, control resource usage, and scale throughput by tuning routing, scheduling, memory, and real-time ingestion.

Purpose

Pinot query costs vary widely depending on workload characteristics, data distribution, and cluster topology. This section provides the operator-facing knobs for improving latency, throughput, and resource efficiency across a Pinot cluster. Use these guides after your tables are ingesting data and you have baseline metrics to compare against.

Tuning categories

Query routing and fanout

Control how brokers select servers and how many servers participate in each query. Reducing unnecessary fanout is the single highest-leverage tuning action for tail latency.

These pages cover:

  - Partition pruning, time pruning, replica-group routing, single-replica routing, preferred pool routing, and broker tag enforcement
  - Adaptive server selection: routing queries to the fastest available server using latency and in-flight request statistics
  - Broker-side and server-side pruning strategies (time, partition, bloom filter, column value) that skip irrelevant segments
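As a concrete example, segment pruning and replica-group routing are enabled in the `routing` section of the table config. A minimal sketch (field names follow the standard Pinot table config; which pruners apply depends on your table's partitioning and time column):

```json
{
  "routing": {
    "segmentPrunerTypes": ["partition", "time"],
    "instanceSelectorType": "replicaGroup"
  }
}
```

With `replicaGroup` selection, each query fans out to only one replica group rather than all servers hosting the table, which bounds fanout as the cluster grows.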

Query scheduling and resource isolation

Prioritize production traffic over ad-hoc queries and prevent noisy-neighbor problems.

These pages cover:

  - FCFS, bounded FCFS, and token-bucket schedulers for controlling query concurrency on servers
  - Binary-workload and named-workload schedulers with per-workload CPU and memory budgets
  - Heap monitoring and automatic query killing to prevent server out-of-memory crashes
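The scheduler is selected per server through its instance config. A sketch using the shorthand scheduler names from the Pinot docs (treat the exact values as assumptions to verify against your Pinot version):

```properties
# Server config: pick the query scheduler implementation.
# "fcfs" is the default; "bounded_fcfs" caps concurrent queries.
pinot.query.scheduler.name=bounded_fcfs

# Token-bucket alternative, which meters queries by resource tokens:
# pinot.query.scheduler.name=tokenbucket
```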

Memory and storage

Tune how Pinot maps segment data into memory and controls disk I/O.

These pages cover:

  - Configuring posix_madvise hints (RANDOM, SEQUENTIAL, WILL_NEED) for memory-mapped segment files
  - Predicate reordering, streaming segment download, and Netty native TLS and transport
  - Limiting the parallelism of segment download, index rebuild, and StarTree preprocessing to protect server resources
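To make the madvise hints concrete, here is a small OS-level illustration (not Pinot configuration) using Python's `mmap` module, which exposes the same advice flags the kernel applies to memory-mapped files like Pinot segments:

```python
import mmap

length = 16 * mmap.PAGESIZE
# An anonymous mapping stands in for a memory-mapped segment file.
buf = mmap.mmap(-1, length)

# MADV_RANDOM: random access expected (point lookups) -> kernel curbs readahead.
# MADV_SEQUENTIAL: sequential scans expected -> aggressive readahead.
# MADV_WILLNEED: ask the kernel to prefetch these pages now.
buf.madvise(mmap.MADV_RANDOM)
print(f"advised {length} bytes as MADV_RANDOM")
buf.close()
```

Requires Python 3.8+ on a platform with `madvise` (Linux). The same trade-off drives the Pinot setting: RANDOM avoids wasted readahead for point lookups, while the default readahead behavior favors scans.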

Real-time ingestion tuning

Optimize memory, throughput, and segment sizing for tables consuming from streaming sources.

These pages cover:

  - Off-heap allocation, consuming-segment row thresholds, completed-segment placement, the split commit protocol, and RealtimeProvisioningHelper
  - Eliminating ingestion pauses during segment commit by consuming into a new segment in parallel with build and upload
  - Automatically pausing and resuming real-time ingestion when disk utilization exceeds a threshold
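Segment sizing is driven by the flush thresholds in the table's `streamConfigs`. A sketch (key names per the Pinot stream configuration; values are illustrative, and setting the row threshold to 0 defers to size-based flushing in recent versions):

```json
{
  "streamConfigs": {
    "realtime.segment.flush.threshold.rows": "0",
    "realtime.segment.flush.threshold.time": "6h",
    "realtime.segment.flush.threshold.segment.size": "200M"
  }
}
```

Pair this with off-heap allocation on servers (`pinot.server.instance.realtime.alloc.offheap=true`) so consuming-segment buffers do not pressure the JVM heap.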

Where to start

  1. Measure first. Collect baseline query latency (p50, p95, p99), numSegmentsQueried, numDocsScanned, and server CPU/memory utilization before changing any configuration.

  2. Reduce fanout. Enable time and partition pruning, and consider replica-group routing if your table spans many servers.

  3. Right-size consuming segments. Use the RealtimeProvisioningHelper to choose optimal segment sizes and flush thresholds for real-time tables.

  4. Protect production traffic. Enable adaptive server selection and consider workload isolation if ad-hoc queries share the same cluster.

  5. Tune memory mapping. On Linux, experiment with RANDOM madvise if your workload is point-lookup heavy, or keep the default if scans dominate.
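The baseline stats in step 1 are returned as metadata on every broker query response. A hypothetical helper (the function name is ours, not a Pinot API; the field names are standard response metadata) to capture them before and after a tuning change:

```python
# Extract tuning-relevant metadata fields from a Pinot broker query response.
def baseline_stats(response: dict) -> dict:
    keys = (
        "timeUsedMs",
        "numServersQueried",
        "numSegmentsQueried",
        "numDocsScanned",
    )
    return {k: response.get(k) for k in keys}

# Illustrative response metadata (values made up; a real response also
# carries resultTable, numEntriesScannedInFilter, and more).
sample = {
    "timeUsedMs": 42,
    "numServersQueried": 4,
    "numSegmentsQueried": 120,
    "numDocsScanned": 50_000,
}
print(baseline_stats(sample))
```

Comparing `numSegmentsQueried` before and after enabling pruning shows directly how much fanout a change eliminated.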
