# Performance Tuning

## Purpose

Pinot query costs vary widely depending on workload characteristics, data distribution, and cluster topology. This section provides the operator-facing knobs for improving latency, throughput, and resource efficiency across a Pinot cluster. Use these guides after your tables are ingesting data and you have baseline metrics to compare against.

## Tuning categories

### Query routing and fanout

Control how brokers select servers and how many servers participate in each query. Reducing unnecessary fanout is the single highest-leverage tuning action for tail latency.

| Page                                                                                                                          | What it covers                                                                                                                 |
| ----------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------ |
| [Optimizing Scatter and Gather](https://docs.pinot.apache.org/operate-pinot/tuning/routing)                                   | Partition pruning, time pruning, replica-group routing, single-replica routing, preferred pool routing, broker tag enforcement |
| [Adaptive Server Selection](https://docs.pinot.apache.org/operate-pinot/tuning/query-routing-using-adaptive-server-selection) | Route queries to the fastest available server using latency and in-flight request stats                                        |
| [Segment Pruning](https://docs.pinot.apache.org/operate-pinot/tuning/segment-pruning)                                         | Broker-side and server-side pruning strategies (time, partition, bloom filter, column value) to skip irrelevant segments       |

### Query scheduling and resource isolation

Prioritize production traffic over ad-hoc queries and prevent noisy-neighbor problems.

| Page                                                                                                              | What it covers                                                                               |
| ----------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------- |
| [Query Scheduling](https://docs.pinot.apache.org/operate-pinot/tuning/query-scheduling)                           | FCFS, bounded FCFS, and token-bucket schedulers for controlling query concurrency on servers |
| [Workload-Based Query Isolation](https://docs.pinot.apache.org/operate-pinot/tuning/workload-query-isolation)     | Binary workload and named-workload schedulers with per-workload CPU and memory budgets       |
| [OOM Protection](https://docs.pinot.apache.org/operate-pinot/tuning/oom-protection-using-automatic-query-killing) | Heap monitoring and automatic query killing to prevent server out-of-memory crashes          |

### Memory and storage

Tune how Pinot maps segment data into memory and controls disk I/O.

| Page                                                                                                                                  | What it covers                                                                                               |
| ------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------ |
| [Tuning Default MMAP Advice](https://docs.pinot.apache.org/operate-pinot/tuning/tuning-default-mmap-advice)                           | Configure `posix_madvise` hints (RANDOM, SEQUENTIAL, WILL\_NEED) for memory-mapped segment files             |
| [Performance Optimization Configurations](https://docs.pinot.apache.org/operate-pinot/tuning/performance-optimization-configurations) | Predicate reordering, streaming segment download, Netty native TLS and transport                             |
| [Segment Operations Throttling](https://docs.pinot.apache.org/operate-pinot/tuning/segment-operations-throttling)                     | Limit parallelism of segment download, index rebuild, and StarTree preprocessing to protect server resources |

### Real-time ingestion tuning

Optimize memory, throughput, and segment sizing for tables consuming from streaming sources.

| Page                                                                                                                                              | What it covers                                                                                                                        |
| ------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
| [Tuning Real-time Performance](https://docs.pinot.apache.org/operate-pinot/tuning/realtime)                                                       | Off-heap allocation, consuming segment row thresholds, completed-segment placement, split commit protocol, RealtimeProvisioningHelper |
| [Pauseless Consumption](https://docs.pinot.apache.org/operate-pinot/tuning/pauseless-consumption)                                                 | Eliminate ingestion pauses during segment commit by consuming into a new segment in parallel with build and upload                    |
| [Pause Ingestion Based on Resource Utilization](https://docs.pinot.apache.org/operate-pinot/tuning/pause-ingestion-based-on-resource-utilization) | Automatically pause and resume real-time ingestion when disk utilization exceeds a threshold                                          |

## Where to start

1. **Measure first.** Collect baseline query latency (p50, p95, p99), `numSegmentsQueried`, `numDocsScanned`, and server CPU/memory utilization before changing any configuration.
2. **Reduce fanout.** Enable time and partition pruning, and consider replica-group routing if your table spans many servers.
3. **Right-size consuming segments.** Use the `RealtimeProvisioningHelper` to choose optimal segment sizes and flush thresholds for real-time tables.
4. **Protect production traffic.** Enable adaptive server selection and consider workload isolation if ad-hoc queries share the same cluster.
5. **Tune memory mapping.** On Linux, experiment with `RANDOM` madvise if your workload is point-lookup heavy, or keep the default if scans dominate.
