Running Pinot in Production
Requirements
You will need the following in order to run pinot in production:
Hardware for controller/broker/servers as per your load
Working installation of Zookeeper that Pinot can use. We recommend setting aside a path within zookpeer and including that path in pinot.controller.zkStr. Pinot will create its own cluster under this path (cluster name decided by pinot.controller.helixClusterName)
Shared storage mounted on controllers (if you plan to have multiple controllers for the same cluster). Alternatively, an implementation of PinotFS that the Pinot hosts have access to.
HTTP load balancers for spraying queries across brokers (or other mechanism to balance queries)
HTTP load balancers for spraying controller requests (e.g. segment push, or other controller APIs) or other mechanisms for distribution of these requests.
Deploying Pinot
In general, when deploying Pinot services, it is best to adhere to a specific ordering in which the various components should be deployed. This deployment order is recommended in case of the scenario that there might be protocol or other significant differences, the deployments go out in a predictable order in which failure due to these changes can be avoided.
The ordering is as follows:
pinot-controller
pinot-broker
pinot-server
pinot-minion
Managing Pinot
Pinot provides a web-based management console and a command-line utility (pinot-admin.sh
) in order to help provision and manage pinot clusters.
Pinot Management Console
The web based management console allows operations on tables, tenants, segments and schemas. You can access the console via http://controller-host:port/help
. The console also allows you to enter queries for interactive debugging. Here are some screen-shots from the console.
Listing all the schemas in the Pinot cluster:
Rebalancing segments of a table:
Command line utility (pinot-admin.sh)
The command line utility (pinot-admin.sh
) can be generated by running mvn install package -DskipTests -Pbin-dist
in the directory in which you checked out Pinot.
Here is an example of invoking the command to create a pinot segment:
Here is an example of executing a query on a Pinot table:
Monitoring Pinot
Pinot exposes several metrics to monitor the service and ensure that pinot users are not experiencing issues. In this section we discuss some of the key metrics that are useful to monitor. A full list of metrics is available in the Metrics section.
Pinot Server
Missing Segments - NUM_MISSING_SEGMENTS
Number of missing segments that the broker queried for (expected to be on the server) but the server didn’t have. This can be due to retention or stale routing table.
Query latency - TOTAL_QUERY_TIME
Total time to take from receiving to finishing executing the query.
Query Execution Exceptions - QUERY_EXECUTION_EXCEPTIONS
The number of exception which might have occurred during query execution.
Real-time Consumption Status - LLC_PARTITION_CONSUMING
This gives a binary value based on whether low-level consumption is healthy (1) or unhealthy (0). It’s important to ensure at least a single replica of each partition is consuming.
Real-time Highest Offset Consumed - HIGHEST_STREAM_OFFSET_CONSUMED
The highest offset which has been consumed so far.
Pinot Broker
Incoming queries - QUERIES
Queries received by a broker.
Access denied - REQUEST_DROPPED_DUE_TO_ACCESS_ERROR
Queries dropped due to access restrictions.
Partial Responses
Unavailable segments - BROKER_RESPONSES_WITH_UNAVAILABLE_SEGMENTS
Queries with some segments unavailable (no active server hosting them).
Partial servers responded- BROKER_RESPONSES_WITH_PARTIAL_SERVERS_RESPONDED
Queries with not all servers responded.
Processing exceptions - BROKER_RESPONSES_WITH_PROCESSING_EXCEPTIONS
Queries with processing exceptions (including server side exceptions, unavailable segments, partial servers responded).
Groups limit reached - BROKER_RESPONSES_WITH_NUM_GROUPS_LIMIT_REACHED
Queries with
numGroupsLimit
reached. See Grouping Algorithm for more details.
Table QPS quota exceeded - QUERY_QUOTA_EXCEEDED
Binary metric which will indicate when the configured QPS quota for a table is exceeded (1) or if there is capacity remaining (0).
Table QPS quota usage percent - QUERY_QUOTA_CAPACITY_UTILIZATION_RATE
Percentage of the configured QPS quota being utilized.
Pinot Controller
Many of the controller metrics include a table name and thus are dynamically generated in the code. The metrics below point to the classes which generate the corresponding metrics.
To get the real metric name, the easiest route is to spin up a controller instance, create a table with the desired name and look through the generated metrics.
Todo
Give a more detailed explanation of how metrics are generated, how to identify real metrics names and where to find them in the code.
Percent Segments Available - PERCENT_SEGMENTS_AVAILABLE
Percentage of complete online replicas in external view as compared to replicas in ideal state.
Segments in Error State - SEGMENTS_IN_ERROR_STATE
Number of segments in an
ERROR
state for a given table.
Last push delay - Generated in the ValidationMetrics class.
The time in hours since the last time an offline segment has been pushed to the controller.
Percent of replicas up - PERCENT_OF_REPLICAS
Percentage of complete online replicas in external view as compared to replicas in ideal state.
Table storage quota usage percent - TABLE_STORAGE_QUOTA_UTILIZATION
Shows how much of the table’s storage quota is currently being used, metric will a percentage of a the entire quota.