Query Routing using Adaptive Server Selection
Adaptive Server Selection is a new routing capability for Pinot Brokers where incoming queries are routed to the best available server instead of following the default round robin approach while choosing servers. With this feature, Brokers will be sensitive to changes on the Servers like GC issues, slowness, network slowness, etc. The broker will thus adaptively route more queries to faster servers and lesser queries to slower servers
How this works
There are two main components:
Stats Collection
Routing using Adaptive Server Selection
Stats Collection
Each broker maintains stats individually for all servers. These stats are collected at the broker during query processing when the query is routed to the servers and after the response is received from the servers. These stats are maintained in-memory. Some of the stats collected at broker per server are as follows:
Number of in-progress / in-flight queries
EWMA (Exponential Weighted Moving Average) for latencies seen by queries
EWMA (Exponential Weighted Moving Average) for number of ongoing queries at any time
Adaptive Routing
When the broker receives a query, it will use the above stats to pick the best available server. This enables the broker to automatically reduces the number of queries it sends to slow servers and increase the number of queries it sends to faster servers. We currently support the following strategies:
NO_OP : Uses the default RoundRobin approach. In other words, this will give existing behavior where stats are not used by broker when picking the servers to route the query to.
NUM_INFLIGHT_REQ : Uses the number of in-flight requests stat to determine the best server
LATENCY : Uses the EWMA latency stat to determine the best server
HYBRID : Uses a combination of in-flight requests and latency to determine the best server
The above strategies works in tandem with the following available Routing mechanisms today:
Balanced Routing
ReplicaGroup Routing
So, a table can be configured to use Balanced or Replica group segment assignment + routing and can still leverage the adaptive server selection feature.
Configs
The configuration for enabling/disabling this feature and the knobs for performance tuning are present at the Broker instance level. The feature is currently turned off by default.
Enabling Stats Collection and Adaptive Routing
To enable Stats Collection, set
pinot.broker.adaptive.server.selector.enable.stats.collection = true
. Note that setting this property alone will only enable stats collection and not perform Adaptive RoutingTo enable an Adaptive Routing Strategy, use one of the following configs. The
HYBRID
strategy works well for most use cases. Unless you are an advanced user, we recommend using theHYBRID
strategy.pinot.broker.adaptive.server.selector.type=HYBRID
pinot.broker.adaptive.server.selector.type=NUM_INFLIGHT_REQ
pinot.broker.adaptive.server.selector.type=LATENCY
Tuning Knobs
The following configs are already set to default values that work well for most usecases. For advanced users, the following knobs are available to tune Adaptive Routing Strategies
Prefix all the below properties with pinot.broker.adaptive.server.selector.