# Broker

Brokers handle Pinot queries. They **accept queries from clients and forward them to the right servers**. They collect results back from the servers and **consolidate them into a single response**, to **send back to the client**.

![Broker interaction with other components](https://2944557471-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LtH6nl58DdnZnelPdTc%2F-M1c68aqq6AOJfeKiQKl%2F-M1c97qmI9TI8SSD0-5a%2FBroker%20\(1\).jpg?alt=media\&token=5377fedd-7f4e-4701-a65a-2a45175bdbf3)

Pinot Brokers are modeled as Helix **Spectators**. They need to know the location of each segment of a table (and each replica of the segments) and route requests to the appropriate server that hosts the segments of the table being queried.

The broker ensures that all the rows of the table are queried exactly once so as to return correct, consistent results for a query. The brokers may optimize to **prune some of the segments** as long as accuracy is not sacrificed.

Helix provides the framework by which spectators can learn where each partition of a resource resides (*i.e.* on which participant). The brokers use this mechanism to learn which servers host specific segments of a table.
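
The routing idea described above can be sketched in a few lines. This is a minimal illustration, not Pinot's actual routing code: the `external_view` dictionary, the segment and server names, and the `build_routing` function are all hypothetical. It assumes the broker already knows which servers host a replica of each segment, and it picks exactly one replica per segment so that no row is read twice.

```python
import random
from collections import defaultdict


def build_routing(segment_to_servers, rng=None):
    """Pick one replica per segment and group the segments by chosen server."""
    rng = rng or random.Random()
    plan = defaultdict(list)
    for segment, servers in sorted(segment_to_servers.items()):
        if not servers:
            raise ValueError(f"no online replica for segment {segment}")
        # Any single replica is enough; choosing one guarantees the segment
        # is queried exactly once.
        plan[rng.choice(sorted(servers))].append(segment)
    return dict(plan)


# Hypothetical view of a table: segment name -> servers hosting a replica.
external_view = {
    "seg_0": ["server_1", "server_2"],
    "seg_1": ["server_2", "server_3"],
    "seg_2": ["server_1", "server_3"],
}

plan = build_routing(external_view, random.Random(0))
for server, segments in sorted(plan.items()):
    print(server, segments)
```

Random replica selection is only one possible policy; the sketch uses it because it spreads load without any extra state.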

In the case of hybrid tables, the brokers ensure that the overlap between real-time and offline segment data is queried exactly once, by performing **offline and real-time federation**.

For example, suppose we have real-time data for 5 days, March 23 to March 27, and offline data has been pushed until March 25, which is 2 days behind real-time. The brokers maintain this time boundary.

![](https://2944557471-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LtH6nl58DdnZnelPdTc%2F-M1XsRzA2B-iGY91uNau%2F-M1Y6WPgBfIM-iC7cHq3%2FTimeBoundary.jpg?alt=media\&token=e0983f8b-b14d-48ac-a55b-12cd2664a551)

Suppose we get a query to this table: `select sum(metric) from table`. The broker splits it into two queries based on this time boundary, one for offline and one for real-time: `select sum(metric) from table_OFFLINE where date < Mar 25` and `select sum(metric) from table_REALTIME where date >= Mar 25`.

The broker merges the results from both queries before returning the result to the client.
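
The arithmetic behind this split can be checked with a small sketch. It is an illustration under stated assumptions, not broker code: days of March are represented as plain integers, the per-day metric values are made up, the offline table is assumed to start on March 23, and the function names are hypothetical. It shows that splitting at the boundary counts each day once, while naively summing both tables double-counts the overlap.

```python
BOUNDARY = 25  # time boundary from the example, as a day of March

# Hypothetical per-day metric values; the offline table lags real-time.
offline_rows = {23: 10, 24: 20, 25: 30}
realtime_rows = {23: 10, 24: 20, 25: 30, 26: 40, 27: 50}


def offline_sum(rows, boundary):
    """Equivalent of: select sum(metric) from table_OFFLINE where date < boundary."""
    return sum(value for day, value in rows.items() if day < boundary)


def realtime_sum(rows, boundary):
    """Equivalent of: select sum(metric) from table_REALTIME where date >= boundary."""
    return sum(value for day, value in rows.items() if day >= boundary)


def federated_sum(offline, realtime, boundary):
    """Merge the two partial results, as the broker does for the client."""
    return offline_sum(offline, boundary) + realtime_sum(realtime, boundary)


total = federated_sum(offline_rows, realtime_rows, BOUNDARY)
naive = sum(offline_rows.values()) + sum(realtime_rows.values())

print(total)  # 150: each day from March 23 to March 27 counted once
print(naive)  # 210: March 23 to March 25 counted twice
```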

## Starting a Broker

Make sure you've [set up ZooKeeper](https://docs.pinot.apache.org/release-0.9.0/basics/cluster#setup-a-pinot-cluster). If you're using Docker, make sure to [pull the Pinot Docker image](https://docs.pinot.apache.org/release-0.9.0/basics/cluster#setup-a-pinot-cluster). To start a broker:

{% tabs %}
{% tab title="Docker Image" %}

```
docker run \
    --network=pinot-demo \
    --name pinot-broker \
    -d ${PINOT_IMAGE} StartBroker \
    -zkAddress pinot-zookeeper:2181
```

{% endtab %}

{% tab title="Launcher Script" %}

```
bin/pinot-admin.sh StartBroker \
  -zkAddress localhost:2181 \
  -clusterName PinotCluster \
  -brokerPort 7000
```

{% endtab %}
{% endtabs %}
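
Once a broker is running, clients send it queries over HTTP. The sketch below builds such a request using only the Python standard library. It assumes the launcher-script broker above, reachable at `localhost:7000`, and the broker's `/query/sql` endpoint taking a JSON body with a `sql` field. The `build_query_request` helper is hypothetical, and the host and port would need adjusting for other setups, for example a Docker container whose port is not published.

```python
import json
import urllib.request


def build_query_request(sql, host="localhost", port=7000):
    """Build (but do not send) a POST request to the broker's SQL endpoint."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}/query/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_query_request("select sum(metric) from table")
print(request.get_method(), request.full_url)

# To run the query against a live broker, send the request:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```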
