Last updated
Colocated joins are a strategy Pinot can use to execute joins without shuffling data between servers. The strategy applies when all of the following conditions are met:
Both tables are partitioned.
The partition function is the same.
The number of partitions is the same.
The join condition is an equality between the partition columns.
The assignment of partitions to servers is the same.
For each partition, there is a server that has all the segments of the partition for both tables.
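The conditions above can be sketched as a small check. This is only an illustration: PartitionInfo and can_colocate are hypothetical names, not part of any Pinot API, and the sketch assumes each table's metadata has already been summarized as the set of servers that hold every segment of each partition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PartitionInfo:
    """Hypothetical summary of one table's partitioning (not a Pinot class)."""
    column: str          # partition column
    function: str        # partition function name
    num_partitions: int
    full_servers: dict   # partition id -> servers holding ALL its segments


def can_colocate(left: Optional[PartitionInfo], right: Optional[PartitionInfo],
                 join_left_col: str, join_right_col: str) -> bool:
    """Return True when an equi-join on the given columns meets every condition."""
    # Both tables must be partitioned (None models an unpartitioned table).
    if left is None or right is None:
        return False
    # Same partition function and same number of partitions.
    if left.function != right.function:
        return False
    if left.num_partitions != right.num_partitions:
        return False
    # The join must be an equality between the partition columns.
    if (join_left_col, join_right_col) != (left.column, right.column):
        return False
    # Each partition needs a server that has all its segments for both tables.
    for pid in range(left.num_partitions):
        common = left.full_servers.get(pid, set()) & right.full_servers.get(pid, set())
        if not common:
            return False
    return True
```

Modelling the last two conditions as a per-partition set intersection keeps the check independent of how many replicas each table has.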
As an example, imagine the following scenario where we have tables A and B partitioned in a compatible way and distributed in such a way that:
The 1st partition of A is formed by segments A0-1 and A0-2, which are stored on servers 1 and 3.
The 2nd partition of A is formed by segments A1-1 and A1-2, which are stored on server 2.
The 1st partition of B is formed by segments B0-1 and B0-2, which are stored on server 3.
The 2nd partition of B is formed by segments B1-1 and B1-2, which are stored on server 2.
If we execute the following query:
Then Pinot would try to execute the query in the following way:
Colocated join optimization is disabled by default in Pinot 1.3.0.
It can be enabled cluster-wide by setting the following configuration in the broker:
It can also be enabled/disabled on a per-query basis by setting the following query option:
As noted above, colocated joins require the assignment of partitions to servers to be the same for both tables. Although we can manually assign partitions to servers when creating the tables, partitions can be moved between servers at any time as a result of a rebalance.
As explained, the main advantage of this optimization is that data doesn't need to be shuffled to execute the join. This can be verified with the rawMessages and inMemoryMessages stats on the mailbox send operator for this stage. All messages should be counted as inMemoryMessages, and rawMessages should be 0 (or not listed at all).
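The stats check described above could be automated along these lines. This is a sketch under assumptions: it supposes the stage stats are available as a nested dictionary in which each operator has a "type" key and a "children" list, and that mailbox send operators are typed "MAILBOX_SEND". Only the rawMessages and inMemoryMessages names come from the text above; the rest of the shape is hypothetical and should be adapted to the actual response.

```python
def shuffling_sends(stats):
    """Yield mailbox send operators that report a non-zero rawMessages count.

    Assumes a hypothetical tree of dicts with "type" and "children" keys.
    A missing rawMessages entry is treated as 0, as described in the text.
    """
    if stats.get("type") == "MAILBOX_SEND" and stats.get("rawMessages", 0) > 0:
        yield stats
    for child in stats.get("children", []):
        yield from shuffling_sends(child)


# Hypothetical stats tree used only to exercise the function.
sample = {
    "type": "MAILBOX_RECEIVE",
    "children": [
        {"type": "MAILBOX_SEND", "inMemoryMessages": 4, "children": []},
        {"type": "MAILBOX_SEND", "inMemoryMessages": 1, "rawMessages": 2, "children": []},
    ],
}
offenders = list(shuffling_sends(sample))
```

An empty result would indicate that every mailbox send operator in the tree exchanged data in memory only.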
Another way to verify that this optimization is being applied is to use the EXPLAIN IMPLEMENTATION PLAN command. In its output, MAIL_SEND operators are decorated with [PARTITIONED], and each MAIL_SEND sends its data to another worker on the same server.
Note that this optimization cannot be seen in the normal EXPLAIN PLAN command.
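A quick way to scan a saved plan for this decoration is a plain text filter. The sketch below assumes the plan is available as a string with one operator per line. The sample lines are simplified placeholders, not real Pinot output, and undecorated_sends is a hypothetical helper name.

```python
def undecorated_sends(plan_text):
    """Return MAIL_SEND lines that lack the [PARTITIONED] decoration."""
    return [
        line.strip()
        for line in plan_text.splitlines()
        if "MAIL_SEND" in line and "[PARTITIONED]" not in line
    ]


# Placeholder plan text (NOT actual EXPLAIN IMPLEMENTATION PLAN output).
sample_plan = """
worker-0 MAIL_SEND[PARTITIONED]
worker-1 MAIL_SEND
worker-1 JOIN
"""
missing = undecorated_sends(sample_plan)
```

If the returned list is non-empty, at least one send operator is not using the partitioned exchange.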
As a side effect, this strategy may not use as many servers as other techniques. For example, the same query executed with another join strategy may use 3 servers, while in this case Pinot can only use servers 2 and 3. Server 1 cannot be used because it does not hold the segments of the matching partition of table B.
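The server selection in the example scenario can be reproduced with a few set operations. This is a minimal sketch, not Pinot's actual planner logic. It assumes that every server listed for a partition holds a replica of each of that partition's segments, and it uses partition ids 0 and 1 for the 1st and 2nd partitions. The function names are hypothetical.

```python
# table -> partition id -> segment -> servers hosting a replica (from the example)
placement = {
    "A": {0: {"A0-1": {1, 3}, "A0-2": {1, 3}}, 1: {"A1-1": {2}, "A1-2": {2}}},
    "B": {0: {"B0-1": {3}, "B0-2": {3}}, 1: {"B1-1": {2}, "B1-2": {2}}},
}


def servers_with_full_partition(segments):
    """Servers that host every segment of one partition."""
    return set.intersection(*segments.values())


def colocated_servers(placement, left, right):
    """Map each partition id to the servers able to join it locally."""
    result = {}
    for pid in placement[left]:
        left_servers = servers_with_full_partition(placement[left][pid])
        right_servers = servers_with_full_partition(placement[right][pid])
        result[pid] = left_servers & right_servers
    return result


print(colocated_servers(placement, "A", "B"))  # prints {0: {3}, 1: {2}}
```

Server 1 never appears in the result because it hosts segments of table A only.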
To guarantee that colocated joins can be used, it is recommended to instruct Pinot to use the same servers for each partition in both tables. To read more about how to partition a table, see the Pinot documentation.