OFFLINE represents the instances hosting the segments for the offline table;
CONSUMING represents the instances hosting the consuming segments for the real-time table;
COMPLETED represents the instances hosting the completed segments for the real-time table. For a real-time table, if
COMPLETED instances are not configured, completed segments will use the same instance assignment strategy as the consuming segments. If they are configured, completed segments will be automatically moved to the
_REALTIME) denotes the type of table the server is going to serve. Each server can have multiple tags if necessary.
tag within the InstanceAssignmentConfig for the table as shown below. Only the servers with this tag will be assigned to host this table, and the table will use the Balanced Segment Assignment.
numInstances in the InstanceAssignmentConfig. This is useful when we want to serve multiple tables of different sizes on the same set of servers. For example, if 30 servers host hundreds of tables for different analytics, we don't want to use all 30 servers for each table, especially for tiny tables with only megabytes of data.
numInstancesPerReplicaGroup in the InstanceAssignmentConfig, and Pinot will assign the instances accordingly.
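As a rough illustration, the entry for a replica-group based assignment could be assembled as follows. This is a minimal sketch, not a reference: the helper `replica_group_assignment_config` and the tag value `Tag1_OFFLINE` are hypothetical, and the field names (`tagPoolConfig`, `replicaGroupPartitionConfig`, `replicaGroupBased`, `numReplicaGroups`) are assumed from common Pinot table configs, so check them against the Pinot version in use.

```python
import json


def replica_group_assignment_config(tag, num_replica_groups, num_instances_per_replica_group):
    """Build one InstanceAssignmentConfig entry for replica-group based assignment.

    Hypothetical helper; the field names are assumptions to verify against your Pinot version.
    """
    return {
        # Only servers carrying this tag are candidates for the table.
        "tagPoolConfig": {"tag": tag},
        "replicaGroupPartitionConfig": {
            "replicaGroupBased": True,
            "numReplicaGroups": num_replica_groups,
            "numInstancesPerReplicaGroup": num_instances_per_replica_group,
        },
    }


# Example: 2 replica-groups of 3 servers each for the offline table.
config = {
    "instanceAssignmentConfigMap": {
        "OFFLINE": replica_group_assignment_config("Tag1_OFFLINE", 2, 3)
    }
}
print(json.dumps(config, indent=2))
```

The printed JSON is the fragment that would sit inside the table config; the rest of the table config is omitted here.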
numInstancesPerPartition in the InstanceAssignmentConfig can fulfill the requirement.
numPartitions configured here does not have to match the actual number of partitions for the table, in case the partitions of the table change for some reason. If they do not match, the table partitions will be assigned to the server partitions in a round-robin fashion. For example, if there are 2 server partitions but 4 table partitions, table partitions 1 and 3 will be assigned to server partition 1, and table partitions 2 and 4 will be assigned to server partition 2.)
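The round-robin mapping in the example above can be sketched in a few lines. This is an illustration only: the function name `server_partition_for` is hypothetical, and partitions are numbered from 1 to match the example in the text, which may differ from the ids Pinot uses internally.

```python
def server_partition_for(table_partition, num_server_partitions):
    """Map a 1-based table partition to a 1-based server partition, round-robin."""
    if table_partition < 1 or num_server_partitions < 1:
        raise ValueError("partition numbers and counts must be positive")
    return (table_partition - 1) % num_server_partitions + 1


# 4 table partitions spread over 2 server partitions, as in the example.
mapping = {p: server_partition_for(p, 2) for p in range(1, 5)}
print(mapping)
```

With these inputs, table partitions 1 and 3 land on server partition 1, and table partitions 2 and 4 land on server partition 2.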
numPartitions the same as the number of stream partitions, and
numInstancesPerPartition of 1, and we don't allow configuring them explicitly. The replica-group based instance assignment can still be configured explicitly.
poolBased in the InstanceAssignmentConfig. All the tables in this cluster should use the Replica-Group Instance Assignment, and Pinot will assign servers from different pools to each replica-group of the table. It is guaranteed that servers within one pool only host one replica of any table, and it is okay to shut down all servers within one pool without bringing down any table. This can significantly reduce the deployment time of the cluster: the 100 servers in the above example can be restarted in 2 rounds (less than an hour) instead of 100 rounds (days).
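The guarantee described above can be illustrated with a small simulation. This is a simplified sketch, not Pinot's actual selection logic: the function `assign_replica_groups`, the server names, and the layout of 2 pools with 50 servers each are assumptions chosen to match the 100-server, 2-round figures in the text.

```python
def assign_replica_groups(pools, num_replica_groups, num_instances_per_replica_group):
    """Give each replica-group servers from its own pool (simplified, hypothetical logic)."""
    pool_ids = sorted(pools)
    if num_replica_groups > len(pool_ids):
        raise ValueError("need at least one pool per replica-group")
    assignment = {}
    for replica_group in range(num_replica_groups):
        servers = pools[pool_ids[replica_group]]
        if len(servers) < num_instances_per_replica_group:
            raise ValueError("pool has too few servers for a replica-group")
        # Take the first servers of the pool; a real assigner would balance load.
        assignment[replica_group] = servers[:num_instances_per_replica_group]
    return assignment


# Assumed layout: 100 servers split into 2 pools of 50.
pools = {
    0: [f"server-{i}" for i in range(50)],
    1: [f"server-{i}" for i in range(50, 100)],
}
assignment = assign_replica_groups(pools, num_replica_groups=2, num_instances_per_replica_group=3)

# One pool can be restarted at a time, so the number of rounds equals the number of pools.
restart_rounds = len(pools)
print(assignment, restart_rounds)
```

Because each replica-group draws from a different pool, taking down every server in one pool still leaves a full replica of the table available in the other.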