Batch ingestion of data into Apache Pinot using dimension tables.
Dimension tables are a special kind of offline table whose data can be looked up via the lookup UDF, providing join-like functionality.
Dimension tables are replicated on all the hosts for a given tenant to allow faster lookups.
To mark an offline table as a dimension table, set isDimTable to true and segmentsConfig.segmentPushType to REFRESH in the table config.
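As a rough illustration, a table config with these two settings might look like the fragment below. This is a minimal sketch, not a complete config: the table name dimTeams is hypothetical, the quota.storage value is an illustrative placeholder, and other fields a real table config needs (tenants, index config, and so on) are omitted for brevity.

```json
{
  "tableName": "dimTeams",
  "tableType": "OFFLINE",
  "isDimTable": true,
  "segmentsConfig": {
    "segmentPushType": "REFRESH"
  },
  "quota": {
    "storage": "200M"
  }
}
```

The two settings named above are isDimTable and segmentPushType; the quota block is shown only because the storage quota is what gets checked against the cluster-wide maximum.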
Because a dimension table is replicated on all the hosts, it must be small in size.
The maximum size quota for a dimension table in a cluster is controlled by the controller.dimTable.maxSize controller property. Table creation will fail if the table's storage quota exceeds this maximum size.