Dimension table
Batch ingestion of data into Apache Pinot using dimension tables.
Dimension tables are a special kind of offline table from which data can be looked up via the lookup UDF, providing join-like functionality.
Dimension tables are replicated on all the hosts for a given tenant to allow faster lookups.
To mark an offline table as a dimension table, set isDimTable to true and segmentsConfig.segmentPushType to REFRESH in the table config.
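A minimal sketch of such a table config follows. The table name dimBaseballTeams and the 200M storage quota are illustrative assumptions, not values taken from this page, and a real config will usually need further fields such as tenant and index settings.

```json
{
  "tableName": "dimBaseballTeams",
  "tableType": "OFFLINE",
  "isDimTable": true,
  "segmentsConfig": {
    "segmentPushType": "REFRESH"
  },
  "quota": {
    "storage": "200M"
  }
}
```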
Because dimension tables are used to look up dimension values, they must have a primary key, which can be a composite key.
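The primary key is declared in the table's schema through primaryKeyColumns. The fragment below is a sketch that assumes a hypothetical dimBaseballTeams schema with two columns, teamID and teamName; these names are illustrative only.

```json
{
  "schemaName": "dimBaseballTeams",
  "dimensionFieldSpecs": [
    { "name": "teamID", "dataType": "STRING" },
    { "name": "teamName", "dataType": "STRING" }
  ],
  "primaryKeyColumns": ["teamID"]
}
```

A composite key would list more than one column in primaryKeyColumns.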
When a table is marked as a dimension table, it will be replicated on all the hosts, which means that these tables must be small in size.
The maximum size quota for a dimension table in a cluster is controlled by the controller.dimTable.maxSize controller property. Table creation will fail if the storage quota exceeds this maximum size.
A dimension table cannot be part of a hybrid table.