Table

Top-level fields

Property

Description

tableName

Specifies the name of the table. It should contain only alphanumeric characters, hyphens ('-'), or underscores ('_'). A double underscore ('__') is not allowed; it is reserved for other features within Pinot.

tableType

Second level fields

The following properties can be nested inside the top-level configs.

Quota
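As an illustration, the sample offline table config on this page sets a query-rate limit and a storage limit like this. The values are examples only, not recommendations:

```json
{
  "quota": {
    "maxQueriesPerSecond": 300,
    "storage": "140G"
  }
}
```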

Routing
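As an illustration, the sample offline table config on this page enables partition-based segment pruning and replica-group instance selection like this:

```json
{
  "routing": {
    "segmentPrunerTypes": ["partition"],
    "instanceSelectorType": "replicaGroup"
  }
}
```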

Query

Segments Config

Table Index Config

Field Config List

Specifies the columns and the type of index to create on each of them. Currently, only text search columns can be specified using this property. The remaining index types will be migrated to this field in future releases.
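A minimal sketch of a field config entry for a text index. The column name `text_col_1` is hypothetical, and the `properties` entry uses the `enableQueryCacheForTextIndex` property described later on this page:

```json
{
  "fieldConfigList": [
    {
      "name": "text_col_1",
      "encodingType": "RAW",
      "indexType": "TEXT",
      "properties": {
        "enableQueryCacheForTextIndex": "true"
      }
    }
  ]
}
```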

Realtime Table Config

The following sections apply only to realtime tables.

segmentsConfig

Indexing config

Below is the list of fields in the streamConfigs section.

IndexingConfig -> streamConfig has been deprecated starting 0.7.0 or commit 9eaea9. Use IngestionConfig -> StreamIngestionConfig -> streamConfigMaps instead.

All configurations prefixed with the streamType are expected to be used by the underlying stream. For example, with streamType set to kafka, any configuration supported by Kafka can be set using the prefix stream.kafka, and Kafka is expected to honor it.
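A minimal sketch of the non-deprecated location, assuming a Kafka stream. The topic name and broker address are placeholders, and the lowerCamelCase key names are assumed to mirror the IngestionConfig -> StreamIngestionConfig -> streamConfigMaps path named above:

```json
{
  "ingestionConfig": {
    "streamIngestionConfig": {
      "streamConfigMaps": [
        {
          "streamType": "kafka",
          "stream.kafka.topic.name": "myTopic",
          "stream.kafka.broker.list": "localhost:9092",
          "stream.kafka.consumer.prop.auto.offset.reset": "largest"
        }
      ]
    }
  }
}
```

Every key that starts with `stream.kafka` is intended for the Kafka consumer, following the streamType prefix rule.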

Example

Here is a minimal example of what the streamConfigs section may look like:

0.6.0 onwards:
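A minimal sketch for 0.6.0 and later, assuming a Kafka stream. The topic name is a placeholder, and the row-threshold key `realtime.segment.flush.threshold.rows` is an assumption that is not spelled out elsewhere on this page:

```json
{
  "streamConfigs": {
    "realtime.segment.flush.threshold.rows": "0",
    "realtime.segment.flush.threshold.time": "24h",
    "realtime.segment.flush.threshold.segment.size": "150M",
    "streamType": "kafka",
    "stream.kafka.topic.name": "myTopic",
    "stream.kafka.consumer.prop.auto.offset.reset": "largest"
  }
}
```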

0.5.0 and prior:
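A minimal sketch for 0.5.0 and earlier, using the deprecated size keys that also appear in the sample realtime table config on this page. The topic name is a placeholder:

```json
{
  "streamConfigs": {
    "realtime.segment.flush.threshold.size": "0",
    "realtime.segment.flush.threshold.time": "24h",
    "realtime.segment.flush.desired.size": "150M",
    "streamType": "kafka",
    "stream.kafka.topic.name": "myTopic",
    "stream.kafka.consumer.prop.auto.offset.reset": "largest"
  }
}
```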

Tenants

Example
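A minimal sketch of the tenants section, reusing the tenant names from the sample table configs on this page. Both names are placeholders for tenants that would already exist in the cluster:

```json
{
  "tenants": {
    "broker": "myBrokerTenant",
    "server": "myServerTenant"
  }
}
```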

Environment Variables Override

Pinot allows users to define environment variables in the format ${ENV_NAME} or ${ENV_NAME:DEFAULT_VALUE} as field values in the table config.

The Pinot instance substitutes the actual value at runtime.

Brackets are required when defining the environment variable. "$ENV_NAME" is not supported.

Environment variables used without a default value in the table config must be available to all Pinot components: Controller, Broker, Server, and Minion. Otherwise, querying or consumption is affected, depending on which service lacks the variable.

Below is an example of setting AWS credentials as part of the table config using environment variables.

Example:
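A minimal sketch, assuming a batch ingestion config that reads from S3. The environment variable names (`AWS_ACCESS_KEY`, `AWS_SECRET_KEY`, `AWS_REGION`) and the bucket path are hypothetical; the region entry shows the ${ENV_NAME:DEFAULT_VALUE} form:

```json
{
  "ingestionConfig": {
    "batchIngestionConfig": {
      "segmentIngestionType": "APPEND",
      "segmentIngestionFrequency": "DAILY",
      "batchConfigMaps": [
        {
          "inputDirURI": "s3://<my-bucket>/rawdata",
          "inputFormat": "csv",
          "input.fs.className": "org.apache.pinot.plugin.filesystem.S3PinotFS",
          "input.fs.prop.region": "${AWS_REGION:us-west-2}",
          "input.fs.prop.accessKey": "${AWS_ACCESS_KEY}",
          "input.fs.prop.secretKey": "${AWS_SECRET_KEY}"
        }
      ]
    }
  }
}
```

Because the two credential variables have no default, they would need to be set on every Pinot component that reads this table config.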

Sample Configurations

Offline Table

Realtime Table

Here's an example table config for a realtime table. All the fields from the offline table config are valid for the realtime table. Additionally, realtime tables use some extra fields.

segmentPartitionConfig

The map from column to partition function, which indicates how the segment is partitioned.

Currently 4 types of partition functions are supported:

Murmur - murmur2 hash function

Modulo - modulo on integer values

HashCode - java hashCode() function

ByteArray

realtime.segment.flush.threshold.segment.size (0.6.0 onwards)

realtime.segment.flush.desired.size (0.5.0 and prior; deprecated)

Desired size of the completed segments. The value can be given as a human-readable string such as 150M or 1.1G. It is used only when realtime.segment.flush.threshold.size is set to 0. The default is 200M (200 megabytes).

JSON of key-value pairs containing additional properties associated with the index. The following properties are currently supported:

enableQueryCacheForTextIndex - set to true to enable caching for text index in Lucene
rawIndexWriterVersion
deriveNumDocsPerChunkForRawIndex

pinot-table-offline.json

{
  "OFFLINE": {
    "tableName": "pinotTable",
    "tableType": "OFFLINE",
    "quota": {
      "maxQueriesPerSecond": 300,
      "storage": "140G"
    },
    "routing": {
      "segmentPrunerTypes": ["partition"],
      "instanceSelectorType": "replicaGroup"
    },
    "segmentsConfig": {
      "schemaName": "pinotTable",
      "timeColumnName": "daysSinceEpoch",
      "timeType": "DAYS",
      "allowNullTimeValue": false,
      "replication": "3",
      "retentionTimeUnit": "DAYS",
      "retentionTimeValue": "365",
      "segmentPushFrequency": "DAILY",
      "segmentPushType": "APPEND"
    },
    "tableIndexConfig": {
      "invertedIndexColumns": ["foo", "bar", "moo"],
      "createInvertedIndexDuringSegmentGeneration": false,
      "sortedColumn": ["pk"],
      "bloomFilterColumns": [],
      "starTreeIndexConfigs": [],
      "noDictionaryColumns": [],
      "rangeIndexColumns": [],
      "onHeapDictionaryColumns": [],
      "varLengthDictionaryColumns": [],
      "segmentPartitionConfig": {
        "pk": {
          "functionName": "Murmur",
          "numPartitions": 32
        }
      },
      "loadMode": "MMAP",
      "columnMinMaxValueGeneratorMode": null,
      "nullHandlingEnabled": false
    },
    "tenants": {
      "broker": "myBrokerTenant",
      "server": "myServerTenant"
    },
    "ingestionConfig": {
      "filterConfig": {
        "filterFunction": "Groovy({foo == \"VALUE1\"}, foo)"
      },
      "transformConfigs": [{
        "columnName": "bar",
        "transformFunction": "lower(moo)"
      },
      {
        "columnName": "hoursSinceEpoch",
        "transformFunction": "toEpochHours(millis)"
      }]
    },
    "metadata": {
      "customConfigs": {
        "key1": "value1",
        "key2": "value2"
      }
    }
  }
}

pinot-table-realtime.json

{
  "REALTIME": {
    "tableName": "pinotTable",
    "tableType": "REALTIME",
    "segmentsConfig": {
      "schemaName": "pinotTable",
      "timeColumnName": "daysSinceEpoch",
      "timeType": "DAYS",
      "allowNullTimeValue": true,
      "replicasPerPartition": "3",
      "retentionTimeUnit": "DAYS",
      "retentionTimeValue": "5",
      "segmentPushType": "APPEND",
      "completionConfig": {
        "completionMode": "DOWNLOAD"
      }
    },
    "tableIndexConfig": {
      "invertedIndexColumns": ["foo", "bar", "moo"],
      "sortedColumn": ["column1"],
      "noDictionaryColumns": ["metric1", "metric2"],
      "loadMode": "MMAP",
      "aggregateMetrics": true,
      "nullHandlingEnabled": false,
      "streamConfigs": {
        "realtime.segment.flush.threshold.size": "0",
        "realtime.segment.flush.threshold.time": "24h",
        "realtime.segment.flush.desired.size": "150M",
        "stream.kafka.broker.list": "XXXX",
        "stream.kafka.consumer.factory.class.name": "XXXX",
        "stream.kafka.consumer.prop.auto.offset.reset": "largest",
        "stream.kafka.consumer.type": "XXXX",
        "stream.kafka.decoder.class.name": "XXXX",
        "stream.kafka.decoder.prop.schema.registry.rest.url": "XXXX",
        "stream.kafka.decoder.prop.schema.registry.schema.name": "XXXX",
        "stream.kafka.hlc.zk.connect.string": "XXXX",
        "stream.kafka.topic.name": "XXXX",
        "stream.kafka.zk.broker.url": "XXXX",
        "streamType": "kafka"
      }
    },
    "tenants": {
      "broker": "myBrokerTenant",
      "server": "myServerTenant",
      "tagOverrideConfig": {}
    },
    "metadata": {
    }
  }
}

