1 of 1

Table

Top-level fields

Property

Description

tableName

Specifies the name of the table. Should only contain alpha-numeric characters, hyphens (‘-‘), or underscores (‘’). (Using a double-underscore (‘_’) is not allowed and reserved for other features within Pinot)

tableType

Second level fields

The following properties can be nested inside the top-level configs.

Quota

Property

Description

Routing

Property

Description

Query

Property

Description

Segments Config

Property

Description

Table Index Config

Property

Description

Field Config List

Specify the columns and the type of indices to be created on those columns. Currently, only columns can be specified using this property. We will be migrating the rest of the indices to this field in future releases.

Property

Realtime Table Config

We will now discuss the sections that are only applicable to realtime tables.

segmentsConfig

Property

Description

Indexing config

Below is the list of fields in streamConfigs section.

IndexingConfig -> streamConfig has been deprecated starting 0.7.0 or commit 9eaea9. Use IngestionConfig -> StreamIngestionConfig -> streamConfigMaps instead.

Property

Description

When specifying realtime.segment.flush.threshold.rows, the actual number of rows per segment is computed using the following formula: `` realtime.segment.flush.threshold.rows / partitionsConsumedByServer

This means that if we set realtime.segment.flush.threshold.rows=1000 and each server consumes 10 partitions, the rows per segment will be:1000/10 = 100

Any additional properties set here will be directly available to the stream consumers. For example, in case of Kafka stream, you could put any of the configs described in , and it will be automatically passed to the KafkaConsumer.

Some of the properties you might want to set:

Config

Description

Values

Example

Here is a minimal example of what the streamConfigs section may look like:

0.6.0 onwards:

0.5.0 and prior:

Tenants

Property

Description

Example

Environment Variables Override

Pinot allows users to define environment variables in the format of ${ENV_NAME} or ${ENV_NAME:DEFAULT_VALUE}as field values in table config.

Pinot instance will override it during runtime.

Brackets are required when defining the environment variable."$ENV_NAME"is not supported.

Environment variables used without default value in table config have to be available to all Pinot components - Controller, Broker, Server, and Minion. Otherwise, querying/consumption will be affected depending on the service to which these variables are not available.

Below is an example of setting AWS credential as part of table config using environment variable.

Example:

Sample Configurations

Offline Table

Realtime Table

Here's an example table config for a realtime table. All the fields from the offline table config are valid for the realtime table. Additionally, realtime tables use some extra fields.

Table

Top-level fields

Property

Description

tableName

tableType

Second level fields

The following properties can be nested inside the top-level configs.

Quota

Property

Description

Routing

Property

Description

Query

Property

Description

Segments Config

Property

Description

Table Index Config

Property

Description

Field Config List

Property

Realtime Table Config

We will now discuss the sections that are only applicable to realtime tables.

segmentsConfig

Property

Description

Indexing config

Below is the list of fields in streamConfigs section.

IndexingConfig -> streamConfig has been deprecated starting 0.7.0 or commit 9eaea9. Use IngestionConfig -> StreamIngestionConfig -> streamConfigMaps instead.

Property

Description

This means that if we set realtime.segment.flush.threshold.rows=1000 and each server consumes 10 partitions, the rows per segment will be:1000/10 = 100

Some of the properties you might want to set:

Config

Description

Values

Example

Here is a minimal example of what the streamConfigs section may look like:

0.6.0 onwards:

0.5.0 and prior:

Tenants

Property

Description

Example

Environment Variables Override

Pinot allows users to define environment variables in the format of ${ENV_NAME} or ${ENV_NAME:DEFAULT_VALUE}as field values in table config.

Pinot instance will override it during runtime.

Brackets are required when defining the environment variable."$ENV_NAME"is not supported.

Below is an example of setting AWS credential as part of table config using environment variable.

Example:

Sample Configurations

Offline Table

Realtime Table

Here's an example table config for a realtime table. All the fields from the offline table config are valid for the realtime table. Additionally, realtime tables use some extra fields.

segmentPartitionConfig

The map from column to partition function, which indicates how the segment is partitioned.

Currently 4 types of partition functions are supported:

Murmur - murmur2 hash function

Modulo - modulo on integer values

HashCode - java hashCode() function

ByteArray

pinot-table-offline.json

"OFFLINE": {
    "tableName": "pinotTable",
    "tableType": "OFFLINE",
    "quota": {
      "maxQueriesPerSecond": 300,
      "storage": "140G"
    },
    "routing": {
      "segmentPrunerTypes": ["partition"],
      "instanceSelectorType": "replicaGroup"
    },
    "segmentsConfig": {
      "schemaName": "pinotTable",
      "timeColumnName": "daysSinceEpoch",
      "timeType": "DAYS",
      "allowNullTimeValue": false,
      "replication": "3",
      "retentionTimeUnit": "DAYS",
      "retentionTimeValue": "365",
      "segmentPushFrequency": "DAILY",
      "segmentPushType": "APPEND"
    },
    "tableIndexConfig": {
      "invertedIndexColumns": ["foo", "bar", "moo"],
      "createInvertedIndexDuringSegmentGeneration": false,
      "sortedColumn": ["pk"],
      "bloomFilterColumns": [],
      "starTreeIndexConfigs": [],
      "noDictionaryColumns": [],
      "rangeIndexColumns": [],
      "onHeapDictionaryColumns": [],
      "varLengthDictionaryColumns": [],
      "segmentPartitionConfig": {
        "pk": {
          "functionName": "Murmur",
          "numPartitions": 32
        }
      }
      "loadMode": "MMAP",
      "columnMinMaxValueGeneratorMode": null,
      "nullHandlingEnabled": false
    },
    "tenants": {
      "broker": "myBrokerTenant",
      "server": "myServerTenant"
    },
    "ingestionConfig": {
      "filterConfig": {
        "filterFunction": "Groovy({foo == \"VALUE1\"}, foo)"
      },
      "transformConfigs": [{
        "columnName": "bar",
        "transformFunction": "lower(moo)"
      },
      {
        "columnName": "hoursSinceEpoch",
        "transformFunction": "toEpochHours(millis)"
      }]
    }
    "metadata": {
      "customConfigs": {
        "key": "value",
        "key": "value"
      }
    }
  }
}

pinot-table-realtime.json

"REALTIME": {
    "tableName": "pinotTable",
    "tableType": "REALTIME",
    "segmentsConfig": {
      "schemaName": "pinotTable",
      "timeColumnName": "daysSinceEpoch",
      "timeType": "DAYS",
      "allowNullTimeValue": true,
      "replicasPerPartition": "3",
      "retentionTimeUnit": "DAYS",
      "retentionTimeValue": "5",
      "segmentPushType": "APPEND",
      "completionConfig": {
        "completionMode": "DOWNLOAD"
      }
    },
    "tableIndexConfig": {
      "invertedIndexColumns": ["foo", "bar", "moo"],
      "sortedColumn": ["column1"],
      "noDictionaryColumns": ["metric1", "metric2"],
      "loadMode": "MMAP",
      "nullHandlingEnabled": false,
      "streamConfigs": {
        "realtime.segment.flush.threshold.rows": "0",
        "realtime.segment.flush.threshold.time": "24h",
        "realtime.segment.flush.threshold.segment.size": "150M",
        "stream.kafka.broker.list": "XXXX",
        "stream.kafka.consumer.factory.class.name": "XXXX",
        "stream.kafka.consumer.prop.auto.offset.reset": "largest",
        "stream.kafka.consumer.type": "XXXX",
        "stream.kafka.decoder.class.name": "XXXX",
        "stream.kafka.decoder.prop.schema.registry.rest.url": "XXXX",
        "stream.kafka.decoder.prop.schema.registry.schema.name": "XXXX",
        "stream.kafka.hlc.zk.connect.string": "XXXX",
        "stream.kafka.topic.name": "XXXX",
        "stream.kafka.zk.broker.url": "XXXX",
        "streamType": "kafka"
      }
    },
    "tenants": {
      "broker": "myBrokerTenant",
      "server": "myServerTenant",
      "tagOverrideConfig": {}
    },
    "metadata": {
    }
}

pinot-table-offline.json

"OFFLINE": {
    "tableName": "pinotTable",
    "tableType": "OFFLINE",
    "quota": {
      "maxQueriesPerSecond": 300,
      "storage": "140G"
    },
    "routing": {
      "segmentPrunerTypes": ["partition"],
      "instanceSelectorType": "replicaGroup"
    },
    "segmentsConfig": {
      "schemaName": "pinotTable",
      "timeColumnName": "daysSinceEpoch",
      "timeType": "DAYS",
      "allowNullTimeValue": false,
      "replication": "3",
      "retentionTimeUnit": "DAYS",
      "retentionTimeValue": "365",
      "segmentPushFrequency": "DAILY",
      "segmentPushType": "APPEND"
    },
    "tableIndexConfig": {
      "invertedIndexColumns": ["foo", "bar", "moo"],
      "createInvertedIndexDuringSegmentGeneration": false,
      "sortedColumn": ["pk"],
      "bloomFilterColumns": [],
      "starTreeIndexConfigs": [],
      "noDictionaryColumns": [],
      "rangeIndexColumns": [],
      "onHeapDictionaryColumns": [],
      "varLengthDictionaryColumns": [],
      "segmentPartitionConfig": {
        "pk": {
          "functionName": "Murmur",
          "numPartitions": 32
        }
      }
      "loadMode": "MMAP",
      "columnMinMaxValueGeneratorMode": null,
      "nullHandlingEnabled": false
    },
    "tenants": {
      "broker": "myBrokerTenant",
      "server": "myServerTenant"
    },
    "ingestionConfig": {
      "filterConfig": {
        "filterFunction": "Groovy({foo == \"VALUE1\"}, foo)"
      },
      "transformConfigs": [{
        "columnName": "bar",
        "transformFunction": "lower(moo)"
      },
      {
        "columnName": "hoursSinceEpoch",
        "transformFunction": "toEpochHours(millis)"
      }]
    }
    "metadata": {
      "customConfigs": {
        "key": "value",
        "key": "value"
      }
    }
  }
}

pinot-table-realtime.json

"REALTIME": {
    "tableName": "pinotTable",
    "tableType": "REALTIME",
    "segmentsConfig": {
      "schemaName": "pinotTable",
      "timeColumnName": "daysSinceEpoch",
      "timeType": "DAYS",
      "allowNullTimeValue": true,
      "replicasPerPartition": "3",
      "retentionTimeUnit": "DAYS",
      "retentionTimeValue": "5",
      "segmentPushType": "APPEND",
      "completionConfig": {
        "completionMode": "DOWNLOAD"
      }
    },
    "tableIndexConfig": {
      "invertedIndexColumns": ["foo", "bar", "moo"],
      "sortedColumn": ["column1"],
      "noDictionaryColumns": ["metric1", "metric2"],
      "loadMode": "MMAP",
      "nullHandlingEnabled": false,
      "streamConfigs": {
        "realtime.segment.flush.threshold.rows": "0",
        "realtime.segment.flush.threshold.time": "24h",
        "realtime.segment.flush.threshold.segment.size": "150M",
        "stream.kafka.broker.list": "XXXX",
        "stream.kafka.consumer.factory.class.name": "XXXX",
        "stream.kafka.consumer.prop.auto.offset.reset": "largest",
        "stream.kafka.consumer.type": "XXXX",
        "stream.kafka.decoder.class.name": "XXXX",
        "stream.kafka.decoder.prop.schema.registry.rest.url": "XXXX",
        "stream.kafka.decoder.prop.schema.registry.schema.name": "XXXX",
        "stream.kafka.hlc.zk.connect.string": "XXXX",
        "stream.kafka.topic.name": "XXXX",
        "stream.kafka.zk.broker.url": "XXXX",
        "streamType": "kafka"
      }
    },
    "tenants": {
      "broker": "myBrokerTenant",
      "server": "myServerTenant",
      "tagOverrideConfig": {}
    },
    "metadata": {
    }
}

Table

hashtagTop-level fields

hashtagSecond level fields

hashtagQuota

hashtagRouting

hashtagQuery

hashtagSegments Config

hashtagTable Index Config

hashtagField Config List

hashtagRealtime Table Config

hashtagsegmentsConfig

hashtagIndexing config

hashtagExample

hashtagTenants

hashtagExample

hashtagEnvironment Variables Override

hashtagSample Configurations

hashtagOffline Table

hashtagRealtime Table

Table

hashtagTop-level fields

hashtagSecond level fields

hashtagQuota

hashtagRouting

hashtagQuery

hashtagSegments Config

hashtagTable Index Config

hashtagField Config List

hashtagRealtime Table Config

hashtagsegmentsConfig

hashtagIndexing config

hashtagExample

hashtagTenants

hashtagExample

hashtagEnvironment Variables Override

hashtagSample Configurations

hashtagOffline Table

hashtagRealtime Table

Top-level fields

Second level fields

Quota

Routing

Query

Segments Config

Table Index Config

Field Config List

Realtime Table Config

segmentsConfig

Indexing config

Example

Tenants

Example

Environment Variables Override

Sample Configurations

Offline Table

Realtime Table

Top-level fields

Second level fields

Quota

Routing

Query

Segments Config

Table Index Config

Field Config List

Realtime Table Config

segmentsConfig

Indexing config

Example

Tenants

Example

Environment Variables Override

Sample Configurations

Offline Table

Realtime Table