
Table

The tables below show the properties available to set at the table level.

Top-level fields

Property
Description

tableName

Specifies the name of the table. Should only contain alphanumeric characters, hyphens ('-'), or underscores ('_'). Two notes: the hyphen is allowed in table names, but it is also a reserved character in SQL, so if you use it you must double-quote the table name in your queries. A double underscore ('__') is not allowed, as it is reserved for other features within Pinot.

tableType

Defines the table type: OFFLINE for offline tables or REALTIME for real-time tables. A hybrid table is essentially two table configurations: one of each type, with the same table name.

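isDimTable

Boolean field to indicate whether the table is a dimension table.

quota

Defines properties related to quotas, such as storage quota and query quota. For details, see the Quota table below.

task

Defines the enabled minion tasks for the table. See Minion for more details.

routing

Defines the properties that determine how the broker selects the servers to route, and how segments can be pruned by the broker based on segment metadata. For details, see the Routing table below.

query

Defines the properties related to query execution. For details, see the Query table below.

segmentsConfig

Defines the properties related to the segments of the table, such as segment push frequency, type, retention, schema, time column etc. For details, see the segmentsConfig table below.

tableIndexConfig

Defines the indexing related information for the Pinot table. For details, see Table indexing config below.

fieldConfigList

Specifies the columns and the type of indices to be created on those columns. See Field config list for sub-properties.

tenants

Defines the server and broker tenant used for this table. For details, see Tenants below.

ingestionConfig

Defines the configurations needed for ingestion level transformations. For details, see Ingestion Level Transformations and Ingestion Level Aggregations.

upsertConfig

Sets upsert configurations. For details, see Stream ingestion with upsert.

dedupConfig

Sets deduplication configurations. For details, see Stream ingestion with Dedup.

tierConfigs

Defines configurations for tiered storage. For details, see Tiered Storage.

metadata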

Contains other metadata of the table. It includes a string-to-string map field, "customConfigs", which holds custom configurations as key-value pairs.

Second-level fields

The following properties can be nested inside the top-level configurations.

Quota

Property
Description

storage

The maximum storage space the table is allowed to use before replication.

For example, in the sample offline table config below, the storage is 140G and replication is 3, so the maximum storage the table is allowed to use is 140G x 3 = 420G. The space the table uses is calculated by adding up the sizes of all segments from every server hosting this table. Once this limit is reached, offline segment push throws a 403 exception with the message "Quota check failed for segment: segment_0 of table: pinotTable".

maxQueriesPerSecond

The maximum queries per second allowed to execute on this table. If query volume exceeds this limit, a 429 exception with the message "Request 123 exceeds query quota for table:pinotTable, query:select count(*) from pinotTable" is sent, and a BrokerMetric QUERY_QUOTA_EXCEEDED is recorded. Applications should implement exponential backoff and retry to handle this exception.

Routing

Property
Description

segmentPrunerTypes

The list of segment pruners to be enabled.

A segment pruner prunes the selected segments based on the query. By default, no pruner is enabled.

Supported values:

  • partition: Prunes segments based on the partition metadata stored in ZooKeeper.

  • time: For queries that filter on timeColumnName, prunes segments that do not contain data in the query's time range.

instanceSelectorType

Selects the server instances that serve the query, based on the selected segments. Supported values:

  • balanced: Balances the number of segments served by each selected instance. Default.

  • replicaGroup: Instance selector for replica group routing strategy.
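As a minimal sketch, the two routing properties above might be combined as follows. The values are taken from the supported lists in this section; whether enabling both pruners suits a given table depends on its partitioning and time column, which is assumed here.

```json
{
  "routing": {
    "segmentPrunerTypes": ["partition", "time"],
    "instanceSelectorType": "balanced"
  }
}
```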

Query

Property
Description

timeoutMs

Query timeout in milliseconds

disableGroovy

Whether to disable Groovy in queries. This overrides the broker instance level config (pinot.broker.disable.query.groovy) if configured.

useApproximateFunction

Whether to automatically use approximate functions for expensive aggregates, such as DISTINCT_COUNT and PERCENTILE. This overrides the broker instance level config (pinot.broker.use.approximate.function) if configured.

expressionOverrideMap

A map that configures the expressions to override in the query. This can be useful when users cannot control the queries sent to Pinot (e.g. queries auto-generated by some other tools), but want to override the expressions within the query (e.g. override a transform function to a derived column). Example: {"myFunc(a)": "b"}.
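The query properties in this table could be set together along these lines. This is only an illustrative sketch: the timeout value is an arbitrary assumption, and the override entry reuses the {"myFunc(a)": "b"} example given above.

```json
{
  "query": {
    "timeoutMs": 10000,
    "disableGroovy": true,
    "useApproximateFunction": false,
    "expressionOverrideMap": {
      "myFunc(a)": "b"
    }
  }
}
```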

Segments config

Property
Description

schemaName

Name of the schema associated with the table

timeColumnName

The name of the time column for this table. This must match the time column name in the schema. This is mandatory for tables with push type APPEND and optional for REFRESH. timeColumnName, along with timeColumnType, is used to manage segment retention and the time boundary between offline and real-time.

replication

Number of replicas for the table. A replication value of 1 means segments won't be replicated across servers.

retentionTimeUnit

Unit for the retention, such as HOURS or DAYS. This, in combination with retentionTimeValue, determines the duration for which to retain the segments.

For example, 365 DAYS means that segments containing data older than 365 days will be deleted periodically by the RetentionManager Controller periodic task. By default, there is no set retention.

retentionTimeValue

A numeric value for the retention. This, in combination with retentionTimeUnit, determines the duration for which to retain the segments.

segmentPushType

(Deprecated starting 0.7.0 or commit 9eaea9. Use IngestionConfig -> BatchIngestionConfig -> segmentPushType.)

Can be either:

  • APPEND: New data segments are pushed periodically (e.g. daily or hourly) and appended to the existing data.

  • REFRESH: The entire data set is replaced on every data push. Refresh tables have no retention.

segmentPushFrequency

(Deprecated starting 0.7.0 or commit 9eaea9. Use IngestionConfig -> BatchIngestionConfig -> segmentPushFrequency.)

The cadence at which segments are pushed, such as HOURLY or DAILY.
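A rough sketch of a segments config that leaves out the deprecated push properties and places the push settings under ingestionConfig instead. The key names segmentIngestionType and segmentIngestionFrequency are taken from the batch ingestion example later on this page; treating them as the replacements for the deprecated fields is an assumption, as are the table and column names.

```json
{
  "segmentsConfig": {
    "schemaName": "pinotTable",
    "timeColumnName": "daysSinceEpoch",
    "replication": "3",
    "retentionTimeUnit": "DAYS",
    "retentionTimeValue": "365"
  },
  "ingestionConfig": {
    "batchIngestionConfig": {
      "segmentIngestionType": "APPEND",
      "segmentIngestionFrequency": "DAILY"
    }
  }
}
```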

Table index config

Property
Description

invertedIndexColumns

The list of columns that inverted index should be created on. The names of the columns should match the schema. For example, in the sample offline table config below, an inverted index is created on three columns: foo, bar, and moo.

createInvertedIndexDuringSegmentGeneration

Boolean to indicate whether to create inverted indexes during segment creation. Defaults to false, meaning inverted indexes are created when the segments are loaded on the server.

sortedColumn

The column which is sorted in the data and hence will have a sorted index. This does not need to be specified for an offline table, as the segment generation job automatically detects the sorted column in the data and creates a sorted index for it.

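bloomFilterColumns

The list of columns to apply bloom filter on. The names of the columns should match the schema. For more details about using bloom filters, refer to Bloom Filter.

bloomFilterConfigs

The map from the column to the bloom filter config. The names of the columns should match the schema. For more details about using bloom filters, refer to Bloom Filter.

rangeIndexColumns

The list of columns that range index should be created on. Typically used for numeric columns and mostly on metrics. For example, select count(*) from T where latency > 3000 will be faster if you enable range index for latency.

rangeIndexVersion

Version of the range index, 2 (latest) by default.

starTreeIndexConfigs

The list of StarTree indexing configs for creating StarTree indexes. For details on how to configure this, see StarTree Index.

enableDefaultStarTree

Boolean to indicate whether to create a default StarTree index for the segment. For details, see StarTree Index.

enableDynamicStarTreeCreation

Boolean to indicate whether to allow creating StarTree when server loads the segment. StarTree creation could potentially consume a lot of system resources, so this config should be enabled when the servers have the free system resources to create the StarTree.

noDictionaryColumns

The set of columns that should not be dictionary-encoded. The names of the columns should match the schema. NoDictionary dimension columns are LZ4 compressed, while the metrics are not compressed.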

onHeapDictionaryColumns

The list of columns for which the dictionary should be created on heap

varLengthDictionaryColumns

The list of columns for which the variable length dictionary needs to be enabled in offline segments. This is only valid for string and bytes columns and has no impact for columns of other data types.

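jsonIndexColumns

The list of columns to create the JSON index on. See JSON Index for more details.

jsonIndexConfigs

The map from column to JSON index config. See JSON Index for more details.

segmentPartitionConfig

Use segmentPartitionConfig.columnPartitionMap along with routing.segmentPrunerTypes to enable partitioning. For each column, configure the following options:

  • functionName: Specify one of the supported functions:

    • Murmur: MurmurHash 2

    • Modulo: Modulo on integer values

    • HashCode: Java hashCode()

    • ByteArray: Java hashCode() on deserialized byte array

  • numPartitions: Number of partitions you want per segment. Controls how data is divided within each segment.

Example:

{ "columnPartitionMap": { "column_memberID": { "functionName": "Murmur", "numPartitions": 32 } } }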

loadMode

Indicates how the segments will be loaded onto the server:

  • heap: load data directly into direct memory

  • mmap: load data segments to off-heap memory

columnMinMaxValueGeneratorMode

Generate min max values for columns. Supported values:

  • NONE: do not generate for any columns

  • ALL: generate for all columns

  • TIME: generate for only the time column

  • NON_METRIC: generate for time and dimension columns

nullHandlingEnabled

Boolean to indicate whether to keep track of null values as part of the segment generation. This is required when using IS NULL or IS NOT NULL predicates in the query. Enabling this will lead to additional memory and storage usage per segment. By default, this is set to false.

aggregateMetrics

(Deprecated, use Ingestion Aggregation.) (Only applicable for stream.) Set to true to pre-aggregate the metrics.

optimizeDictionaryForMetrics

Set to true to disable dictionaries for single-valued metric columns. Only applicable to single-valued metric columns. Default: false.

noDictionarySizeRatioThreshold

If optimizeDictionaryForMetrics is enabled, the dictionary is not created for metric columns for which noDictionaryIndexSize / indexWithDictionarySize is less than noDictionarySizeRatioThreshold. Default: 0.85.

segmentNameGeneratorType

Type of segmentNameGenerator, default is SimpleSegmentNameGenerator. See more on Segment Name Generator Spec.
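To illustrate how several of the table index properties above fit together, here is a hedged sketch. The latency column comes from the range index query example in this table and is hypothetical; the remaining values are picked from the supported values and defaults listed above, not from a recommended setup.

```json
{
  "tableIndexConfig": {
    "rangeIndexColumns": ["latency"],
    "rangeIndexVersion": 2,
    "loadMode": "MMAP",
    "columnMinMaxValueGeneratorMode": "TIME",
    "nullHandlingEnabled": true,
    "optimizeDictionaryForMetrics": true,
    "noDictionarySizeRatioThreshold": 0.85
  }
}
```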

Field Config List

Specify the columns and the type of indices to be created on those columns. Currently, not all index types can use this property. The following indexes are supported:

  • Text

  • FST

  • Timestamp

  • H3 (also known as geospatial)

Property
Description

name

Name of the column

encodingType

Should be one of RAW or DICTIONARY

indexTypes

List of indexes to create on this column. Valid values are the ids of the index types (text, fst, h3, etc.).

properties

JSON of key-value pairs containing additional properties associated with the index. The following properties are currently supported:

  • enableQueryCacheForTextIndex - set to true to enable caching for text index in Lucene

  • luceneMaxBufferSizeMB - Lucene IndexWriter buffer max size, defaults to 500

  • luceneUseCompoundFile - Lucene IndexWriter file format, defaults to true to use compound files

  • rawIndexWriterVersion

  • deriveNumDocsPerChunkForRawIndex

  • forwardIndexDisabled - set to true to disable the forward index, defaults to false

The property indexType (in singular, accepting a single index id as string) is also supported for compatibility reasons, but we recommend using the plural in order to be able to define several indexes for the same column.

Warning:

If you remove the forwardIndexDisabled property above to regenerate the forward index for multi-value (MV) columns, note that the following invariants cannot be maintained after regenerating the forward index for a forward-index-disabled column:

  • Ordering guarantees of the MV values within a row

  • If entries within an MV row are duplicated, the duplicates will be lost. To get back the original MV data with duplicates, regenerate the segments via your offline jobs and re-push or refresh the data.

We will work on removing the second limitation in the future.
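The field config properties described in this section could be combined roughly as follows. The column name textColumn is hypothetical, and writing the property values as strings is an assumption; the luceneMaxBufferSizeMB value simply restates the default mentioned above.

```json
{
  "fieldConfigList": [
    {
      "name": "textColumn",
      "encodingType": "RAW",
      "indexTypes": ["text"],
      "properties": {
        "enableQueryCacheForTextIndex": "true",
        "luceneMaxBufferSizeMB": "500"
      }
    }
  ]
}
```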

Real-time table config

The sections below apply to real-time tables only.

segmentsConfig

Property
Description

replicasPerPartition

The number of replicas per partition for the stream

completionMode

Determines whether the segment should be downloaded from another server or built in memory. Can be DOWNLOAD or empty.

peerSegmentDownloadScheme

Protocol to use to download segments from a server. Can be one of http or https.
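A small sketch of the real-time segmentsConfig properties above. Nesting completionMode under completionConfig follows the real-time sample configuration on this page; placing peerSegmentDownloadScheme directly under segmentsConfig is an assumption based on this table.

```json
{
  "segmentsConfig": {
    "replicasPerPartition": "3",
    "completionConfig": {
      "completionMode": "DOWNLOAD"
    },
    "peerSegmentDownloadScheme": "http"
  }
}
```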

Indexing config

The streamConfigs section has been deprecated as of release 0.7.0. See streamConfigMaps instead.

Tenants

Property
Description

broker

Broker tenant in which the segment should reside

server

Server tenant in which the segment should reside

tagOverrideConfig

Override the tenant for a segment if it fulfills certain conditions. Currently, overrides are supported only for realtimeConsuming and realtimeCompleted.

Example

{
  "broker": "brokerTenantName",
  "server": "serverTenantName",
  "tagOverrideConfig": {
    "realtimeConsuming": "serverTenantName_REALTIME",
    "realtimeCompleted": "serverTenantName_OFFLINE"
  }
}

Environment variables override

Pinot allows users to define environment variables in the format of ${ENV_NAME} or ${ENV_NAME:DEFAULT_VALUE} as field values in table config.

The Pinot instance substitutes the value at runtime.

Curly braces are required when defining the environment variable. "$ENV_NAME" is not supported.

Environment variables used without a default value in table config must be available to all Pinot components: Controller, Broker, Server, and Minion. Otherwise, querying or consumption will be affected, depending on which service the variables are unavailable to.

Below is an example of setting AWS credentials as part of table config using environment variables.

Example:

{
...
  "ingestionConfig": {
    "batchIngestionConfig": {
      "segmentIngestionType": "APPEND",
      "segmentIngestionFrequency": "DAILY",
      "batchConfigMaps": [
        {
          "inputDirURI": "s3://<my-bucket>/baseballStats/rawdata",
          "includeFileNamePattern": "glob:**/*.csv",
          "excludeFileNamePattern": "glob:**/*.tmp",
          "inputFormat": "csv",
          "outputDirURI": "s3://<my-bucket>/baseballStats/segments",
          "input.fs.className": "org.apache.pinot.plugin.filesystem.S3PinotFS",
          "input.fs.prop.region": "us-west-2",
          "input.fs.prop.accessKey": "${AWS_ACCESS_KEY}",
          "input.fs.prop.secretKey": "${AWS_SECRET_KEY}",
          "push.mode": "tar"
        }
      ],
      "segmentNameSpec": {},
      "pushSpec": {}
    }
  },
...
}
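The example above uses only the ${ENV_NAME} form. As an illustrative sketch of the ${ENV_NAME:DEFAULT_VALUE} form, a batch config map entry might look like this. AWS_REGION is a hypothetical variable name chosen for illustration; the us-west-2 fallback mirrors the region used above.

```json
{
  "input.fs.prop.region": "${AWS_REGION:us-west-2}",
  "input.fs.prop.accessKey": "${AWS_ACCESS_KEY}",
  "input.fs.prop.secretKey": "${AWS_SECRET_KEY}"
}
```

Under this sketch the region falls back to us-west-2 when AWS_REGION is unset, while the two credential variables have no default and so must be available to every Pinot component.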

Sample configurations

Offline table

pinot-table-offline.json
{
  "OFFLINE": {
    "tableName": "pinotTable",
    "tableType": "OFFLINE",
    "quota": {
      "maxQueriesPerSecond": 300,
      "storage": "140G"
    },
    "routing": {
      "segmentPrunerTypes": ["partition"],
      "instanceSelectorType": "replicaGroup"
    },
    "segmentsConfig": {
      "schemaName": "pinotTable",
      "timeColumnName": "daysSinceEpoch",
      "timeType": "DAYS",
      "replication": "3",
      "retentionTimeUnit": "DAYS",
      "retentionTimeValue": "365",
      "segmentPushFrequency": "DAILY",
      "segmentPushType": "APPEND"
    },
    "tableIndexConfig": {
      "invertedIndexColumns": ["foo", "bar", "moo"],
      "createInvertedIndexDuringSegmentGeneration": false,
      "sortedColumn": ["pk"],
      "bloomFilterColumns": [],
      "starTreeIndexConfigs": [],
      "noDictionaryColumns": [],
      "rangeIndexColumns": [],
      "onHeapDictionaryColumns": [],
      "varLengthDictionaryColumns": [],
      "segmentPartitionConfig": {
        "columnPartitionMap": {
          "column_foo": {
            "functionName": "Murmur",
            "numPartitions": 32
          }
        }
      },
      "loadMode": "MMAP",
      "columnMinMaxValueGeneratorMode": null,
      "nullHandlingEnabled": false
    },
    "tenants": {
      "broker": "myBrokerTenant",
      "server": "myServerTenant"
    },
    "ingestionConfig": {
      "filterConfig": {
        "filterFunction": "Groovy({foo == \"VALUE1\"}, foo)"
      },
      "transformConfigs": [{
        "columnName": "bar",
        "transformFunction": "lower(moo)"
      },
      {
        "columnName": "hoursSinceEpoch",
        "transformFunction": "toEpochHours(millis)"
      }]
    },
    "metadata": {
      "customConfigs": {
        "key1": "value1",
        "key2": "value2"
      }
    }
  }
}

Real-time table

Here's an example table config for a real-time table. All the fields from the offline table config are valid for the real-time table. Additionally, real-time tables use some extra fields.

pinot-table-realtime.json
{
  "REALTIME": {
    "tableName": "pinotTable",
    "tableType": "REALTIME",
    "segmentsConfig": {
      "schemaName": "pinotTable",
      "timeColumnName": "daysSinceEpoch",
      "timeType": "DAYS",
      "replicasPerPartition": "3",
      "retentionTimeUnit": "DAYS",
      "retentionTimeValue": "5",
      "segmentPushType": "APPEND",
      "completionConfig": {
        "completionMode": "DOWNLOAD"
      }
    },
    "tableIndexConfig": {
      "invertedIndexColumns": ["foo", "bar", "moo"],
      "sortedColumn": ["column1"],
      "noDictionaryColumns": ["metric1", "metric2"],
      "loadMode": "MMAP",
      "nullHandlingEnabled": false
    },
    "ingestionConfig": {
      "streamIngestionConfig": {
        "streamConfigMaps": [
          {
            "realtime.segment.flush.threshold.rows": "0",
            "realtime.segment.flush.threshold.time": "24h",
            "realtime.segment.flush.threshold.segment.size": "150M",
            "stream.kafka.broker.list": "XXXX",
            "stream.kafka.consumer.factory.class.name": "XXXX",
            "stream.kafka.consumer.prop.auto.offset.reset": "largest",
            "stream.kafka.consumer.type": "XXXX",
            "stream.kafka.decoder.class.name": "XXXX",
            "stream.kafka.decoder.prop.schema.registry.rest.url": "XXXX",
            "stream.kafka.decoder.prop.schema.registry.schema.name": "XXXX",
            "stream.kafka.hlc.zk.connect.string": "XXXX",
            "stream.kafka.topic.name": "XXXX",
            "stream.kafka.zk.broker.url": "XXXX",
            "streamType": "kafka"
          }
        ]
      }
    },
    "tenants": {
      "broker": "myBrokerTenant",
      "server": "myServerTenant",
      "tagOverrideConfig": {}
    },
    "metadata": {}
  }
}