
Frequently Asked Questions (FAQs)

This page collects the questions most frequently asked in our troubleshooting channel on Slack, with answers from the community. Feel free to contribute your own questions and answers here by making a pull request.

General

FAQ for general questions around Pinot

How does Pinot use deep storage?

When data is pushed into Pinot, Pinot makes a backup copy of the data and stores it in the configured deep storage (S3/GCP/ADLS/NFS/etc.). This copy is stored as tar.gz Pinot segments. Note that Pinot servers also keep an untarred copy of the segments on their local disk, for performance reasons.

Pinot On Kubernetes FAQ

How to increase server disk size on AWS

Below is an example of AWS EKS.

1. Update Storage Class

In the K8s cluster, check the storage class: in AWS, it should be gp2.

Then update StorageClass to ensure:

Once StorageClass is updated, it should be like:

2. Update PVC

Once the storage class is updated, we can update the PVC to change the server disk size.

Now we want to double the disk size for pinot-server-3.

Below is an example of current disks:

Below is the output of data-pinot-server-3

Now, let's change the PVC size to 2T by editing the server PVC.

After the edit, the PVC size in the spec is 2T, but the PVC size in the status is still 1T.
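As a rough sketch of steps 1 and 2, the Python snippet below builds the two patches involved and prints matching kubectl patch commands. It assumes that the StorageClass setting to ensure is allowVolumeExpansion: true (the standard Kubernetes field that permits PVC resizing) and that the Pinot resources live in the current kubectl namespace. The helper names are hypothetical; gp2, data-pinot-server-3 and 2T come from the example above.

```python
import json


def storage_class_patch() -> dict:
    # Assumption: resizing a PVC requires this flag on its StorageClass.
    return {"allowVolumeExpansion": True}


def pvc_resize_patch(size: str) -> dict:
    # Only the requested storage changes; the rest of the PVC spec is untouched.
    return {"spec": {"resources": {"requests": {"storage": size}}}}


def kubectl_patch_command(kind: str, name: str, patch: dict) -> str:
    # Render a command line that can be reviewed before it is run by hand.
    return f"kubectl patch {kind} {name} -p '{json.dumps(patch)}'"


if __name__ == "__main__":
    print(kubectl_patch_command("storageclass", "gp2", storage_class_patch()))
    print(kubectl_patch_command("pvc", "data-pinot-server-3", pvc_resize_patch("2T")))
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; review the output before applying it to a real cluster.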

3. Restart pod to let it reflect

Restart pinot-server-3 pod:

Recheck PVC size:

Ingestion FAQ

Data processing

What is a good segment size?

While Pinot can work with segments of various sizes, for optimal use you want your segments sized in the 100MB to 500MB range (untarred/uncompressed). Having too many (thousands or more) tiny segments for a single table creates overhead, both in metadata storage in Zookeeper and in the Pinot servers' heap. At the same time, having too few really large (GBs) segments reduces the parallelism of query execution, because on the server side the thread parallelism of query execution is at the segment level.
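The guideline above can be turned into a quick sanity check. The sketch below is illustrative only: the function names and the 300MB default target (picked from inside the recommended range) are assumptions, not Pinot settings.

```python
import math

# Recommended untarred segment size range from the guideline above.
MIN_SEGMENT_MB = 100
MAX_SEGMENT_MB = 500


def classify_segment(size_mb: float) -> str:
    """Label a segment size relative to the recommended range."""
    if size_mb < MIN_SEGMENT_MB:
        return "too small"
    if size_mb > MAX_SEGMENT_MB:
        return "too large"
    return "ok"


def segments_needed(total_mb: float, target_mb: float = 300) -> int:
    """Estimate how many segments a dataset splits into at a target size."""
    return max(1, math.ceil(total_mb / target_mb))


if __name__ == "__main__":
    print(classify_segment(20))      # too small
    print(classify_segment(250))     # ok
    print(segments_needed(12_000))   # 40
```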

Query FAQ

Querying

I get the following error when running a query, what does it mean?

{'errorCode': 410, 'message': 'BrokerResourceMissingError'}

This essentially means that the Pinot broker assigned to the table specified in the query was not found. A common root cause is a typo in the table name in the query. A less common cause is that no broker has the broker tenant tag required by the table.

What are all the fields in the Pinot query's JSON response?

See the documentation page that explains the Pinot query response format.

SQL Query fails with "Encountered 'timestamp' was expecting one of..."

"timestamp" is a reserved keyword in SQL. Escape timestamp with double quotes.

Other commonly encountered reserved keywords are date, time, table.

Filtering on STRING column WHERE column = "foo" does not work?

For filtering on STRING columns, use single quotes: WHERE column = 'foo'.
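The two quoting rules above (double quotes for reserved identifiers, single quotes for string values) can be sketched as a pair of small helpers. The helper names and myColumn are hypothetical, and doubling an embedded quote character is standard SQL escaping that is assumed here to apply to Pinot as well.

```python
def quote_identifier(name: str) -> str:
    """Wrap a column or table name in double quotes, e.g. a reserved keyword."""
    return '"' + name.replace('"', '""') + '"'


def quote_string(value: str) -> str:
    """Wrap a string literal in single quotes."""
    return "'" + value.replace("'", "''") + "'"


query = (
    f"SELECT {quote_identifier('timestamp')} FROM myTable "
    f"WHERE myColumn = {quote_string('foo')}"
)

if __name__ == "__main__":
    print(query)
```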

ORDER BY using an alias doesn't work?

The fields in the ORDER BY clause must be one of the GROUP BY clauses or aggregations, as written BEFORE the alias is applied. Therefore, this will not work

Instead, this will work
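To illustrate the two cases, here is a pair of hypothetical queries held as Python strings, with a naive helper that rewrites an alias in ORDER BY back to the aggregation it stands for. The column name, alias and helper are assumptions made for illustration only.

```python
# Per the rule above, ordering by the alias is the form that is expected to fail.
REJECTED = "SELECT COUNT(*) AS cnt FROM myTable GROUP BY myColumn ORDER BY cnt"

# Repeating the aggregation expression itself is the form that should work.
ACCEPTED = "SELECT COUNT(*) AS cnt FROM myTable GROUP BY myColumn ORDER BY COUNT(*)"


def expand_order_by_alias(sql: str, alias: str, expression: str) -> str:
    """Replace 'ORDER BY <alias>' with 'ORDER BY <expression>' (plain text substitution)."""
    return sql.replace(f"ORDER BY {alias}", f"ORDER BY {expression}")


if __name__ == "__main__":
    print(expand_order_by_alias(REJECTED, "cnt", "COUNT(*)"))
```

The helper is a plain string substitution, so it only suits simple single-key ORDER BY clauses like the one shown.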

Does pagination work in GROUP BY queries?

No. Pagination only works for SELECTION queries.

How do I increase timeout for a query?

You can add option(timeoutMs=X) at the end of your query, where X is the timeout in milliseconds. For example, option(timeoutMs=20000) gives the query a timeout of 20 seconds.

You can also use SET "timeoutMs" = 20000; SELECT COUNT(*) from myTable

To change the timeout for the entire cluster, set the property pinot.broker.timeoutMs in either the broker configs or the cluster configs (using the POST /cluster/configs API from Swagger).
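The two per-query forms described above can be generated programmatically. This is a minimal sketch with hypothetical helper names; it only builds the query text and does not talk to a broker.

```python
def with_timeout_option(sql: str, timeout_ms: int) -> str:
    """Append the option(timeoutMs=X) suffix to a query."""
    return f"{sql} option(timeoutMs={timeout_ms})"


def with_timeout_set(sql: str, timeout_ms: int) -> str:
    """Prefix a query with the SET "timeoutMs" statement."""
    return f'SET "timeoutMs" = {timeout_ms}; {sql}'


if __name__ == "__main__":
    base = "SELECT COUNT(*) from myTable"
    # 20 seconds expressed in milliseconds.
    print(with_timeout_option(base, 20_000))
    print(with_timeout_set(base, 20_000))
```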

How do I cancel a query?

Add these two configs for Pinot server and broker to start tracking running queries. Tracking entries are added when a query starts and removed when it ends, so they should not consume many resources.

Then use the REST APIs on the Pinot controller to list running queries and cancel them by query ID and broker ID (the query ID is only unique within a broker), like below:

How do I optimize my Pinot table for doing aggregations and group-by on high cardinality columns?

To speed up aggregations, you can enable metrics aggregation on the required column by defining it as a metric field in the corresponding schema and setting aggregateMetrics to true in the table config. You can also use a star-tree index config for such columns.

How do I verify that an index is created on a particular column?

There are 2 ways to verify this:

Log in to a server that hosts segments of this table. Inside the data directory, locate the segment directory for this table. In this directory, there is a file named index_map which lists all the indexes and other data structures created for each segment. Verify that the requested index is present here.
During a query: use the column in a filter predicate and check the value of numEntriesScannedInFilter. If this value is 0, the index is working as expected (this check works for inverted indexes).
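The second check above is easy to automate once the broker response has been parsed into a dictionary. In this sketch the function name and the sample responses are hypothetical; only the numEntriesScannedInFilter field is taken from the text above.

```python
def filter_used_index(response: dict) -> bool:
    """Return True when the filter was answered without scanning entries.

    A missing field is treated as "unknown" and reported as False.
    """
    return response.get("numEntriesScannedInFilter") == 0


if __name__ == "__main__":
    # Hypothetical, trimmed-down responses for illustration only.
    indexed = {"numEntriesScannedInFilter": 0}
    scanned = {"numEntriesScannedInFilter": 125000}
    print(filter_used_index(indexed))  # True
    print(filter_used_index(scanned))  # False
```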

Does Pinot use a default value for LIMIT in queries?

Yes, Pinot uses a default value of LIMIT 10 in queries. This default avoids unintentionally submitting expensive queries that end up fetching or processing a lot of data from Pinot. Users can always override it by explicitly specifying a LIMIT value.

Does Pinot cache query results?

Pinot does not cache query results; each query is computed in its entirety. Note, though, that running the same or a similar query multiple times will naturally pull segment pages into memory, making subsequent calls faster. Also, for realtime systems, the data is changing in real time, so results cannot be cached. For offline-only systems, a caching layer can be built on top of Pinot, with a built-in mechanism to invalidate the cache when data is pushed into Pinot.

I'm noticing that the first query is slower than subsequent queries, why is that?

Pinot memory maps segments. It warms up during the first query, when segments are pulled into the memory by the OS. Subsequent queries will have the segment already loaded in memory, and hence will be faster. The OS is responsible for bringing the segments into memory, and also removing them in favor of other segments when other segments not already in memory are accessed.

How do I determine if StarTree index is being used for my query?

The query execution engine will prefer to use StarTree index for all queries where it can be used. The criteria to determine whether StarTree index can be used is as follows:

All aggregation function + column pairs in the query must exist in the StarTree index.
All dimensions that appear in filter predicates and group-by should be StarTree dimensions.

For queries where both conditions hold, the StarTree index is used. For other queries, the execution engine falls back to the next best index available.
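The two criteria above amount to a pair of subset checks. The sketch below is a simplified model of that rule, not Pinot's actual planner logic; the function name and the example columns are hypothetical.

```python
def can_use_star_tree(query_aggregations, query_dimensions, tree_pairs, tree_dimensions):
    """Apply the two criteria listed above.

    query_aggregations / tree_pairs: (function, column) tuples.
    query_dimensions: columns used in filter predicates and group-by.
    """
    pairs_covered = set(query_aggregations) <= set(tree_pairs)
    dimensions_covered = set(query_dimensions) <= set(tree_dimensions)
    return pairs_covered and dimensions_covered


if __name__ == "__main__":
    # Hypothetical star-tree definition.
    tree_pairs = {("SUM", "impressions"), ("COUNT", "*")}
    tree_dimensions = {"country", "browser"}

    print(can_use_star_tree([("SUM", "impressions")], ["country"], tree_pairs, tree_dimensions))  # True
    print(can_use_star_tree([("MAX", "impressions")], ["country"], tree_pairs, tree_dimensions))  # False
    print(can_use_star_tree([("COUNT", "*")], ["locale"], tree_pairs, tree_dimensions))           # False
```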

Operations FAQ

Memory

How much heap should I allocate for my Pinot instances?

Typically, Pinot components try to use off-heap memory (MMAP/DirectMemory) wherever possible. For example, Pinot servers load segments into memory-mapped files in MMAP mode (recommended), or into direct memory in HEAP mode. Heap memory is used mostly for query execution and for storing some metadata. We have seen production deployments with high throughput and low latency work well with just 16 GB of heap for Pinot servers and brokers. The Pinot controller may also cache some metadata (table configs, etc.) in heap, so if there are just a few tables in the Pinot cluster, a few GB of heap should suffice.

DR

Does Pinot provide any backup/restore mechanism?

Pinot relies on deep storage to hold a backup copy of segments (offline as well as realtime), and on Zookeeper to store metadata (table configs, schema, cluster state, etc.). It does not explicitly provide tools to back up or restore this data; instead it relies on the deep storage (ADLS/S3/GCP/etc.) and ZK to persist the data and metadata.

Alter Table

Can I change a column name in my table, without losing data?

Changing a column name or data type is considered a backward-incompatible change. While Pinot supports schema evolution for backward-compatible changes, it does not support backward-incompatible changes such as changing the name or data type of a column.

How to change number of replicas of a table?

You can change the number of replicas by updating the relevant section of the table config. Make sure you have at least as many servers as the replication.

For OFFLINE table, update

For REALTIME table update

After changing the replication, run a table rebalance.
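As a sketch of what such an update could look like, the snippet below edits a table config held as a Python dictionary. It assumes the setting lives in the segmentsConfig section, under replication for OFFLINE tables and replicasPerPartition for REALTIME tables; these key names are assumptions that may vary by Pinot version, so check them against your own table config. The helper name and the sample config are hypothetical.

```python
import copy


def with_replication(table_config: dict, replicas: int) -> dict:
    """Return a copy of the table config with the replica count changed."""
    updated = copy.deepcopy(table_config)
    # Assumed key names; verify against the table config of your Pinot version.
    key = "replicasPerPartition" if updated.get("tableType") == "REALTIME" else "replication"
    updated.setdefault("segmentsConfig", {})[key] = str(replicas)
    return updated


if __name__ == "__main__":
    offline = {"tableName": "myTable", "tableType": "OFFLINE", "segmentsConfig": {"replication": "1"}}
    print(with_replication(offline, 3)["segmentsConfig"])
```

Working on a copy leaves the original config available for comparison before the change is submitted.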

Rebalance

How to run a rebalance on a table?

Refer to the table rebalance documentation.

Why does my REALTIME table not use the new nodes I added to the cluster?

Likely explanation: num partitions * num replicas < num servers

In realtime tables, segments of the same partition always remain on the same node. This sticky assignment is needed for replica groups and is critical when using upserts. For instance, if you have 3 partitions, 1 replica, and 4 nodes, only 3 of the 4 nodes will be used: all p0 segments will be on one node, p1 on another, and p2 on a third. One server will be unused, and will remain unused through rebalances.
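The arithmetic behind this explanation can be sketched in a few lines. The function names are hypothetical, and the result is an upper bound on how many servers the partition replicas can occupy, based only on the rule stated above.

```python
def servers_used(partitions: int, replicas: int, servers: int) -> int:
    """Upper bound on servers that can host sticky partition replicas."""
    return min(partitions * replicas, servers)


def idle_servers(partitions: int, replicas: int, servers: int) -> int:
    """Servers left without any partition replica assigned."""
    return servers - servers_used(partitions, replicas, servers)


if __name__ == "__main__":
    # The example above: 3 partitions, 1 replica, 4 nodes.
    print(servers_used(3, 1, 4))  # 3
    print(idle_servers(3, 1, 4))  # 1
```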

There’s nothing we can do about CONSUMING segments: they will continue to use only 3 nodes if you have 3 partitions. But we can rebalance so that completed segments use all nodes. If you want to force the completed segments of the table to use the new server, use this config

Segments

How to control number of segments generated?

The number of segments generated depends on the number of input files. If you provide only 1 input file, you will get 1 segment. If you break up the input file into multiple files, you will get as many segments as the input files.

What are the common reasons my segment is in a BAD state?

This typically happens when the server is unable to load the segment. Possible causes include running out of memory, running out of disk space, being unable to download the segment from the deep store, and similar errors. Check the server logs for more information.

How to reset a segment when it runs into a BAD state?

Use the segment reset controller REST API to reset the segment:

How to pause realtime ingestion?

Refer to the documentation on pausing realtime ingestion.

What's the difference between Reset, Refresh, and Reload of a segment?

RESET: this gets a segment in ERROR state back to ONLINE or CONSUMING state. Behind the scenes, Pinot controller takes the segment to OFFLINE state, waits for External View to stabilize, and then moves it back to ONLINE/CONSUMING state, thus effectively resetting segments or consumers in error states.

REFRESH: this replaces the segment with a new one, with the same name but often different data. Under the hood, the Pinot controller sets new segment metadata in Zookeeper, and notifies brokers and servers to check their local state for this segment and update accordingly. Servers also download the new segment to replace the old one when the two have different checksums. There is no separate REST API for refreshing; it is done as part of the SegmentUpload API today.

RELOAD: this reloads the segment, often to generate a new index as updated in the table config. Under the hood, the Pinot server gets the new table config from Zookeeper and uses it to guide the segment reloading. In fact, the last step of REFRESH as explained above is to load the segment into memory to serve queries. There is a dedicated REST API for reloading. By default, it doesn't download the segment, but an option is provided to force the server to download the segment and cleanly replace the local one.

In addition, RESET brings the segment OFFLINE temporarily; while REFRESH and RELOAD swap the segment on server atomically without bringing down the segment or affecting ongoing queries.

Tenants

How can I make brokers/servers join the cluster without the DefaultTenant tag?

Set this property in your controller.conf file

Now your brokers and servers should join the cluster as broker_untagged and server_untagged. You can then directly use the POST /tenants API to create the desired tenants.

Minion

How to tune minion task timeout and parallelism on each worker

There are two task configs, both set as part of the cluster configs like below. One controls the task's overall timeout (1hr by default) and the other controls how many tasks run on a single minion worker (1 by default). The <taskType> is the task to tune, e.g. MergeRollupTask or RealtimeToOfflineSegmentsTask.
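A sketch of such a cluster config payload is shown below. The key suffixes timeoutMs and numConcurrentTasksPerInstance are assumptions about how the two settings are named, so confirm them against your Pinot version before posting the payload with the POST /cluster/configs API. The helper name and the example values are hypothetical.

```python
import json


def minion_task_cluster_configs(task_type: str, timeout_ms: int, concurrent_tasks: int) -> dict:
    """Build cluster config entries for one minion task type."""
    return {
        # Assumed key names: <taskType>.timeoutMs and <taskType>.numConcurrentTasksPerInstance.
        f"{task_type}.timeoutMs": str(timeout_ms),
        f"{task_type}.numConcurrentTasksPerInstance": str(concurrent_tasks),
    }


if __name__ == "__main__":
    payload = minion_task_cluster_configs("MergeRollupTask", 600_000, 4)
    print(json.dumps(payload, indent=2))
```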

How do I manually run a Periodic Task?

Refer to the documentation on running a periodic task manually.

Tuning and Optimizations

Do replica groups work for real-time?

Yes, replica groups work for realtime. There are two parts to enabling replica groups:

Replica group segment assignment
Replica group query routing

Replica group segment assignment

Replica group segment assignment is achieved for realtime tables if the number of servers is a multiple of the number of replicas. The partitions get uniformly sprayed across the servers, creating replica groups. For example, consider a table with 6 partitions, 2 replicas, and 4 servers.

As you can see, the set (S0, S2) contains r1 of every partition, and (S1, S3) contains r2 of every partition. The query will only be routed to one of the sets, and will not span every server. If you are adding servers to or removing servers from an existing table setup, you have to run a rebalance for segment assignment changes to take effect.
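The layout described above can be reproduced with a simple round-robin model. This is an illustrative sketch, not Pinot's actual assignment code: the function name is hypothetical, and the modulo rule is an assumption chosen because it yields the (S0, S2) and (S1, S3) grouping from the example.

```python
def assign_replicas(partitions: int, replicas: int, servers: list) -> dict:
    """Spray partition replicas across servers in round-robin order."""
    count = len(servers)
    return {
        f"p{p}": [servers[(p * replicas + r) % count] for r in range(replicas)]
        for p in range(partitions)
    }


if __name__ == "__main__":
    layout = assign_replicas(6, 2, ["S0", "S1", "S2", "S3"])
    for partition, hosts in layout.items():
        print(partition, hosts)

    # Servers holding the first and second replica of every partition.
    r1_servers = sorted({hosts[0] for hosts in layout.values()})
    r2_servers = sorted({hosts[1] for hosts in layout.values()})
    print(r1_servers)  # ['S0', 'S2']
    print(r2_servers)  # ['S1', 'S3']
```

Because each replica index lands on its own disjoint set of servers, a query routed to one set can reach every partition without touching the other set.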

Replica group query routing

Once replica group segment assignment is in effect, the query routing can take advantage of it. For replica group based query routing, set the following in the table config's section, and then restart brokers

Credential

How to update credential for realtime upstream without downtime

Pause realtime ingestion for the table
Wait for the pause status to report success
Update the credential in the table config
Resume realtime ingestion
