Pinot On Kubernetes FAQ

This page has a collection of frequently asked questions about Pinot on Kubernetes with answers from the community.


This is a list of questions frequently asked in our troubleshooting channel on Slack. To contribute additional questions and answers, make a pull request.

How to increase server disk size on AWS

The following is an example using Amazon Elastic Kubernetes Service (Amazon EKS).

1. Update Storage Class

In the Kubernetes (k8s) cluster, check the storage class: in Amazon EKS, it should be gp2.

Then update the StorageClass to ensure it allows volume expansion:

allowVolumeExpansion: true

Once StorageClass is updated, it should look like this:
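For instance, a gp2 StorageClass with expansion enabled might look roughly like this (a sketch assuming the default EKS gp2 provisioner; names and parameters may differ in your cluster):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  fsType: ext4
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```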

2. Update PVC

Once the storage class is updated, we can update the PersistentVolumeClaim (PVC) to increase the server disk size.

Now we want to double the disk size for pinot-server-3.

The following is an example of current disks:

The following is the output of data-pinot-server-3:

Now, let's change the PVC size to 2T by editing the server PVC:

kubectl edit pvc data-pinot-server-3 -n pinot

Once updated, the specification's PVC size is updated to 2T, but the status's PVC size is still 1T.

3. Restart the pod so the change takes effect

Restart the pinot-server-3 pod:

Recheck the PVC size:


Frequently Asked Questions (FAQs)

This page lists pages with frequently asked questions with answers from the community.


This is a list of questions frequently asked in our troubleshooting channel on Slack. To contribute additional questions and answers, make a pull request.
  • General

  • Pinot On Kubernetes FAQ

  • Ingestion FAQ

  • Query FAQ

  • Operations FAQ

General

This page has a collection of frequently asked questions of a general nature with answers from the community.


This is a list of questions frequently asked in our troubleshooting channel on Slack. To contribute additional questions and answers, make a pull request.

How does Apache Pinot use deep storage?

When data is pushed to Apache Pinot, Pinot makes a backup copy of the data and stores it on the configured deep storage (S3/GCP/ADLS/NFS/etc.). This copy is stored as tar.gz Pinot segments. Note that Pinot servers also keep an untarred copy of the segments on their local disk, for performance reasons.

How does Pinot use Zookeeper?

Pinot uses Apache Helix for cluster management, which in turn is built on top of Zookeeper. Helix uses Zookeeper to store the cluster state, including Ideal State, External View, Participants, and so on. Pinot also uses Zookeeper to store information such as Table configurations, schemas, Segment Metadata, and so on.

Why am I getting a "Could not find or load class" error when running Quickstart using the 0.8.0 release?

Check the JDK version you are using. You may be getting this error if you are using an older version than the one the current Pinot binary release was built with. If so, you have two options: switch to the same JDK release as Pinot was built with, or download the source code for the Pinot release and build it locally.

How to change TimeZone when running Pinot?

There are two ways to do it:

  1. Set an environment variable: TZ=UTC. For example:

export TZ=UTC

  2. Set a JVM argument:

-Duser.timezone=UTC

There is also a plan to add a configuration to change the time zone using cluster config or Pinot component config; see https://github.com/apache/pinot/issues/12299.

Query FAQ

This page has a collection of frequently asked questions about queries with answers from the community.


This is a list of questions frequently asked in our troubleshooting channel on Slack. To contribute additional questions and answers, make a pull request.

Querying

I get the following error when running a query, what does it mean?

{'errorCode': 410, 'message': 'BrokerResourceMissingError'}

This means that the Pinot broker assigned to the table specified in the query was not found. A common root cause is a typo in the table name in the query. A less common reason is that no broker carries the required broker tenant tag for the table.

What are all the fields in the Pinot query's JSON response?

See this page explaining the Pinot response format: https://docs.pinot.apache.org/users/api/querying-pinot-using-standard-sql/response-format.

SQL Query fails with "Encountered 'timestamp' was expecting one of..."

"timestamp" is a reserved keyword in SQL. Escape timestamp with double quotes:

select "timestamp" from myTable

Other commonly encountered reserved keywords are date, time, table.

Filtering on STRING column WHERE column = "foo" does not work?

For filtering on STRING columns, use single quotes:

SELECT COUNT(*) from myTable WHERE column = 'foo'

ORDER BY using an alias doesn't work?

The fields in the ORDER BY clause must be one of the GROUP BY clauses or aggregations, BEFORE applying the alias. Therefore, this will not work:

SELECT count(colA) as aliasA, colA from tableA GROUP BY colA ORDER BY aliasA

But this will work:

SELECT count(colA) as sumA, colA from tableA GROUP BY colA ORDER BY count(colA)

Does pagination work in GROUP BY queries?

No. Pagination only works for SELECTION queries.

How do I increase the timeout for a query?

You can add this at the end of your query: option(timeoutMs=X). The following example uses a timeout of 20 seconds for the query:

SELECT COUNT(*) from myTable option(timeoutMs=20000)

You can also use SET "timeoutMs" = 20000; SELECT COUNT(*) from myTable.

To change the timeout for the entire cluster, set the property pinot.broker.timeoutMs in either the broker configs or the cluster configs (using the POST /cluster/configs API from Swagger).

How do I cancel a query?

Add these two configs for the Pinot server and broker to start tracking running queries. Tracking entries are added when a query starts and cleaned up when it ends, so they should not consume many resources.

pinot.server.enable.query.cancellation=true // false by default
pinot.broker.enable.query.cancellation=true // false by default

Then use the REST APIs on the Pinot controller to list running queries and cancel them via the query ID and broker ID (the query ID is only local to a broker), as in the following:

GET /queries: shows running queries as tracked by all brokers.

Response example: `{
  "Broker_192.168.0.105_8000": {
    "7": "select G_old from baseballStats limit 10",
    "8": "select G_old from baseballStats limit 100"
  }
}`

DELETE /query/{brokerId}/{queryId}[?verbose=false/true]: cancels a running query
with queryId and brokerId. verbose is false by default; if set to true,
responses from servers running the query are also returned.

Response example: `Cancelled query: 8 with responses from servers:
{192.168.0.105:7501=404, 192.168.0.105:7502=200, 192.168.0.105:7500=200}`

How do I optimize my Pinot table for doing aggregations and group-by on high-cardinality columns?

In order to speed up aggregations, you can enable metrics aggregation on the required column by adding a metric field in the corresponding schema and setting aggregateMetrics to true in the table configuration. You can also use a star-tree index config for columns like these (see here for more about star-tree).

How do I verify that an index is created on a particular column?

There are two ways to verify this:

  1. Log in to a server that hosts segments of this table. Inside the data directory, locate the segment directory for this table. In this directory, there is a file named index_map which lists all the indexes and other data structures created for each segment. Verify that the requested index is present here.

  2. During query: Use the column in the filter predicate and check the value of numEntriesScannedInFilter. If this value is 0, then indexing is working as expected (works for Inverted index).

Does Pinot use a default value for LIMIT in queries?

Yes, Pinot uses a default value of LIMIT 10 in queries. The reason behind this default value is to avoid unintentionally submitting expensive queries that end up fetching or processing a lot of data from Pinot. Users can always overwrite this by explicitly specifying a LIMIT value.
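For example (using the hypothetical myTable from elsewhere on this page):

```sql
-- returns at most 10 rows because of the implicit LIMIT 10
SELECT colA FROM myTable

-- explicitly fetch up to 1000 rows
SELECT colA FROM myTable LIMIT 1000
```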

Does Pinot cache query results?

Pinot does not cache query results; each query is computed in its entirety. Note, though, that running the same or a similar query multiple times will naturally pull segment pages into memory, making subsequent calls faster. Also, for real-time systems the data is changing in real time, so results cannot be cached. For offline-only systems, a caching layer can be built on top of Pinot, with an invalidation mechanism to invalidate the cache when data is pushed into Pinot.

I'm noticing that the first query is slower than subsequent queries. Why is that?

Pinot memory maps segments. It warms up during the first query, when segments are pulled into the memory by the OS. Subsequent queries will have the segment already loaded in memory, and hence will be faster. The OS is responsible for bringing the segments into memory, and also removing them in favor of other segments when other segments not already in memory are accessed.

How do I determine if the star-tree index is being used for my query?

The query execution engine will prefer to use the star-tree index for all queries where it can be used. The criteria to determine whether the star-tree index can be used is as follows:

  • All aggregation function + column pairs in the query must exist in the star-tree index.

  • All dimensions that appear in filter predicates and group-by should be star-tree dimensions.

For queries where the above is true, the star-tree index is used. For other queries, the execution engine will default to the next best index available.


Operations FAQ

This page has a collection of frequently asked questions about operations with answers from the community.


This is a list of questions frequently asked in our troubleshooting channel on Slack. To contribute additional questions and answers, make a pull request.

Memory

How much heap should I allocate for my Pinot instances?

Typically, Apache Pinot components try to use off-heap memory (MMAP/DirectMemory) wherever possible. For example, Pinot servers load segments in memory-mapped files in MMAP mode (recommended), or in direct memory in HEAP mode. Heap memory is used mostly for query execution and storing some metadata. We have seen production deployments with high throughput and low latency work well with just 16 GB of heap for Pinot servers and brokers. The Pinot controller may also cache some metadata (table configurations, etc.) in heap, so if there are just a few tables in the Pinot cluster, a few GB of heap should suffice.

DR

Does Pinot provide any backup/restore mechanism?

Pinot relies on deep storage for storing a backup copy of segments (offline as well as real-time). It relies on Zookeeper to store metadata (table configurations, schema, cluster state, and so on). It does not explicitly provide tools to back up or restore this data, but relies on the deep storage (ADLS/S3/GCP/etc.) and ZK to persist it.

Alter Table

Can I change a column name in my table, without losing data?

Changing a column name or data type is considered a backward-incompatible change. While Pinot supports schema evolution for backward-compatible changes, it does not support backward-incompatible changes like changing the name or data type of a column.

How to change the number of replicas of a table?

You can change the number of replicas by updating the table configuration's segmentsConfig section. Make sure you have at least as many servers as the replication.

For offline tables, update replication:

{ 
    "tableName": "pinotTable", 
    "tableType": "OFFLINE", 
    "segmentsConfig": {
      "replication": "3", 
      ... 
    }
    ..

For real-time tables, update replicasPerPartition:

{ 
    "tableName": "pinotTable", 
    "tableType": "REALTIME", 
    "segmentsConfig": {
      "replicasPerPartition": "3", 
      ... 
    }
    ..

After changing the replication, run a table rebalance.

Note that if you are using replica groups, these values are expected to equal numReplicaGroups. If they do not match, Pinot will use numReplicaGroups.

How to set or change table retention?

By default there is no retention set for a table in Apache Pinot. You may, however, set retention by setting the following properties in the segmentsConfig section inside the table config:

  • retentionTimeUnit

  • retentionTimeValue
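For example, to keep 30 days of data (the values here are illustrative):

```json
"segmentsConfig": {
  "retentionTimeUnit": "DAYS",
  "retentionTimeValue": "30",
  ...
}
```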

Updating the retention value in the table config should be good enough, there is no need to rebalance the table or reload its segments.

Rebalance

How to run a rebalance on a table?

See Rebalance.

Why does my real-time table not use the new nodes I added to the cluster?

Likely explanation: num partitions * num replicas < num servers.

In real-time tables, segments of the same partition always remain on the same node. This sticky assignment is needed for replica groups and is critical if using upserts. For instance, if you have 3 partitions, 1 replica, and 4 nodes, only 3 of the 4 nodes will be used: all of p0's segments will be on one node, p1's on another, and p2's on another. One server will be unused, and will remain unused through rebalances.

There's nothing we can do about CONSUMING segments; they will continue to use only 3 nodes if you have 3 partitions. But we can rebalance such that completed segments use all nodes. If you want to force the completed segments of the table to use the new server, use this config:

"instanceAssignmentConfigMap": {
      "COMPLETED": {
        "tagPoolConfig": {
          "tag": "DefaultTenant_OFFLINE"
        },
        "replicaGroupPartitionConfig": {
        }
      }
    },

Segments

How to control the number of segments generated?

The number of segments generated depends on the number of input files. If you provide only 1 input file, you will get 1 segment. If you break up the input file into multiple files, you will get as many segments as the input files.

What are the common reasons my segment is in a BAD state?

This typically happens when the server is unable to load the segment. Possible causes: out-of-memory, no disk space, unable to download segment from deep-store, and similar other errors. Check server logs for more information.

How to reset a segment when it runs into a BAD state?

Use the segment reset controller REST API to reset the segment:

curl -X POST "{host}/segments/{tableNameWithType}/{segmentName}/reset"

How do I pause real-time ingestion?

Refer to Pause Stream Ingestion.

What's the difference between Reset, Refresh, and Reload?

  • Reset: Gets a segment in ERROR state back to ONLINE or CONSUMING state. Behind the scenes, the Pinot controller takes the segment to the OFFLINE state, waits for the External View to stabilize, and then moves it back to ONLINE or CONSUMING state, thus effectively resetting segments or consumers in error states.

  • Refresh: Replaces the segment with a new one, with the same name but often different data. Under the hood, the Pinot controller sets new segment metadata in Zookeeper, and notifies brokers and servers to check their local states about this segment and update accordingly. Servers also download the new segment to replace the old one, when both have different checksums. There is no separate REST API for refreshing; it is done as part of the SegmentUpload API.

  • Reload: Loads the segment again, often to generate a new index as updated in the table configuration. Under the hood, the Pinot server gets the new table configuration from Zookeeper and uses it to guide the segment reloading. In fact, the last step of REFRESH as explained above is to load the segment into memory to serve queries. There is a dedicated REST API for reloading. By default, it doesn't download segments, but an option is provided to force the server to download the segment to replace the local one cleanly.

In addition, RESET brings the segment OFFLINE temporarily; while REFRESH and RELOAD swap the segment on the server atomically without bringing down the segment or affecting ongoing queries.

Tenants

How can I make brokers/servers join the cluster without the DefaultTenant tag?

Set this property in your controller.conf file:

cluster.tenant.isolation.enable=false

Now your brokers and servers should join the cluster as broker_untagged and server_untagged. You can then directly use the POST /tenants API to create the tenants you want, as in the following:

curl -X POST "http://localhost:9000/tenants" 
-H "accept: application/json" 
-H "Content-Type: application/json" 
-d "{\"tenantRole\":\"BROKER\",\"tenantName\":\"foo\",\"numberOfInstances\":1}"

Minion

How do I tune minion task timeout and parallelism on each worker?

There are two task configurations, set as part of the cluster configuration, as in the following example. One controls the task's overall timeout (1 hour by default) and the other sets how many tasks can run on a single minion worker (1 by default). <taskType> is the task to tune, such as MergeRollupTask or RealtimeToOfflineSegmentsTask.

Using the "POST /cluster/configs" API on the CLUSTER tab in Swagger, with this payload:

{
	"<taskType>.timeoutMs": "600000",
	"<taskType>.numConcurrentTasksPerInstance": "4"
}

How do I manually run a Periodic Task?

See Running a Periodic Task Manually.

Tuning and Optimizations

Do replica groups work for real-time?

Yes, replica groups work for real-time tables. There are two parts to enabling replica groups:

  1. Replica groups segment assignment.

  2. Replica group query routing.

Replica group segment assignment

Replica group segment assignment is achieved for real-time tables if the number of servers is a multiple of the number of replicas. The partitions get uniformly sprayed across the servers, creating replica groups. For example, consider a table with 6 partitions, 2 replicas, and 4 servers:

        r1    r2
  p1    S0    S1
  p2    S2    S3
  p3    S0    S1
  p4    S2    S3
  p5    S0    S1
  p6    S2    S3

As you can see, the set (S0, S2) contains r1 of every partition, and (S1, S3) contains r2 of every partition. A query will only be routed to one of the sets, and will not span every server. If you are adding or removing servers from an existing table setup, you have to run a rebalance for segment assignment changes to take effect.

Replica group query routing

Once replica group segment assignment is in effect, query routing can take advantage of it. For replica-group-based query routing, set the following in the table config's routing section, and then restart the brokers:

{
    "tableName": "pinotTable", 
    "tableType": "REALTIME",
    "routing": {
        "instanceSelectorType": "replicaGroup"
    }
    ..
}

Overwrite index configs at tier level

When using tiered storage, you may want different encoding and indexing types for a column in different tiers, to balance query latency and cost savings more flexibly. For example, segments in the hot tier can use dictionary encoding, bloom filters, and all relevant index types for very fast query execution. But for segments in the cold tier, where cost saving matters more than low query latency, you may want to use raw values and bloom filters only.

The following two examples show how to overwrite encoding type and index configs for tiers. Similar changes are also demonstrated in the MultiDirQuickStart example.

  1. Overwriting single-column index configs using fieldConfigList. All top-level fields in the FieldConfig class can be overwritten; fields not overwritten are kept intact.

{
  ...
  "fieldConfigList": [    
    {
      "name": "ArrTimeBlk",
      "encodingType": "DICTIONARY",
      "indexes": {
        "inverted": {
          "enabled": "true"
        }
      },
      "tierOverwrites": {
        "hotTier": {
          "encodingType": "DICTIONARY",
          "indexes": { // change index types for this tier
            "bloom": {
              "enabled": "true"
            }
          }
        },
        "coldTier": {
          "encodingType": "RAW", // change encoding type for this tier
          "indexes": { } // remove all indexes
        }
      }
    }
  ],

  2. Overwriting the star-tree index configuration using tableIndexConfig. StarTreeIndexConfigs is overwritten as a whole. In fact, all top-level fields defined in the IndexingConfig class can be overwritten, so single-column index configs defined in tableIndexConfig can also be overwritten, but that is less clear than using fieldConfigList.

  "tableIndexConfig": {
    "starTreeIndexConfigs": [
      {
        "dimensionsSplitOrder": [
          "AirlineID",
          "Origin",
          "Dest"
        ],
        "skipStarNodeCreationForDimensions": [],
        "functionColumnPairs": [
          "COUNT__*",
          "MAX__ArrDelay"
        ],
        "maxLeafRecords": 10
      }
    ],
...
    "tierOverwrites": {
      "hotTier": {
        "starTreeIndexConfigs": [ // create a different star-tree index on this tier
          {
            "dimensionsSplitOrder": [
              "Carrier",
              "CancellationCode",
              "Origin",
              "Dest"
            ],
            "skipStarNodeCreationForDimensions": [],
            "functionColumnPairs": [
              "MAX__CarrierDelay",
              "AVG__CarrierDelay"
            ],
            "maxLeafRecords": 10
          }
        ]
      },
      "coldTier": {
        "starTreeIndexConfigs": [] // removes the star-tree index for this tier
      }
    }
  },
 ...

Credential

How do I update credentials for real-time upstream without downtime?

  1. Pause the stream ingestion.

  2. Wait for the pause status to change to success.

  3. Update the credential in the table config.

  4. Resume the consumption.

Ingestion FAQ

This page has a collection of frequently asked questions about ingestion with answers from the community.


This is a list of questions frequently asked in our troubleshooting channel on Slack. To contribute additional questions and answers, make a pull request.

Data processing

What is a good segment size?

While Apache Pinot can work with segments of various sizes, for optimal use of Pinot, you want to get your segments sized in the 100MB to 500MB (un-tarred/uncompressed) range. Having too many (thousands or more) tiny segments for a single table creates overhead in terms of the metadata storage in Zookeeper as well as in the Pinot servers' heap. At the same time, having too few really large (GBs) segments reduces parallelism of query execution, as on the server side, the thread parallelism of query execution is at segment level.

Can multiple Pinot tables consume from the same Kafka topic?

Yes. Each table can be independently configured to consume from any given Kafka topic, regardless of whether there are other tables that are also consuming from the same Kafka topic.

If I add a partition to a Kafka topic, will Pinot automatically ingest data from this partition?

Pinot automatically detects new partitions in Kafka topics. It checks for new partitions whenever RealtimeSegmentValidationManager periodic job runs and starts consumers for new partitions.

You can configure the interval for this job using the controller.realtime.segment.validation.frequencyPeriod property in the controller configuration.
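For example, in controller.conf (the period value format shown is an assumption; check the controller configuration reference for your Pinot version):

```
controller.realtime.segment.validation.frequencyPeriod=1h
```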

Does Pinot support partition pruning on multiple partition columns?

Pinot supports multi-column partitioning for offline tables. Map multiple columns under tableIndexConfig.segmentPartitionConfig.columnPartitionMap. Pinot assigns the input data to each partition according to the partition configuration individually for each column.

The following example partitions the segment based on two columns, memberId and caseNumber. Note that each partition column is handled separately, so in this case the segment is partitioned on memberId (Modulo, 3 partitions) and also partitioned on caseNumber (Murmur, 12 partitions).

"tableIndexConfig": {
      ..
      "segmentPartitionConfig": {
        "columnPartitionMap": {
          "memberId": {
            "functionName": "Modulo",
            "numPartitions": 3 
          },
          "caseNumber": {
            "functionName": "Murmur",
            "numPartitions": 12 
          }
        }
      }

For multi-column partitioning to work, you must also set routing.segmentPrunerTypes as follows:

"routing": {
      "segmentPrunerTypes": ["partition"]
    }

How do I enable partitioning in Pinot when using Kafka stream?

Set up the partitioner in the Kafka producer: https://docs.confluent.io/current/clients/producer.html

The partitioning logic in the stream should match the partitioning config in Pinot. Kafka uses murmur2; the equivalent in Pinot is the Murmur function.

Set the partitioning configuration as below, using the same column used in Kafka:

"tableIndexConfig": {
      ..
      "segmentPartitionConfig": {
        "columnPartitionMap": {
          "column_foo": {
            "functionName": "Murmur",
            "numPartitions": 12 // same as number of kafka partitions
          }
        }
      }

and also set:

"routing": {
      "segmentPrunerTypes": ["partition"]
    }

To learn how partitioning works, see routing tuning.

How do I store BYTES column in JSON data?

For JSON, you can use a hex encoded string to ingest BYTES.
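For example, hex-encoding arbitrary bytes before placing them in the JSON payload can be done with any standard library; here is a sketch in Python (a generic illustration, not a Pinot API):

```python
# Hex-encode raw bytes for a JSON string field, and decode them back.
raw = bytes([0xDE, 0xAD, 0xBE, 0xEF])
encoded = raw.hex()              # "deadbeef" goes into the JSON payload
decoded = bytes.fromhex(encoded) # round-trips back to the original bytes
assert decoded == raw
print(encoded)
```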

How do I flatten my JSON Kafka stream?

See the json_format(field) function, which can store a top-level JSON field as a STRING in Pinot.

Then you can use json functions during query time to extract fields from the JSON string.


NOTE: This works well if some of your fields are nested JSON, but most of your fields are top-level JSON keys. If all of your fields are within a nested JSON key, you will have to store the entire payload as one column, which is not ideal.

How do I escape Unicode in my Job Spec YAML file?

To use explicit code points, you must double-quote (not single-quote) the string, and escape the code point via "\uHHHH", where HHHH is the four-digit hex code for the character. See https://yaml.org/spec/spec.html#escaping/in%20double-quoted%20scalars/ for more details.

Is there a limit on the maximum length of a string column in Pinot?

By default, Pinot limits the length of a STRING column to 512 bytes. If you want to overwrite this value, you can set the maxLength attribute in the schema as follows:

{
  "dataType": "STRING",
  "maxLength": 1000,
  "name": "textDim1"
},

When are new events queryable when getting ingested into a real-time table?

Events are available to queries as soon as they are ingested. This is because events are instantly indexed in memory upon ingestion.

The ingestion of events into the real-time table is not transactional, so replicas of the open segment are not immediately consistent. Pinot trades consistency for availability upon network partitioning (CAP theorem) to provide ultra-low ingestion latencies at high throughput.

However, when the open segment is closed and its in-memory indexes are flushed to persistent storage, all its replicas are guaranteed to be consistent, with the commit protocol.

How to reset a CONSUMING segment stuck on an offset which has expired from the stream?

This typically happens if:

  1. The consumer is lagging a lot.

  2. The consumer was down (server down, cluster down), and the stream moved on, resulting in offset not found when consumer comes back up.

In the case of Kafka, to recover, set the property "auto.offset.reset":"earliest" in the streamConfigs section and reset the CONSUMING segment. See Real-time table configs for more details about the configuration.
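For example, in the streamConfigs section of the real-time table config:

```json
"streamConfigs": {
  ...
  "auto.offset.reset": "earliest"
}
```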

You can also use the "Resume Consumption" endpoint with the "resumeFrom" parameter set to "smallest" (or "largest" if you want). See Pause Stream Ingestion for more details.

Indexing

How to set inverted indexes?

Inverted indexes are set in the table config's tableIndexConfig -> invertedIndexColumns list. For more info on table configuration, see the Table Config Reference. For an example showing how to configure an inverted index, see Inverted Index.

Applying inverted indexes to a table configuration will generate an inverted index for all new segments. To apply inverted indexes to all existing segments, see How to apply an inverted index to existing segments? below.

How to apply an inverted index to existing segments?

  1. Add the columns you want to index to the tableIndexConfig -> invertedIndexColumns list. To update the table configuration, use the Pinot Swagger API: http://localhost:9000/help#!/Table/updateTableConfig.

  2. Invoke the reload API: http://localhost:9000/help#!/Segment/reloadAllSegments.

Once you've done that, you can check whether the index has been applied by querying the segment metadata API at http://localhost:9000/help#/Segment/getServerMetadata. Don't forget to include the names of the columns on which you applied the index.

The output from this API should look something like the following:

{
  "<segment-name>": {
    "segmentName": "<segment-name>",
    "indexes": {
      "<columnName>": {
        "bloom-filter": "NO",
        "dictionary": "YES",
        "forward-index": "YES",
        "inverted-index": "YES",
        "null-value-vector-reader": "NO",
        "range-index": "NO",
        "json-index": "NO"
      }
    }
  }
}

Can I retrospectively add an index to any segment?

Not all indexes can be retrospectively applied to existing segments.

If you want to add or change the sorted index column, or adjust the dictionary encoding of the default forward index, you will need to manually re-load any existing segments.

How to create star-tree indexes?

Star-tree indexes are configured in the table config under tableIndexConfig -> starTreeIndexConfigs (list) and enableDefaultStarTree (boolean). See here for more about how to configure star-tree indexes: https://docs.pinot.apache.org/basics/indexing/star-tree-index#index-generation

The new segments will have star-tree indexes generated after applying the star-tree index configurations to the table configuration.

Handling time in Pinot

How does Pinot’s real-time ingestion handle out-of-order events?

Pinot does not require ordering of event timestamps. Out-of-order events are still consumed and indexed into the "currently consuming" segment. In a pathological case, if a 2-day-old event comes in "now", it will still be stored in the segment that is open for consumption "now". There is no strict time-based partitioning for segments, but star-tree indexes and hybrid tables will handle this as appropriate.

See the Components > Broker documentation for more details about how hybrid tables handle this. Specifically, the time boundary is computed as max(OfflineTime) - 1 unit of granularity. Pinot does store the min-max time for each segment and uses it for pruning segments, so segments with multiple time intervals may not be perfectly pruned.

When generating star-indexes, the time column will be part of the star-tree so the tree can still be efficiently queried for segments with multiple time intervals.

What is the purpose of a hybrid table not using max(OfflineTime) to determine the time-boundary, and instead using an offset?

This lets you have old events come in without building complex offline pipelines that perfectly partition your events by event timestamp. With this offset, even if your offline data pipeline produces segments with a recent maximum timestamp, Pinot will not use the offline dataset for that last chunk of segments. The expectation is that when you process the next time range of data offline, your data pipeline will include any late events.

Why are segments not strictly time-partitioned?

It might seem odd that segments are not strictly time-partitioned, unlike similar systems such as Apache Druid. This allows real-time ingestion to consume out-of-order events. Even though segments are not strictly time-partitioned, Pinot will still index, prune, and query segments intelligently by time intervals for the performance of hybrid tables and time-filtered data.

When generating offline segments, the segments are generated such that each segment only contains one time interval and is well partitioned by the time column.
