Ingestion FAQ
This page collects frequently asked questions about ingestion, with answers from the community.
Data processing
What is a good segment size?
Can multiple Pinot tables consume from the same Kafka topic?
If I add a partition to a Kafka topic, will Pinot automatically ingest data from this partition?
Does Pinot support partition pruning on multiple partition columns?
"tableIndexConfig": {
  ..
  "segmentPartitionConfig": {
    "columnPartitionMap": {
      "memberId": {
        "functionName": "Modulo",
        "numPartitions": 3
      },
      "caseNumber": {
        "functionName": "Murmur",
        "numPartitions": 12
      }
    }
  }
}
How do I enable partitioning in Pinot when using Kafka stream?
How do I store BYTES column in JSON data?
How do I flatten my JSON Kafka stream?
How do I escape Unicode in my Job Spec YAML file?
Is there a limit on the maximum length of a string column in Pinot?
When are new events queryable when getting ingested into a real-time table?
How to reset a CONSUMING segment stuck on an offset which has expired from the stream?
Indexing
How to set inverted indexes?
How to apply an inverted index to existing segments?
Can I retrospectively add an index to any segment?
How to create star-tree indexes?
Handling time in Pinot
How does Pinot’s real-time ingestion handle out-of-order events?
What is the purpose of a hybrid table not using max(OfflineTime) to determine the time-boundary, and instead using an offset?
Why are segments not strictly time-partitioned?