Ingest streaming data from Apache Kafka
This guide shows you how to ingest a stream of records from an Apache Kafka topic into a Pinot table.
Install and Launch Kafka
If you are running Pinot in Docker, run Kafka on the same network:

```bash
docker pull wurstmeister/kafka:latest

docker run --network pinot-demo --name=kafka \
  -e KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181/kafka \
  -e KAFKA_BROKER_ID=0 \
  -e KAFKA_ADVERTISED_HOST_NAME=kafka \
  wurstmeister/kafka:latest
```

Or run Kafka locally using the launcher scripts:

```bash
tar -xzf kafka_2.13-3.7.0.tgz
cd kafka_2.13-3.7.0
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
```

Data Source
The following Python script generates a stream of events, each with a millisecond timestamp, a UUID, and a random count, and writes them to stdout as JSON, one record per line:

```python
import datetime
import json
import random
import uuid

while True:
    # Current time as epoch milliseconds
    ts = int(datetime.datetime.now().timestamp() * 1000)
    id = str(uuid.uuid4())
    count = random.randint(0, 1000)
    print(json.dumps({"ts": ts, "uuid": id, "count": count}))
```
Ingesting Data into Kafka
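A minimal way to get the generated events into Kafka is to create a topic and pipe the generator script into the console producer. The topic name `events` and the script filename `datagen.py` are illustrative; adjust the broker address to `kafka:9092` if you are using the Docker setup above:

```bash
# Create the topic
bin/kafka-topics.sh --create --topic events \
  --bootstrap-server localhost:9092 \
  --partitions 1 --replication-factor 1

# Pipe the generator's JSON lines into the topic
python datagen.py | bin/kafka-console-producer.sh \
  --bootstrap-server localhost:9092 --topic events
```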
Schema
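A Pinot schema matching the generated events might look like the following sketch. The field names mirror the script above; the schema name `events` is an assumption:

```json
{
  "schemaName": "events",
  "dimensionFieldSpecs": [
    {"name": "uuid", "dataType": "STRING"}
  ],
  "metricFieldSpecs": [
    {"name": "count", "dataType": "INT"}
  ],
  "dateTimeFieldSpecs": [
    {
      "name": "ts",
      "dataType": "TIMESTAMP",
      "format": "1:MILLISECONDS:EPOCH",
      "granularity": "1:MILLISECONDS"
    }
  ]
}
```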
Table Config
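A REALTIME table config pointing at the Kafka topic could look like this sketch; the table name, topic name, and broker address are assumptions to adapt to your setup:

```json
{
  "tableName": "events",
  "tableType": "REALTIME",
  "segmentsConfig": {
    "timeColumnName": "ts",
    "schemaName": "events",
    "replication": "1",
    "replicasPerPartition": "1"
  },
  "tenants": {},
  "tableIndexConfig": {
    "loadMode": "MMAP",
    "streamConfigs": {
      "streamType": "kafka",
      "stream.kafka.topic.name": "events",
      "stream.kafka.broker.list": "kafka:9092",
      "stream.kafka.consumer.type": "lowlevel",
      "stream.kafka.consumer.prop.auto.offset.reset": "smallest",
      "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
      "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder"
    }
  },
  "metadata": {}
}
```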
Create schema and table
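With Pinot running, one way to register both is the `AddTable` admin command (file paths are illustrative):

```bash
bin/pinot-admin.sh AddTable \
  -schemaFile /path/to/schema.json \
  -tableConfigFile /path/to/table.json \
  -exec
```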
Querying
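Once the table is consuming, you can query it from the Pinot query console. For example, assuming the table name used above:

```sql
SELECT *
FROM events
ORDER BY ts DESC
LIMIT 10
```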
Kafka ingestion guidelines
Kafka versions in Pinot
Upgrade from Kafka 0.9 connector to Kafka 2.x connector
How to consume from a Kafka version > 2.0.0
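Selecting the Kafka 2.x connector is a matter of pointing the consumer factory at the `kafka20` plugin in your table's `streamConfigs`. A minimal sketch (topic and broker values are illustrative):

```json
{
  "streamType": "kafka",
  "stream.kafka.topic.name": "events",
  "stream.kafka.broker.list": "kafka:9092",
  "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory"
}
```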
Kafka configurations in Pinot
Use Kafka partition (low) level consumer with SSL
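When the brokers require TLS, the standard Kafka client SSL properties can be supplied alongside the usual `streamConfigs` entries. A sketch, with placeholder paths and passwords:

```json
{
  "streamType": "kafka",
  "stream.kafka.topic.name": "events",
  "stream.kafka.broker.list": "kafka:9093",
  "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
  "security.protocol": "SSL",
  "ssl.truststore.location": "/path/to/truststore.jks",
  "ssl.truststore.password": "changeit",
  "ssl.keystore.location": "/path/to/keystore.jks",
  "ssl.keystore.password": "changeit",
  "ssl.key.password": "changeit"
}
```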
Consume transactionally-committed messages
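To read only messages from committed transactions and skip records from aborted ones, set the consumer isolation level in `streamConfigs`. The key below follows the Kafka connector's convention; verify it against your Pinot version:

```json
{
  "kafka.isolation.level": "read_committed"
}
```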
Use Kafka partition (low) level consumer with SASL_SSL
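For brokers that authenticate clients over SASL_SSL, the standard Kafka client SASL properties go into `streamConfigs` the same way. A sketch with placeholder credentials:

```json
{
  "security.protocol": "SASL_SSL",
  "sasl.mechanism": "PLAIN",
  "sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"my-user\" password=\"my-password\";",
  "ssl.truststore.location": "/path/to/truststore.jks",
  "ssl.truststore.password": "changeit"
}
```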
Extract record headers as Pinot table columns
The connector maps parts of each Kafka record to Pinot columns roughly as follows (column names are per Pinot's Kafka connector; verify against your version):

| Kafka Record | Pinot Table Column | Description |
| --- | --- | --- |
| Record key (any type) | `__key` (STRING) | The record key, assumed to be a UTF-8 encoded string |
| Record headers (map of string to bytes) | One column per header key: `__header$HeaderKeyName` (STRING) | Each header value, assumed to be a UTF-8 encoded string |
| Record offset (long) | `__metadata$offset` (STRING) | The Kafka offset of the record |
| Record timestamp (long) | `__metadata$recordTimestamp` (STRING) | The Kafka record timestamp |
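Header and metadata extraction is driven by a flag in `streamConfigs`; the surrounding entries below are illustrative:

```json
{
  "streamType": "kafka",
  "stream.kafka.topic.name": "events",
  "stream.kafka.metadata.populate": "true"
}
```

Note that the extracted columns only land in the table if they are also declared in the table schema.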
Tell Pinot where to find an Avro schema
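For Avro payloads, the decoder needs to know where the writer schema lives. One common setup, assuming a Confluent Schema Registry, points the decoder at the registry URL via `streamConfigs`; the class name and property key are from Pinot's Avro input-format plugin and the URL is a placeholder:

```json
{
  "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.inputformat.avro.confluent.KafkaConfluentSchemaRegistryAvroMessageDecoder",
  "stream.kafka.decoder.prop.schema.registry.rest.url": "http://localhost:8081"
}
```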