Stream Ingestion (Docker)

Step-by-step guide for streaming ingestion into Pinot running in Docker

This guide walks you through setting up real-time stream ingestion into a Pinot cluster running in Docker. Make sure you have completed Running Pinot in Docker first.

Set up Kafka

Pinot has out-of-the-box real-time ingestion support for Kafka. Other streams can be plugged in as well; see Pluggable Streams for details.

Start Kafka:

docker run \
    --network pinot-demo --name=kafka \
    -e KAFKA_ZOOKEEPER_CONNECT=manual-zookeeper:2181/kafka \
    -e KAFKA_BROKER_ID=0 \
    -e KAFKA_ADVERTISED_HOST_NAME=kafka \
    -d bitnami/kafka:latest

Create a Kafka topic:

docker exec \
  -t kafka \
  /opt/kafka/bin/kafka-topics.sh \
  --zookeeper manual-zookeeper:2181/kafka \
  --partitions=1 --replication-factor=1 \
  --create --topic transcript-topic
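As a quick sanity check (not part of the original steps), you can list the topics Kafka knows about; transcript-topic should appear in the output. The guard makes this a no-op on machines where Docker is not running:

```shell
# Optional sanity check: list Kafka topics; expect "transcript-topic".
# Guarded so it is skipped where the Docker daemon is unavailable.
if docker ps >/dev/null 2>&1; then
  docker exec -t kafka \
    /opt/kafka/bin/kafka-topics.sh \
    --zookeeper manual-zookeeper:2181/kafka \
    --list
fi
checked=1
```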

Create a schema

If you already pushed a schema during the Batch ingestion example, you can reuse it. Otherwise, see Creating a schema to learn how to create one.
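If you need a schema to follow along, the sketch below writes a minimal one matching the transcript example used in these guides; the field names and types are illustrative, so adapt them to your own data:

```shell
# Write a minimal example schema (fields assumed from the transcript example).
cat > transcript-schema.json <<'EOF'
{
  "schemaName": "transcript",
  "dimensionFieldSpecs": [
    {"name": "studentID", "dataType": "INT"},
    {"name": "firstName", "dataType": "STRING"},
    {"name": "lastName", "dataType": "STRING"},
    {"name": "gender", "dataType": "STRING"},
    {"name": "subject", "dataType": "STRING"}
  ],
  "metricFieldSpecs": [
    {"name": "score", "dataType": "FLOAT"}
  ],
  "dateTimeFieldSpecs": [
    {
      "name": "timestampInEpoch",
      "dataType": "LONG",
      "format": "1:MILLISECONDS:EPOCH",
      "granularity": "1:MILLISECONDS"
    }
  ]
}
EOF
```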

Create a table configuration
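The page does not inline the config itself; as a sketch, a REALTIME table config wired to the Kafka topic above looks like the following (broker address, topic name, and flush thresholds are assumptions based on the containers started in this guide):

```shell
# Write an example REALTIME table config (values assume the kafka container above).
cat > transcript-table-realtime.json <<'EOF'
{
  "tableName": "transcript",
  "tableType": "REALTIME",
  "segmentsConfig": {
    "timeColumnName": "timestampInEpoch",
    "timeType": "MILLISECONDS",
    "schemaName": "transcript",
    "replicasPerPartition": "1"
  },
  "tenants": {},
  "tableIndexConfig": {
    "loadMode": "MMAP",
    "streamConfigs": {
      "streamType": "kafka",
      "stream.kafka.consumer.type": "lowlevel",
      "stream.kafka.topic.name": "transcript-topic",
      "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
      "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
      "stream.kafka.broker.list": "kafka:9092",
      "realtime.segment.flush.threshold.rows": "0",
      "realtime.segment.flush.threshold.time": "24h",
      "realtime.segment.flush.threshold.segment.size": "50M",
      "stream.kafka.consumer.prop.auto.offset.reset": "smallest"
    }
  },
  "metadata": {}
}
EOF
```

The streamConfigs block is what makes this a streaming table: it names the topic, the broker, and the JSON decoder that turns each Kafka message into a Pinot row.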

Upload the schema and table configuration

As soon as the real-time table is created, it will begin ingesting from the Kafka topic.
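One way to do the upload is pinot-admin's AddTable command, sketched below; the controller host, network, and mounted paths are assumptions carried over from the manual Docker setup in the previous guide, and the guard makes this a no-op where Docker is not running:

```shell
# Sketch: push the schema and table config via pinot-admin AddTable.
# Host, network, and file paths are assumptions from the manual Docker setup.
if docker ps >/dev/null 2>&1; then
  docker run --rm -ti \
    --network pinot-demo \
    -v /tmp/pinot-quick-start:/tmp/pinot-quick-start \
    apachepinot/pinot:latest AddTable \
    -schemaFile /tmp/pinot-quick-start/transcript-schema.json \
    -tableConfigFile /tmp/pinot-quick-start/transcript-table-realtime.json \
    -controllerHost manual-pinot-controller \
    -controllerPort 9000 \
    -exec
fi
submitted=1
```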

Load sample data into the stream

Push the sample JSON into the Kafka topic:
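The page refers to sample JSON without inlining it; as a sketch, the snippet below writes a couple of made-up records (fields matching the example schema) and pipes them into the topic with the console producer, guarded so it is a no-op where Docker is not running:

```shell
# Sketch: two illustrative records, one JSON object per line (values are made up).
cat > transcript-data.json <<'EOF'
{"studentID":205,"firstName":"Natalie","lastName":"Jones","gender":"Female","subject":"Maths","score":3.8,"timestampInEpoch":1571900400000}
{"studentID":205,"firstName":"Natalie","lastName":"Jones","gender":"Female","subject":"History","score":3.5,"timestampInEpoch":1571900400000}
EOF
# Pipe the records into the topic (skipped where Docker is unavailable).
if docker ps >/dev/null 2>&1; then
  docker exec -i kafka /opt/kafka/bin/kafka-console-producer.sh \
    --broker-list kafka:9092 \
    --topic transcript-topic < transcript-data.json
fi
```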

Query your data

As data flows into the stream, Pinot consumes it and makes it available for querying. Open the Query Console to examine the real-time data.
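If you prefer to check from the command line, one option (a sketch; the broker host and port 8099 are assumptions about how the earlier guide exposed the broker) is to POST SQL to the broker's /query/sql endpoint:

```shell
# Sketch: query the REALTIME table over HTTP; broker host/port are assumptions.
# Guarded so this is a no-op where the Docker cluster is not running.
if docker ps >/dev/null 2>&1; then
  curl -s -X POST \
    -H "Content-Type: application/json" \
    -d '{"sql":"SELECT * FROM transcript LIMIT 10"}' \
    http://localhost:8099/query/sql
fi
queried=1
```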
