Running Pinot in Docker

This quick start guide will show you how to run a Pinot cluster using Docker.


Last updated 3 years ago


This quickstart shows how to start an example recipe in a standalone instance and is meant for learning. To run Pinot in cluster mode, see Manual cluster setup.

Prerequisites

Install Docker.

You can also try the Kubernetes quick start if you already have a local minikube or Docker Kubernetes cluster installed.

If you are running locally, make sure Docker has enough resources allocated for the cluster.

We'll be using our Docker image apachepinot/pinot:latest to run this quick start, which does the following:

  • Sets up the Pinot cluster

  • Creates a sample table and loads sample data

The following quick-start scripts are available:

  • Batch example

  • Streaming example

  • Hybrid example

Before running the scripts, create an isolated bridge network named pinot-demo in Docker. This allows all Docker containers attached to it to communicate with each other. Create the network using the following command:

docker network create -d bridge pinot-demo
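Running the command a second time fails because the network already exists. A minimal sketch of an idempotent wrapper is shown here; the function name ensure_network is hypothetical and not part of Pinot or Docker, and it only uses docker network inspect and docker network create.

```shell
# Hypothetical helper: create a bridge network only if it is missing.
ensure_network() {
    name="$1"
    # "docker network inspect" exits non-zero when the network does not exist.
    if docker network inspect "$name" >/dev/null 2>&1; then
        echo "network $name already exists"
    else
        docker network create -d bridge "$name" >/dev/null \
            && echo "network $name created"
    fi
}
```

Call it as ensure_network pinot-demo before starting any of the examples.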

Batch example

This example demonstrates batch processing with Pinot. The quick-start script does the following:

  • Starts Pinot deployment by starting

    • Apache Zookeeper

    • Pinot Controller

    • Pinot Broker

    • Pinot Server

  • Creates a demo table

    • baseballStats

  • Launches a standalone data ingestion job

    • Builds one Pinot segment for a given CSV data file for table baseballStats

    • Pushes the built segment to the Pinot controller

  • Issues sample queries to Pinot

docker run \
    --network=pinot-demo \
    --name pinot-quickstart \
    -p 9000:9000 \
    -d apachepinot/pinot:latest QuickStart \
    -type batch

Once the Docker container is running, you can view the logs by running the following command.

docker logs pinot-quickstart -f

That's it! We've spun up a Pinot cluster.

It may take a while for all the Pinot components to start and for the sample data to be loaded. Your cluster is ready once the container logs show the cluster setup completion messages followed by the sample queries.
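Instead of watching the logs, you could poll the controller until it responds. The sketch below assumes the controller exposes a /health endpoint on the published port 9000; the function name wait_for_controller and the PINOT_POLL_SECONDS variable are hypothetical. A healthy controller does not guarantee that the sample data has finished loading, so the logs remain the authoritative signal.

```shell
# Hypothetical helper: poll the controller health URL until it answers.
# Assumes GET /health on the controller port published with -p 9000:9000.
wait_for_controller() {
    url="${1:-http://localhost:9000/health}"
    attempts="${2:-60}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f makes curl return non-zero on HTTP errors, -s silences progress.
        if curl -sf "$url" >/dev/null 2>&1; then
            echo "controller is up"
            return 0
        fi
        i=$((i + 1))
        sleep "${PINOT_POLL_SECONDS:-5}"
    done
    echo "controller did not respond after $attempts attempts" >&2
    return 1
}
```

With the defaults, wait_for_controller retries every 5 seconds for up to 60 attempts before giving up.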

Streaming example

This example demonstrates stream processing with Pinot. The quick-start script does the following:

  • Starts Pinot deployment by starting

    • Apache Kafka

    • Apache Zookeeper

    • Pinot Controller

    • Pinot Broker

    • Pinot Server

  • Creates a demo table

    • meetupRsvp

  • Launches a meetup stream

  • Publishes data to a Kafka topic meetupRSVPEvents to be subscribed to by Pinot

  • Issues sample queries to Pinot

# Stop and remove any previous pinot-quickstart container first, or use a different container name and host port
docker run \
    --network=pinot-demo \
    --name pinot-quickstart \
    -p 9000:9000 \
    -d apachepinot/pinot:latest QuickStart \
    -type stream
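Every example reuses the container name pinot-quickstart and host port 9000, so a container left over from an earlier run has to be removed before the next docker run. A minimal sketch of that cleanup follows; the function name remove_quickstart is hypothetical.

```shell
# Hypothetical helper: remove a leftover quick-start container, if present.
remove_quickstart() {
    name="${1:-pinot-quickstart}"
    # List all container names (running or stopped) and look for an exact match.
    if docker ps -a --format '{{.Names}}' | grep -Fqx "$name"; then
        # -f stops the container first if it is still running.
        docker rm -f "$name" >/dev/null && echo "removed $name"
    else
        echo "no container named $name"
    fi
}
```

Run remove_quickstart between examples; the pinot-demo network can stay in place and be reused.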

Hybrid example

This example demonstrates hybrid stream and batch processing with Pinot. The quick-start script does the following:

  1. Starts Pinot deployment by starting

    • Apache Kafka

    • Apache Zookeeper

    • Pinot Controller

    • Pinot Broker

    • Pinot Server

  2. Creates a demo table

    • airlineStats

  3. Launches a standalone data ingestion job

    • Builds Pinot segments under a given directory of Avro files for table airlineStats

    • Pushes built segments to Pinot controller

  4. Launches a stream of flights stats

  5. Publishes data to a Kafka topic airlineStatsEvents to be subscribed to by Pinot

  6. Issues sample queries to Pinot

# Stop and remove any previous pinot-quickstart container first, or use a different container name and host port
docker run \
    --network=pinot-demo \
    --name pinot-quickstart \
    -p 9000:9000 \
    -d apachepinot/pinot:latest QuickStart \
    -type hybrid

Once the cluster is up, you can head over to Exploring Pinot to check out the data in the baseballStats table (batch example), the meetupRsvp table (streaming example), or the airlineStats table (hybrid example).
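The tables can also be queried from the command line. The sketch below assumes the controller accepts SQL as a JSON body on POST /sql at the published port 9000; the function name pinot_query is hypothetical, and the query text must not contain double quotes or backslashes because no JSON escaping is done.

```shell
# Hypothetical helper: send a SQL query to the controller and print the reply.
# Assumes POST /sql on the controller with a {"sql": "..."} JSON payload.
pinot_query() {
    sql="$1"
    controller="${2:-http://localhost:9000}"
    # No JSON escaping: keep the query free of double quotes and backslashes.
    curl -s -X POST "$controller/sql" \
        -H "Content-Type: application/json" \
        -d "{\"sql\": \"$sql\"}"
}
```

For example, pinot_query "select count(*) from baseballStats limit 1" would query the table created by the batch example.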
