Steps for setting up a Pinot cluster and a realtime table that consumes from the GitHub events stream.
In this recipe, we will:

1. Set up a Pinot cluster, in the steps:
   a. Start Zookeeper
   b. Start Controller
   c. Start Broker
   d. Start Server
2. Set up a Kafka cluster
3. Create a Kafka topic - pullRequestMergedEvents
4. Create a realtime table - pullRequestMergedEvents and a schema
5. Start a task which reads from the GitHub events API and publishes events about merged pull requests to the topic
6. Query the realtime data
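The cluster-setup steps above (steps 1-3) can be sketched with the `pinot-admin` and Kafka CLI tools. This is a sketch for a local quickstart; the ports and the Kafka addresses are assumptions, not fixed values:

```shell
# Start each Pinot component in its own terminal (ports are assumed defaults).
bin/pinot-admin.sh StartZookeeper -zkPort 2181
bin/pinot-admin.sh StartController -zkAddress localhost:2181 -controllerPort 9000
bin/pinot-admin.sh StartBroker -zkAddress localhost:2181
bin/pinot-admin.sh StartServer -zkAddress localhost:2181

# Start a local Kafka broker registered under the same Zookeeper,
# then create the topic the recipe consumes from.
bin/pinot-admin.sh StartKafka -zkAddress localhost:2181/kafka -port 19092
bin/kafka-topics.sh --create --bootstrap-server localhost:19092 \
  --topic pullRequestMergedEvents --partitions 1 --replication-factor 1
```

Each component should be left running; the order matters because the controller, broker, and server all register themselves in Zookeeper.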
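Steps 4 and 5 can be sketched as below. The config and schema file paths are hypothetical placeholders for the files you create, and the GitHub personal access token is something you must supply yourself:

```shell
# Register the schema and realtime table config with the controller
# (the file paths here are placeholders for your own config files).
bin/pinot-admin.sh AddTable \
  -tableConfigFile /path/to/pullRequestMergedEvents_realtime_table_config.json \
  -schemaFile /path/to/pullRequestMergedEvents_schema.json \
  -exec

# Start the task that polls the GitHub events API and publishes
# merged-pull-request events to the Kafka topic.
bin/pinot-admin.sh StreamGitHubEvents \
  -topic pullRequestMergedEvents \
  -personalAccessToken <your-github-token> \
  -kafkaBrokerList localhost:19092 \
  -schemaFile /path/to/pullRequestMergedEvents_schema.json
```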
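For step 6, queries can be issued from the Query Console or, as a sketch, directly against the broker's SQL endpoint over HTTP (the broker port here is an assumption):

```shell
# Count the events ingested so far (assumes the broker listens on port 8000).
curl -s -X POST -H "Content-Type: application/json" \
  -d '{"sql": "SELECT count(*) FROM pullRequestMergedEvents"}' \
  http://localhost:8000/query/sql
```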
If you already have a Kubernetes cluster with Pinot and Kafka (see Running Pinot in Kubernetes), first create the topic and then set up the table and streaming.
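As a sketch, the Kubernetes path might look like the following; the manifest file name and namespace are hypothetical placeholders for illustration, not files shipped with Pinot:

```shell
# Apply a manifest that creates the table and starts the streaming job
# (the file name and namespace are hypothetical placeholders).
kubectl apply -f github-events-table-and-stream.yml -n pinot-quickstart
```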
Head over to the Query Console to check out the data!
You can use SuperSet to visualize this data. Some of the interesting insights we captured were:

- Repositories by number of commits in the Apache organization
To integrate with SuperSet, check out the SuperSet Integrations page.