This quickstart assumes that you already have a running Kubernetes cluster. Please follow the links below to set up a Kubernetes cluster.
(Make sure to run with enough resources, e.g. minikube start --vm=true --cpus=4 --memory=8g --disk-size=50g)
2. Setting up a Pinot cluster in Kubernetes
Before continuing, please make sure that you've downloaded Apache Pinot. The scripts for the setup in this guide can be found in our open source project on GitHub, in the Pinot source at ./pinot/kubernetes/helm
# checkout pinot
git clone https://github.com/apache/pinot.git
cd pinot/kubernetes/helm
2.1 Start Pinot with Helm
The Pinot repo has pre-packaged Helm charts for Pinot and Presto. Helm Repo index file is .
NOTE: Please specify the StorageClass based on your cloud vendor. For Pinot Server, do not mount a blob store such as AzureFile, GoogleCloudStorage, or S3 as the data-serving file system.
Use only block-storage disks such as Amazon EBS, GCP Persistent Disk, or Azure Disk.
For AWS: "gp2"
For GCP: "pd-ssd" or "standard"
For Azure: "AzureDisk"
For Docker-Desktop: "hostpath"
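The list above can be captured in a small shell helper so the value does not have to be looked up by hand. This is only a sketch: the function name storage_class_for and the platform keys (aws, gcp, azure, docker-desktop) are made up for illustration, and "pd-ssd" is picked over "standard" for GCP arbitrarily.

```shell
# Hypothetical helper: map a platform key to the StorageClass value listed above.
storage_class_for() {
  case "$1" in
    aws)            echo "gp2" ;;
    gcp)            echo "pd-ssd" ;;     # "standard" is the other listed option
    azure)          echo "AzureDisk" ;;
    docker-desktop) echo "hostpath" ;;
    *)              echo "unknown platform: $1" >&2; return 1 ;;
  esac
}

# Example: print the value to use on AWS.
storage_class_for aws
```

Running kubectl get sc on your cluster remains the reliable way to confirm which StorageClass names actually exist.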
2.1.1 Update helm dependency
helm dependency update
2.1.2 Start Pinot with Helm
For Helm v2.12.1
If your Kubernetes cluster is recently provisioned, ensure Helm is initialized by running:
helm init --service-account tiller
Then deploy a new HA Pinot cluster using the following command:
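A minimal sketch of a Helm v2 install is given here under stated assumptions, not as the guide's confirmed command. The release name "pinot" and the namespace "pinot-quickstart" are taken from the error message quoted in this section, and the chart directory ./pinot is assumed to be relative to pinot/kubernetes/helm. The DRY_RUN switch and the deploy_pinot function are hypothetical additions so the command can be inspected before it is run.

```shell
# Assumed values (see the lead-in): adjust them to your environment.
RELEASE="pinot"
NAMESPACE="pinot-quickstart"
CHART_DIR="./pinot"

# Hypothetical wrapper around a Helm v2 install.
# Helm v2 takes the release name through the --name flag.
deploy_pinot() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Default: only print the command that would be executed.
    echo helm install --namespace "$NAMESPACE" --name "$RELEASE" "$CHART_DIR"
  else
    helm install --namespace "$NAMESPACE" --name "$RELEASE" "$CHART_DIR"
  fi
}

deploy_pinot
```

With DRY_RUN left unset the script only prints the command; setting DRY_RUN=0 runs the install against the cluster.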
If you encounter a permission error such as the following, run the command under Resolution:
Error: release pinot failed: namespaces "pinot-quickstart" is forbidden: User "system:serviceaccount:kube-system:default" cannot get resource "namespaces" in API group "" in the namespace "pinot-quickstart"
Resolution:
kubectl apply -f helm-rbac.yaml
2.2 Check Pinot deployment status
kubectl get all -n pinot-quickstart
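When the output is long, it can help to filter it down to the pods that are not yet up. The helper below is a sketch: pods_not_running is a hypothetical name, and it assumes the default column layout of kubectl get pods --no-headers (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Hypothetical helper: read `kubectl get pods --no-headers` output on stdin
# and print the name of every pod whose STATUS column is not "Running".
pods_not_running() {
  awk '$3 != "Running" { print $1 }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get pods -n pinot-quickstart --no-headers | pods_not_running
```

An empty result means every pod reports Running. Pods that finished normally (status Completed) are also listed, so read the result with that in mind.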
3. Load data into Pinot using Kafka
3.1 Bring up a Kafka cluster for real-time data ingestion
Please use the script below to perform local port-forwarding; it will also open the Pinot query console in your default web browser.
This script can be found in the Pinot source at ./pinot/kubernetes/helm/pinot
./query-pinot-data.sh
5. Using Superset to query Pinot
5.1 Bring up Superset
Open the superset.yaml file, go to the line showing storageClass, and change its value based on your cloud vendor. Running kubectl get sc lists the StorageClass values available in your Kubernetes system. E.g.
For AWS: "gp2"
For GCP: "pd-ssd" or "standard"
For Azure: "AzureDisk"
For Docker-Desktop: "hostpath"
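The edit can also be scripted. The sketch below assumes the value sits on the same line as the key (for example storageClass: "gp2"), which may not match the actual layout of superset.yaml; the function name set_storage_class is hypothetical.

```shell
# Hypothetical helper: rewrite the value of every "storageClass:" line in a file.
# Assumes the key and its value are on one line, e.g.  storageClass: "gp2"
set_storage_class() {
  file="$1"
  class="$2"
  # Keep the indentation and key, replace whatever follows with the quoted value.
  sed -E "s|^([[:space:]]*storageClass:[[:space:]]*).*$|\1\"$class\"|" "$file" > "$file.tmp" &&
    mv "$file.tmp" "$file"
}

# Example: inspect the current lines, then switch to the Docker Desktop value.
#   grep -n "storageClass" superset.yaml
#   set_storage_class superset.yaml hostpath
```

Checking the result with grep -n "storageClass" superset.yaml before applying the file is a cheap safeguard.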
Then run:
kubectl apply -f superset.yaml
Ensure your cluster is up by running:
kubectl get all -n pinot-quickstart | grep superset
The above command adds the Trino Helm chart repo. You can then run the command below to see the charts.
helm search repo trino
To connect Trino to Pinot, we need to add a Pinot catalog, which requires extra configuration. You can run the command below to get all the configurable values.
The above command deploys Presto with default configs. To customize your deployment, you can run the command below to get all the configurable values.