LogoLogo
release-0.9.0
release-0.9.0
  • Introduction
  • Basics
    • Concepts
    • Architecture
    • Components
      • Cluster
      • Controller
      • Broker
      • Server
      • Minion
      • Tenant
      • Schema
      • Table
      • Segment
      • Pinot Data Explorer
    • Getting Started
      • Running Pinot locally
      • Running Pinot in Docker
      • Running Pinot in Kubernetes
      • Public cloud examples
        • Running on Azure
        • Running on GCP
        • Running on AWS
      • Hdfs as Deep Storage
      • Manual cluster setup
      • Batch import example
      • Stream ingestion example
      • Troubleshooting Pinot
      • Frequently Asked Questions (FAQs)
        • General
        • Pinot On Kubernetes FAQ
        • Ingestion FAQ
        • Query FAQ
        • Operations FAQ
    • Import Data
      • Batch Ingestion
        • Spark
        • Hadoop
        • Backfill Data
        • Dimension Table
      • Stream ingestion
        • Apache Kafka
        • Amazon Kinesis
      • Stream Ingestion with Upsert
      • File systems
        • Amazon S3
        • Azure Data Lake Storage
        • HDFS
        • Google Cloud Storage
      • Input formats
      • Complex Type (Array, Map) Handling
    • Indexing
      • Forward Index
      • Inverted Index
      • Star-Tree Index
      • Bloom Filter
      • Range Index
      • Text search support
      • JSON Index
      • Geospatial
    • Releases
      • 0.9.0
      • 0.8.0
      • 0.7.1
      • 0.6.0
      • 0.5.0
      • 0.4.0
      • 0.3.0
      • 0.2.0
      • 0.1.0
    • Recipes
      • GitHub Events Stream
  • For Users
    • Query
      • Querying Pinot
      • Filtering with IdSet
      • Supported Transformations
      • Supported Aggregations
      • User-Defined Functions (UDFs)
      • Cardinality Estimation
      • Lookup UDF Join
      • Querying JSON data
    • APIs
      • Broker Query API
        • Query Response Format
      • Controller Admin API
    • External Clients
      • JDBC
      • Java
      • Python
      • Golang
    • Tutorials
      • Use OSS as Deep Storage for Pinot
      • Ingest Parquet Files from S3 Using Spark
      • Creating Pinot Segments
      • Use S3 as Deep Storage for Pinot
      • Use S3 and Pinot in Docker
      • Batch Data Ingestion In Practice
      • Schema Evolution
  • For Developers
    • Basics
      • Extending Pinot
        • Writing Custom Aggregation Function
        • Segment Fetchers
      • Contribution Guidelines
      • Code Setup
      • Code Modules and Organization
      • Update Documentation
    • Advanced
      • Data Ingestion Overview
      • Ingestion Transformations
      • Null Value Support
      • Advanced Pinot Setup
    • Plugins
      • Write Custom Plugins
        • Input Format Plugin
        • Filesystem Plugin
        • Batch Segment Fetcher Plugin
        • Stream Ingestion Plugin
    • Design Documents
      • Segment Writer API
  • For Operators
    • Deployment and Monitoring
      • Setup cluster
      • Setup table
      • Setup ingestion
      • Decoupling Controller from the Data Path
      • Segment Assignment
      • Instance Assignment
      • Rebalance
        • Rebalance Servers
        • Rebalance Brokers
      • Tiered Storage
      • Pinot managed Offline flows
      • Minion merge rollup task
      • Access Control
      • Monitoring
      • Tuning
        • Realtime
        • Routing
      • Upgrading Pinot with confidence
    • Command-Line Interface (CLI)
    • Configuration Recommendation Engine
    • Tutorials
      • Authentication, Authorization, and ACLs
      • Configuring TLS/SSL
      • Build Docker Images
      • Running Pinot in Production
      • Kubernetes Deployment
      • Amazon EKS (Kafka)
      • Amazon MSK (Kafka)
      • Monitor Pinot using Prometheus and Grafana
  • Configuration Reference
    • Cluster
    • Controller
    • Broker
    • Server
    • Table
    • Schema
    • Ingestion Job Spec
  • RESOURCES
    • Community
    • Team
    • Blogs
    • Presentations
    • Videos
  • Integrations
    • Tableau
    • Trino
    • ThirdEye
    • Superset
    • Presto
Powered by GitBook
On this page

Was this helpful?

Export as PDF
  1. Integrations

Trino

Integrate with Trino for ad-hoc queries with Full SQL

Start running Trino Image with pre-built Trino Pinot connector.

1. Install Pinot

Run below command to install Pinot using HelmCharts.

helm repo add pinot https://raw.githubusercontent.com/apache/pinot/master/kubernetes/helm
kubectl create ns pinot-quickstart
helm install pinot pinot/pinot -n pinot-quickstart --set cluster.name=pinot

You can create an example table by running the command below, if you are under pinot binary directory

Before PR https://github.com/trinodb/trino/pull/7630 got merged, Change the Pinot table name to all lower cases for table conf/schema and ingestion job spec

bin/pinot-admin.sh BootstrapTable -dir examples/batch/airlineStats

2. Install Trino

Run below command to add Trino Helm repo and fetch default helm config file:

helm repo add trino https://trinodb.github.io/charts/
helm inspect values trino/trino > /tmp/trino.yaml

Edit /tmp/trino.yaml file to add your pinot catalog:

Here, since pinot is deployed at namespace pinot-quickstart, so the controller url is pinot-controller.pinot-quickstart:9000

additionalCatalogs:
  pinot: |
    connector.name=pinot
    pinot.controller-urls=pinot-controller.pinot-quickstart:9000

Install Trino HelmCharts with config file:

helm create ns trino-quickstart
helm install my-trino trino/trino --version 0.2.0 --values /tmp/trino.yaml -n trino-quickstart 

3. Connecting to Trino

Download Trino Client.

curl -L https://repo1.maven.org/maven2/io/trino/trino-cli/363/trino-cli-363-executable.jar -o /tmp/trino && chmod +x /tmp/trino

Port forward Trino service to your local if it's not already exposed.

export POD_NAME=$(kubectl get pods --namespace trino-quickstart -l "app=trino,release=my-trino,component=coordinator" -o jsonpath="{.items[0].metadata.name}")
echo "Visit http://127.0.0.1:8080 to use your application"
kubectl port-forward $POD_NAME 8080:8080 --namespace trino-quickstart

Use Trino console client to connect to Trino service

/tmp/trino --server localhost:8080 --catalog pinot --schema default

Then write your own queries:

trino:default> show tables;
     Table
---------------
 airlinestats
 baseballstats
(2 rows)

Query 20211022_214646_00003_gmppt, FINISHED, 3 nodes
Splits: 36 total, 36 done (100.00%)
0.44 [2 rows, 59B] [4 rows/s, 133B/s]
trino:default> select count(*) as flights_from_ca_to_ny from airlinestats where originstate='CA' and deststate='NY';
 flights_from_ca_to_ny
-----------------------
                    68
(1 row)

Query 20211022_215024_00007_gmppt, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
2.14 [1 rows, 9B] [0 rows/s, 4B/s]
trino:default> select * from airlinestats limit 1;
 flightnum | origin | quarter | lateaircraftdelay | divactualelapsedtime | divwheelsons | divwheelsoffs |   airtime   |  arrdel15   | divtotalgtimes | deptimeblk | destcitymarketid |  divairportseqids  | dayssinceepoch | deptime | month | crselapsedtime | deststatename | carrier | destairportid | distance | arrtimeblk | divarrdelay | securitydelay | longestaddgtime | originwac | wheelsoff | destairportseqid | uniquecarrier | divreacheddest | diverted | actualelapsedtime | airlineid
-----------+--------+---------+-------------------+----------------------+--------------+---------------+-------------+-------------+----------------+------------+------------------+--------------------+----------------+---------+-------+----------------+---------------+---------+---------------+----------+------------+-------------+---------------+-----------------+-----------+-----------+------------------+---------------+----------------+----------+-------------------+-----------
        24 | SFO    |       1 |       -2147483648 |                 1968 | [1339, 2310] | [1908, 1758]  | -2147483648 | -2147483648 | [121, 50]      | 0700-0759  |            31703 | [1319801, 1072102] |          16075 |     723 |     1 |            335 | New York      | AA      |         12478 |     2586 | 1500-1559  |        1651 |   -2147483648 |     -2147483648 |        91 |       737 |          1247802 | AA            |              1 |        1 |       -2147483648 |     19805
(1 row)

Query 20211022_215050_00008_gmppt, FINISHED, 2 nodes
Splits: 48 total, 19 done (39.58%)
1.90 [2 rows, 1.65KB] [1 rows/s, 887B/s]

Meanwhile you can access Trino Cluster UI to see query stats.

(Disclaimer: Trino is a third-party software that is not part of the Apache Software Foundation).

PreviousTableauNextThirdEye

Last updated 3 years ago

Was this helpful?

Trino Cluster UI