LogoLogo
release-0.10.0
release-0.10.0
  • Introduction
  • Basics
    • Concepts
    • Architecture
    • Components
      • Cluster
      • Controller
      • Broker
      • Server
      • Minion
      • Tenant
      • Schema
      • Table
      • Segment
      • Deep Store
      • Pinot Data Explorer
    • Getting Started
      • Running Pinot locally
      • Running Pinot in Docker
      • Quick Start Examples
      • Running in Kubernetes
      • Running on public clouds
        • Running on Azure
        • Running on GCP
        • Running on AWS
      • Batch import example
      • Stream ingestion example
      • HDFS as Deep Storage
      • Troubleshooting Pinot
      • Frequently Asked Questions (FAQs)
        • General
        • Pinot On Kubernetes FAQ
        • Ingestion FAQ
        • Query FAQ
        • Operations FAQ
    • Import Data
      • Batch Ingestion
        • Spark
        • Hadoop
        • Backfill Data
        • Dimension Table
      • Stream ingestion
        • Apache Kafka
        • Amazon Kinesis
        • Apache Pulsar
      • Stream Ingestion with Upsert
      • File Systems
        • Amazon S3
        • Azure Data Lake Storage
        • HDFS
        • Google Cloud Storage
      • Input formats
      • Complex Type (Array, Map) Handling
    • Indexing
      • Forward Index
      • Inverted Index
      • Star-Tree Index
      • Bloom Filter
      • Range Index
      • Text search support
      • JSON Index
      • Geospatial
      • Timestamp Index
    • Releases
      • 0.10.0
      • 0.9.3
      • 0.9.2
      • 0.9.1
      • 0.9.0
      • 0.8.0
      • 0.7.1
      • 0.6.0
      • 0.5.0
      • 0.4.0
      • 0.3.0
      • 0.2.0
      • 0.1.0
    • Recipes
      • GitHub Events Stream
  • For Users
    • Query
      • Querying Pinot
      • Filtering with IdSet
      • Transformation Functions
      • Aggregation Functions
      • User-Defined Functions (UDFs)
      • Cardinality Estimation
      • Lookup UDF Join
      • Querying JSON data
      • Explain Plan
      • Grouping Algorithm
    • APIs
      • Broker Query API
        • Query Response Format
      • Controller Admin API
    • External Clients
      • JDBC
      • Java
      • Python
      • Golang
    • Tutorials
      • Use OSS as Deep Storage for Pinot
      • Ingest Parquet Files from S3 Using Spark
      • Creating Pinot Segments
      • Use S3 as Deep Storage for Pinot
      • Use S3 and Pinot in Docker
      • Batch Data Ingestion In Practice
      • Schema Evolution
  • For Developers
    • Basics
      • Extending Pinot
        • Writing Custom Aggregation Function
        • Segment Fetchers
      • Contribution Guidelines
      • Code Setup
      • Code Modules and Organization
      • Update Documentation
    • Advanced
      • Data Ingestion Overview
      • Ingestion Transformations
      • Null Value Support
      • Advanced Pinot Setup
    • Plugins
      • Write Custom Plugins
        • Input Format Plugin
        • Filesystem Plugin
        • Batch Segment Fetcher Plugin
        • Stream Ingestion Plugin
    • Design Documents
      • Segment Writer API
  • For Operators
    • Deployment and Monitoring
      • Setup cluster
      • Setup table
      • Setup ingestion
      • Decoupling Controller from the Data Path
      • Segment Assignment
      • Instance Assignment
      • Rebalance
        • Rebalance Servers
        • Rebalance Brokers
      • Tiered Storage
      • Pinot managed Offline flows
      • Minion merge rollup task
      • Access Control
      • Monitoring
      • Tuning
        • Realtime
        • Routing
      • Upgrading Pinot with confidence
    • Command-Line Interface (CLI)
    • Configuration Recommendation Engine
    • Tutorials
      • Authentication, Authorization, and ACLs
      • Configuring TLS/SSL
      • Build Docker Images
      • Running Pinot in Production
      • Kubernetes Deployment
      • Amazon EKS (Kafka)
      • Amazon MSK (Kafka)
      • Monitor Pinot using Prometheus and Grafana
  • Configuration Reference
    • Cluster
    • Controller
    • Broker
    • Server
    • Table
    • Schema
    • Ingestion Job Spec
    • Functions
      • ABS
      • ADD
      • arrayConcatInt
      • arrayConcatString
      • arrayContainsInt
      • arrayContainsString
      • arrayDistinctString
      • arrayDistinctInt
      • arrayIndexOfInt
      • arrayIndexOfString
      • ARRAYLENGTH
      • arrayRemoveInt
      • arrayRemoveString
      • arrayReverseInt
      • arrayReverseString
      • arraySliceInt
      • arraySliceString
      • arraySortInt
      • arraySortString
      • arrayUnionInt
      • arrayUnionString
      • AVGMV
      • ceil
      • CHR
      • codepoint
      • concat
      • count
      • COUNTMV
      • day
      • dayOfWeek
      • dayOfYear
      • DISTINCT
      • DISTINCTCOUNT
      • DISTINCTCOUNTBITMAP
      • DISTINCTCOUNTBITMAPMV
      • DISTINCTCOUNTHLL
      • DISTINCTCOUNTHLLMV
      • DISTINCTCOUNTMV
      • DISTINCTCOUNTRAWHLL
      • DISTINCTCOUNTRAWHLLMV
      • DISTINCTCOUNTRAWTHETASKETCH
      • DISTINCTCOUNTTHETASKETCH
      • DIV
      • DATETIMECONVERT
      • DATETRUNC
      • exp
      • FLOOR
      • FromDateTime
      • FromEpoch
      • FromEpochBucket
      • hour
      • JSONFORMAT
      • JSONPATH
      • JSONPATHARRAY
      • JSONPATHARRAYDEFAULTEMPTY
      • JSONPATHDOUBLE
      • JSONPATHLONG
      • JSONPATHSTRING
      • jsonextractkey
      • jsonextractscalar
      • length
      • ln
      • lower
      • lpad
      • ltrim
      • max
      • MAXMV
      • MD5
      • millisecond
      • min
      • minmaxrange
      • MINMAXRANGEMV
      • MINMV
      • minute
      • MOD
      • mode
      • month
      • mult
      • now
      • percentile
      • percentileest
      • percentileestmv
      • percentilemv
      • percentiletdigest
      • percentiletdigestmv
      • quarter
      • regexpExtract
      • remove
      • replace
      • reverse
      • round
      • rpad
      • rtrim
      • second
      • SEGMENTPARTITIONEDDISTINCTCOUNT
      • sha
      • sha256
      • sha512
      • sqrt
      • startswith
      • ST_AsBinary
      • ST_AsText
      • ST_Contains
      • ST_Distance
      • ST_GeogFromText
      • ST_GeogFromWKB
      • ST_GeometryType
      • ST_GeomFromText
      • ST_GeomFromWKB
      • STPOINT
      • ST_Polygon
      • strpos
      • ST_Union
      • SUB
      • substr
      • sum
      • summv
      • TIMECONVERT
      • timezoneHour
      • timezoneMinute
      • ToDateTime
      • ToEpoch
      • ToEpochBucket
      • ToEpochRounded
      • TOJSONMAPSTR
      • toGeometry
      • toSphericalGeography
      • trim
      • upper
      • VALUEIN
      • week
      • year
      • yearOfWeek
  • RESOURCES
    • Community
    • Team
    • Blogs
    • Presentations
    • Videos
  • Integrations
    • Tableau
    • Trino
    • ThirdEye
    • Superset
    • Presto
Powered by GitBook
On this page
  • Math Functions
  • String Functions
  • DateTime Functions
  • JSON Functions
  • Transform Functions
  • Scalar Functions
  • Binary Functions
  • Multi-value Column Functions
  • Advanced Queries
  • Geospatial Queries
  • Text Queries

Was this helpful?

Export as PDF
  1. For Users
  2. Query

Transformation Functions

This document contains the list of all the transformation functions supported by Pinot SQL.

Math Functions

Function

String Functions

Multiple string functions are supported out of the box from release-0.5.0 .

Function

DateTime Functions

Date time functions allow you to perform transformations on columns that contain timestamps or dates.

Function

JSON Functions

Transform Functions

These functions can only be used in Pinot SQL queries.

Function

Scalar Functions

These functions can be used for column transformation in table ingestion configs.

Function

Binary Functions

Function

Multi-value Column Functions

All of the functions mentioned till now only support single value columns. You can use the following functions to do operations on multi-value columns.

Function

MAP_VALUE Select the value for a key from Map stored in Pinot. MAP_VALUE(mapColumn, 'myKey', valueColumn)

Advanced Queries

Geospatial Queries

Text Queries

PreviousFiltering with IdSetNextAggregation Functions

Last updated 3 years ago

Was this helpful?

Sum of at least two values

Difference between two values

Product of at least two values

Quotient of two values

Modulo of two values

Absolute of a value

Rounded up to the nearest integer.

Rounded down to the nearest integer.

Euler’s number(e) raised to the power of col.

Natural log of value i.e. ln(col1)

Square root of a value

(col) convert string to upper case

(col) convert string to lower case

(col) reverse the string

(col, startIndex, endIndex) Gets substring of the input string from start to endIndex. Index begins at 0. Set endIndex to -1 to calculate till end of the string

Concatenate two input strings using the seperator

trim spaces from both side of the string

trim spaces from left side of the string

trim spaces from right side of the string

calculate length of the string

Find Nth instance of find string in input. Returns 0 if input string is empty. Returns -1 if the Nth instance is not found or input string is null.

returns true if columns starts with prefix string.

replace all instances of find with replace in input

string padded from the right side with pad to reach final size

string padded from the left side with pad to reach final size

the Unicode codepoint of the first character of the string

the character corresponding to the Unicode codepoint

Extracts values that match the provided regular expression

removes all instances of search from string

Converts the value into another time unit. the column should be an epoch timestamp.

Converts the value into another date time format, and buckets time based on the given time granularity.

Converts the value into a specified output granularity seconds since UTC epoch that is bucketed on a unit in a specified timezone.

Convert epoch milliseconds to epoch <Time Unit>.

Convert epoch milliseconds to epoch <Time Unit>, round to nearest rounding bucket(Bucket size is defined in <Time Unit>).

Convert epoch milliseconds to epoch <Time Unit>, and divided by bucket size(Bucket size is defined in <Time Unit>).

Convert epoch <Time Unit> to epoch milliseconds.

Convert epoch <Bucket Size><Time Unit> to epoch milliseconds.

Convert epoch millis value to DateTime string represented by pattern.

Convert DateTime string represented by pattern to epoch millis.

Round the given time value to nearest bucket start value.

Return current time as epoch millis

Returns the hour of the time zone offset.

Returns the minute of the time zone offset.

Returns the year from the given epoch millis in UTC timezone.

Returns the year from the given epoch millis and timezone id.

Returns the year of the ISO week from the given epoch millis in UTC timezone. Alias yowis also supported.

Returns the year of the ISO week from the given epoch millis and timezone id. Alias yowis also supported.

Returns the quarter of the year from the given epoch millis in UTC timezone. The value ranges from 1 to 4.

Returns the quarter of the year from the given epoch millis and timezone id. The value ranges from 1 to 4.

Returns the month of the year from the given epoch millis in UTC timezone. The value ranges from 1 to 12.

Returns the month of the year from the given epoch millis and timezone id. The value ranges from 1 to 12.

Returns the ISO week of the year from the given epoch millis in UTC timezone. The value ranges from 1 to 53. Alias weekOfYear is also supported.

Returns the ISO week of the year from the given epoch millis and timezone id. The value ranges from 1 to 53. Alias weekOfYear is also supported.

Returns the day of the year from the given epoch millis in UTC timezone. The value ranges from 1 to 366. Alias doy is also supported.

Returns the day of the year from the given epoch millis and timezone id. The value ranges from 1 to 366. Alias doy is also supported.

Returns the day of the month from the given epoch millis in UTC timezone. The value ranges from 1 to 31. Alias dayOfMonth is also supported.

Returns the day of the month from the given epoch millis and timezone id. The value ranges from 1 to 31. Alias dayOfMonth is also supported.

Returns the day of the week from the given epoch millis in UTC timezone. The value ranges from 1(Monday) to 7(Sunday). Alias dow is also supported.

Returns the day of the week from the given epoch millis and timezone id. The value ranges from 1(Monday) to 7(Sunday). Alias dow is also supported.

Returns the hour of the day from the given epoch millis in UTC timezone. The value ranges from 0 to 23.

Returns the hour of the day from the given epoch millis and timezone id. The value ranges from 0 to 23.

Returns the minute of the hour from the given epoch millis in UTC timezone. The value ranges from 0 to 59.

Returns the minute of the hour from the given epoch millis and timezone id. The value ranges from 0 to 59.

Returns the second of the minute from the given epoch millis in UTC timezone. The value ranges from 0 to 59.

Returns the second of the minute from the given epoch millis and timezone id. The value ranges from 0 to 59.

Returns the millisecond of the second from the given epoch millis in UTC timezone. The value ranges from 0 to 999.

Returns the millisecond of the second from the given epoch millis and timezone id. The value ranges from 0 to 999.

Evaluates the 'jsonPath' on jsonField, returns the result as the type 'resultsType', use optional defaultValuefor null or parsing error.

Extracts all matched JSON field keys based on 'jsonPath' into a STRING_ARRAY.

Convert object to JSON String

Extracts the object value from jsonField based on 'jsonPath', the result type is inferred based on JSON value. Cannot be used in query because data type is not specified.

Extracts the Long value from jsonField based on 'jsonPath', use optional defaultValuefor null or parsing error.

Extracts the Double value from jsonField based on 'jsonPath', use optional defaultValuefor null or parsing error.

Extracts the String value from jsonField based on 'jsonPath', use optional defaultValuefor null or parsing error.

Extracts an array from jsonField based on 'jsonPath', the result type is inferred based on JSON value. Cannot be used in query because data type is not specified.

Extracts an array from jsonField based on 'jsonPath', the result type is inferred based on JSON value. Returns empty array for null or parsing error. Cannot be used in query because data type is not specified.

Return SHA-1 digest of binary column(bytes type) as hex string

Return SHA-256 digest of binary column(bytes type) as hex string

Return SHA-512 digest of binary column(bytes type) as hex string

Return MD5 digest of binary column(bytes type) as hex string

Returns the length of a multi-value

The transform function will filter the value from the multi-valued column with the given constant values. The VALUEIN transform function is especially useful when the same multi-valued column is both filtering column and grouping column.

Pinot supports Geospatial queries on columns containing text-based geographies. For more details on the queries and how to enable them, see .

Pinot supports pattern matching on text-based columns. Only the columns mentioned as text columns in table config can be queried using this method. For more details on how to enable pattern matching, see .

Geospatial
Text search support
ADD(col1, col2, col3...)
SUB(col1, col2)
MULT(col1, col2, col3...)
DIV(col1, col2)
MOD(col1, col2)
ABS(col1)
FLOOR(col1)
EXP(col1)
LN(col1)
SQRT(col1)
UPPER
LOWER
REVERSE
SUBSTR
CONCAT(col1, col2, seperator)
TRIM(col)
LTRIM(col)
RTRIM(col)
LENGTH(col)
STRPOS(col, find, N)
STARTSWITH(col, prefix)
REPLACE(col, find, substitute)
RPAD(col, size, pad)
LPAD(col, size, pad)
CODEPOINT(col)
CHR(codepoint)
regexpExtract(value, regexp)
remove(input, search)
TIMECONVERT(col, fromUnit, toUnit)
DATETIMECONVERT(columnName, inputFormat, outputFormat, outputGranularity)
DATETRUNC
ToEpoch<TIME_UNIT>(timeInMillis)
ToEpoch<TIME_UNIT>Rounded(timeInMillis, bucketSize)
ToEpoch<TIME_UNIT>Bucket(timeInMillis, bucketSize)
FromEpoch<TIME_UNIT>
(timeIn<Time_UNIT>)
FromEpoch<TIME_UNIT>Bucket(timeIn<Time_UNIT>, bucketSizeIn<Time_UNIT>)
ToDateTime(timeInMillis, pattern[, timezoneId])
FromDateTime(dateTimeString, pattern)
round(timeValue, bucketSize)
now()
timezoneHour(timeZoneId)
timezoneMinute(timeZoneId)
year(tsInMillis)
year(tsInMillis, timeZoneId)
yearOfWeek(tsInMillis)
yearOfWeek(tsInMillis, timeZoneId)
quarter(tsInMillis)
quarter(tsInMillis, timeZoneId)
month(tsInMillis)
month(tsInMillis, timeZoneId)
week(tsInMillis)
week(tsInMillis, timeZoneId)
dayOfYear(tsInMillis)
dayOfYear(tsInMillis, timeZoneId)
day(tsInMillis)
day(tsInMillis, timeZoneId)
dayOfWeek(tsInMillis)
dayOfWeek(tsInMillis, timeZoneId)
hour(tsInMillis)
hour(tsInMillis, timeZoneId)
minute(tsInMillis)
minute(tsInMillis, timeZoneId)
second(tsInMillis)
second(tsInMillis, timeZoneId)
millisecond(tsInMillis)
millisecond(tsInMillis, timeZoneId)
JSONEXTRACTSCALAR(jsonField, 'jsonPath', 'resultsType', [defaultValue])
JSONEXTRACTKEY
(jsonField, 'jsonPath')
TOJSONMAPSTR(map) Convert map to JSON String
JSONFORMAT(object)
JSONPATH(jsonField, 'jsonPath')
JSONPATHLONG(jsonField, 'jsonPath', [defaultValue])
JSONPATHDOUBLE(jsonField, 'jsonPath', [defaultValue])
JSONPATHSTRING(jsonField, 'jsonPath', [defaultValue])
JSONPATHARRAY(jsonField, 'jsonPath')
JSONPATHARRAYDEFAULTEMPTY(jsonField, 'jsonPath')
SHA(bytesCol)
SHA256(bytesCol)
SHA512(bytesCol)
MD5(bytesCol)
ARRAYLENGTH
VALUEIN
CEIL(col1)