Functions

This page contains reference documentation for functions in Apache Pinot.

ABS

This section contains reference documentation for the abs function.

Returns the absolute value of a number.

Signature

ABS(col1)

Usage Examples

select ABS(-12.1) AS value
from ignoreMe

value

select ABS(12.1) AS value
from ignoreMe

value

ADD

This section contains reference documentation for the ADD function.

Returns the sum of two or more values.

Signature

ADD(col1, col2, col3...)

Usage Examples

These examples are based on the Batch Quick Start.

select homeRuns, baseOnBalls, ADD(homeRuns, baseOnBalls) AS total
from baseballStats 
WHERE teamID = 'ML1' 
AND yearID = 1956 
AND playerName = 'Henry Louis'

homeRuns

baseOnBalls

total

ago

This section contains reference documentation for the ago function.

Return time as epoch millis before the given period (in ISO-8601 duration format).

Examples:

"PT20.345S" -- parses as "20.345 seconds"
"PT15M" -- parses as "15 minutes" (where a minute is 60 seconds)
"PT10H" -- parses as "10 hours" (where an hour is 3600 seconds)
"P2D" -- parses as "2 days" (where a day is 24 hours or 86400 seconds)
"P2DT3H4M" -- parses as "2 days, 3 hours and 4 minutes"
"PT-6H3M" -- parses as "-6 hours and +3 minutes"
"-PT6H3M" -- parses as "-6 hours and -3 minutes"
"-PT-6H+3M" -- parses as "+6 hours and -3 minutes"

Signature

ago()

Usage Examples

oneDayAgo

This function is typically used in a predicate to filter on timestamps for recent data, e.g. to keep only rows from the last day.
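As a rough model of the behavior described in this section, the following Python sketch parses a small subset of ISO-8601 durations (days, hours, minutes, seconds, each with an optional sign) and subtracts the result from a "now" value in epoch millis. The helper names parse_duration_millis and ago, the now_millis parameter, and the sample timestamp are assumptions made for illustration; this is not Pinot's implementation.

```python
import re
import time
from decimal import Decimal
from typing import Optional

# Subset of ISO-8601 durations: [sign]P[nD][T[nH][nM][n.nS]]
_DURATION = re.compile(
    r"^(?P<sign>[-+]?)P(?:(?P<d>[-+]?\d+)D)?"
    r"(?:T(?:(?P<h>[-+]?\d+)H)?(?:(?P<m>[-+]?\d+)M)?"
    r"(?:(?P<s>[-+]?\d+(?:\.\d+)?)S)?)?$"
)


def parse_duration_millis(period: str) -> int:
    """Convert a duration such as 'PT15M' or 'P2DT3H4M' to milliseconds."""
    match = _DURATION.match(period)
    # Reject strings that carry no components, e.g. 'P' or 'P2DT'.
    if match is None or period.endswith(("P", "T")):
        raise ValueError(f"unsupported duration: {period!r}")
    days = int(match["d"] or 0)
    hours = int(match["h"] or 0)
    minutes = int(match["m"] or 0)
    seconds = Decimal(match["s"] or "0")  # Decimal keeps 20.345 exact
    whole_minutes = (days * 24 + hours) * 60 + minutes
    millis = whole_minutes * 60_000 + int(seconds * 1000)
    # A leading sign negates the whole duration.
    return -millis if match["sign"] == "-" else millis


def ago(period: str, now_millis: Optional[int] = None) -> int:
    """Return the epoch millis that lie `period` before `now_millis`."""
    if now_millis is None:
        now_millis = int(time.time() * 1000)
    return now_millis - parse_duration_millis(period)


# Illustrative fixed clock so the result is reproducible.
print(ago("P1D", now_millis=1_700_000_000_000))  # 1699913600000
```

Passing the clock in explicitly keeps the sketch deterministic; in a query, the current time is supplied by the server.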

ARG_MIN / ARG_MAX

This section contains reference documentation for the ARG_MIN and ARG_MAX functions.

This function scans the given dataset to identify the maximum and minimum values in the specified measuring columns. Once these extreme values (the maxima and minima) are found, the function locates the corresponding entries in the projection column. These entries are associated with the rows where the extreme values were found in the measuring columns. The function then returns these projection column values, providing a way to link the extreme measurements with their corresponding data in another part of the dataset.

Signature

ARG_MIN (measuringCol1, measuringCol2, measuringCol3, projectionCol)
ARG_MAX (measuringCol1, measuringCol2, measuringCol3, projectionCol)

Usage Examples

Find the user with maximum activity. If several users tie, break the tie on last_activity_date; if they still tie, break it on user_id. The projected column is user_id.

SELECT ARG_MAX(activity, last_activity_date, user_id, user_id)
FROM userEngagmentTable

More usefully, several such aggregation functions can be combined with GROUP BY:

SELECT user_region, ARG_MAX(activity, last_activity_date, user_id, user_id),
    ARG_MIN(user_satisfaction, user_id)
FROM userEngagmentTable
GROUP BY user_region

Note:

In cases where multiple rows share the same extreme values in the measuring columns, all such rows will be returned by the function.
If the goal is to project multiple different columns that correspond to the same set of measuring columns, you can achieve this by invoking the function multiple times, each time specifying a different projection column.
This implementation does not work with an AS clause (e.g. SELECT argmin(longCol, doubleCol) AS argmin won't work).
Putting an argmin/argmax column inside an ORDER BY clause (e.g. SELECT intCol, argmin(longCol, doubleCol) FROM table GROUP BY intCol ORDER BY argmin(longCol, doubleCol)) is not supported, because ordering multi-column, multi-row argmin/argmax results is not semantically meaningful.
Projecting an MV bytes column currently doesn't work due to an issue.

For more detailed examples, see: https://github.com/apache/pinot/pull/10636
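The tie-breaking and multi-row behavior described in this section can be modeled in a few lines of Python. This is a sketch of the semantics only: the helper names arg_max and arg_min and the sample rows are invented for illustration and do not come from Pinot or from any real table.

```python
from typing import Any, Dict, List, Sequence

Row = Dict[str, Any]


def _measure(row: Row, measuring_cols: Sequence[str]) -> tuple:
    # Tuples compare left to right, which models "break ties with the next column".
    return tuple(row[col] for col in measuring_cols)


def arg_max(rows: List[Row], measuring_cols: Sequence[str], projection_col: str) -> List[Any]:
    """Project `projection_col` for every row that attains the maximum."""
    if not rows:
        return []
    best = max(_measure(row, measuring_cols) for row in rows)
    return [row[projection_col] for row in rows if _measure(row, measuring_cols) == best]


def arg_min(rows: List[Row], measuring_cols: Sequence[str], projection_col: str) -> List[Any]:
    """Project `projection_col` for every row that attains the minimum."""
    if not rows:
        return []
    best = min(_measure(row, measuring_cols) for row in rows)
    return [row[projection_col] for row in rows if _measure(row, measuring_cols) == best]


# Hypothetical sample data.
SAMPLE_ROWS = [
    {"user_id": 1, "activity": 10, "last_activity_date": 20230101},
    {"user_id": 2, "activity": 12, "last_activity_date": 20230105},
    {"user_id": 3, "activity": 12, "last_activity_date": 20230107},
]

# Users 2 and 3 tie on activity; last_activity_date breaks the tie.
print(arg_max(SAMPLE_ROWS, ["activity", "last_activity_date", "user_id"], "user_id"))  # [3]
# With a single measuring column, every tied row is returned.
print(arg_max(SAMPLE_ROWS, ["activity"], "user_id"))  # [2, 3]
```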

arrayConcatDouble

This section contains reference documentation for the arrayConcatDouble function.

Concatenates two arrays of doubles.

Signature

arrayConcatDouble('colName1', 'colName2')

Usage Examples

This example assumes the multiValueTable columns mvCol1 and mvCol2 are both of type DOUBLE with singleValueField in the table schema set to false.

select mvCol1, 
       arrayConcatDouble(mvCol1, mvCol2) AS concatDoubles
from multiValueTable
WHERE arraylength(mvCol1) >= 2
limit 5

arrayConcatFloat

This section contains reference documentation for the arrayConcatFloat function.

Concatenates two arrays of floats.

Signature

arrayConcatFloat('colName1', 'colName2')

Usage Examples

This example assumes the multiValueTable columns mvCol1 and mvCol2 are both of type FLOAT with singleValueField in the table schema set to false.

select mvCol1, 
       arrayConcatFloat(mvCol1, mvCol2) AS concatFloats
from multiValueTable
WHERE arraylength(mvCol1) >= 2
limit 5

arrayConcatInt

This section contains reference documentation for the arrayConcatInt function.

Concatenates two arrays of ints.

Signature

arrayConcatInt('colName1', 'colName2')

Usage Examples

These examples are based on the Hybrid Quick Start.

select DivWheelsOffs, 
       arrayConcatInt(DivWheelsOffs, DivWheelsOns) AS concatIds
from airlineStats 
WHERE arraylength(DivWheelsOffs) >= 2
limit 5

DivWheelsOffs

concatIds

arrayConcatLong

This section contains reference documentation for the arrayConcatLong function.

Concatenates two arrays of longs.

Signature

arrayConcatLong('colName1', 'colName2')

Usage Examples

This example assumes the multiValueTable columns mvCol1 and mvCol2 are both of type LONG with singleValueField in the table schema set to false.

select mvCol1, 
       arrayConcatLong(mvCol1, mvCol2) AS concatLongs
from multiValueTable
WHERE arraylength(mvCol1) >= 2
limit 5

arrayConcatString

This section contains reference documentation for the arrayConcatString function.

Concatenates two arrays of strings.

Signature

arrayConcatString('colName1', 'colName2')

Usage Examples

These examples are based on the Hybrid Quick Start.

select DivTailNums, 
       arrayConcatString(DivTailNums, DivTailNums) AS concatIds
from airlineStats 
WHERE arraylength(DivTailNums) >= 2
limit 5

DivTailNums

concatIds

arrayContainsInt

This section contains reference documentation for the arrayContainsInt function.

Checks if int value exists in array.

Signature

arrayContainsInt('colName', valueToFind)

Usage Examples

These examples are based on the Hybrid Quick Start.

DivAirportIDs

containsValue

arrayContainsString

This section contains reference documentation for the arrayContainsString function.

Checks if string value exists in array.

Signature

arrayContainsString('colName', valueToFind)

Usage Examples

These examples are based on the Hybrid Quick Start.

select DivTailNums, 
       arrayContainsString(DivTailNums, 'N7713A') AS index
from airlineStats 
WHERE arraylength(DivTailNums) >= 2
limit 5

DivTailNums

index

arrayDistinctInt

This section contains reference documentation for the arrayDistinctInt function.

Returns unique values in an array of ints.

Signature

arrayDistinctInt('colName')

Usage Examples

These examples are based on the Hybrid Quick Start.

DivAirportIDs

unique

arrayDistinctString

This section contains reference documentation for the arrayDistinctString function.

Returns unique values in an array of strings.

Signature

arrayDistinctString('colName')

Usage Examples

These examples are based on the Hybrid Quick Start.

DivTailNums

unique

arrayIndexOfInt

This section contains reference documentation for the arrayIndexOfInt function.

Finds the last index of the given value in the array starting at the given index.

Signature

arrayIndexOfInt('colName', valueToFind)

Usage Examples

These examples are based on the Hybrid Quick Start.

DivAirportIDs

index

arrayIndexOfString

This section contains reference documentation for the arrayIndexOfString function.

Finds the last index of the given value in the array starting at the given index.

Signature

arrayIndexOfString('colName', valueToFind)

Usage Examples

These examples are based on the Hybrid Quick Start.

select DivTailNums, 
       arrayIndexOfString(DivTailNums, 'N7713A') AS index
from airlineStats 
WHERE arraylength(DivTailNums) >= 2
limit 5

DivTailNums

index

ARRAYLENGTH

This section contains reference documentation for the ARRAYLENGTH function.

Returns the length of a multi-value column

Signature

ARRAYLENGTH('colName')

Usage Examples

These examples are based on the Hybrid Quick Start.

length

count(*)

The count(*) values will increase each time we execute the query as data is constantly being ingested by the Hybrid Quick Start.

arrayRemoveInt

This section contains reference documentation for the arrayRemoveInt function.

Removes value from array of ints.

Signature

arrayRemoveInt('colName', value)

Usage Examples

These examples are based on the Hybrid Quick Start.

DivAirportIDs

value

arrayRemoveString

This section contains reference documentation for the arrayRemoveString function.

Removes value from array of strings.

Signature

arrayRemoveString('colName', value)

Usage Examples

These examples are based on the Hybrid Quick Start.

DivAirportIDs

value

arrayReverseInt

This section contains reference documentation for the arrayReverseInt function.

Reverses array of ints.

Signature

arrayReverseInt('colName')

Usage Examples

These examples are based on the Hybrid Quick Start.

DivAirportIDs

reversedIds

arrayReverseString

This section contains reference documentation for the arrayReverseString function.

Reverses array of strings.

Signature

arrayReverseString('colName')

Usage Examples

These examples are based on the Hybrid Quick Start.

select FlightNum, 
       arrayReverseString(RandomAirports) AS reversedAirports, 
       RandomAirports
from airlineStats 
WHERE arraylength(RandomAirports) BETWEEN 2 AND 4
limit 5

FlightNum

reversedAirports

RandomAirports

arraySliceInt

This section contains reference documentation for the arraySliceInt function.

Returns the values in the array between the start and end positions.

Signature

arraySliceInt('colName', start, end)

Usage Examples

These examples are based on the Hybrid Quick Start.

select FlightNum, 
       arraySliceInt(DivAirportIDs, 0, 1) AS airports, 
       DivAirportIDs
from airlineStats 
WHERE arraylength(DivAirportIDs) >= 2
limit 5

FlightNum

airports

DivAirportIDs

arraySliceString

This section contains reference documentation for the arraySliceString function.

Returns the values in the array between the start and end positions.

Signature

arraySliceString('colName', start, end)

Usage Examples

These examples are based on the Hybrid Quick Start.

FlightNum

airports

RandomAirports

arraySortInt

This section contains reference documentation for the arraySortInt function.

Sorts array of ints.

Signature

arraySortInt('colName')

Usage Examples

These examples are based on the Hybrid Quick Start.

select DivAirportIDs, 
       arraySortInt(DivAirportIDs) AS sortedIds
from airlineStats 
WHERE arraylength(DivAirportIDs) >= 2
limit 5

DivAirportIDs

sortedIds

arraySortString

This section contains reference documentation for the arraySortString function.

Sorts array of strings.

Signature

arraySortString('colName')

Usage Examples

These examples are based on the Hybrid Quick Start.

select FlightNum, 
       arraySortString(RandomAirports) AS sortedAirports, 
       RandomAirports
from airlineStats 
WHERE arraylength(RandomAirports) BETWEEN 2 AND 4
limit 5

FlightNum

sortedAirports

RandomAirports

arrayUnionInt

This section contains reference documentation for the arrayUnionInt function.

Create a union of two arrays of ints.

Signature

arrayUnionInt('colName1', 'colName2')

Usage Examples

These examples are based on the Hybrid Quick Start.

select DivWheelsOffs, 
       DivWheelsOns,
       arrayUnionInt(DivWheelsOffs, DivWheelsOns) AS unionIds
from airlineStats 
WHERE arraylength(DivWheelsOffs) >= 2
limit 5

DivWheelsOffs

DivWheelsOns

unionIds

arrayUnionString

This section contains reference documentation for the arrayUnionString function.

Create a union of two arrays of strings.

Signature

arrayUnionString('colName1', 'colName2')

Usage Examples

These examples are based on the Hybrid Quick Start.

select DivTailNums, 
       DivAirports,
       arrayUnionString(DivTailNums, DivAirports) AS unionIds
from airlineStats 
WHERE arraylength(DivTailNums) >= 2
limit 5

DivTailNums

DivAirports

unionIds
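To contrast the arrayConcat* and arrayUnion* families, here is a minimal Python sketch. It assumes that a union drops duplicate values and keeps each value at its first position; the ordering Pinot actually returns is not stated in this reference, so treat that detail as an assumption. The helper names array_concat and array_union are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def array_concat(left: Sequence[T], right: Sequence[T]) -> List[T]:
    """Append `right` to `left`, keeping duplicates."""
    return list(left) + list(right)


def array_union(left: Sequence[T], right: Sequence[T]) -> List[T]:
    """Combine both arrays and drop repeated values."""
    # dict preserves insertion order, so the first occurrence of each value wins
    # (assumed ordering, see the note above).
    return list(dict.fromkeys(array_concat(left, right)))


print(array_concat([1, 2], [2, 3]))  # [1, 2, 2, 3]
print(array_union([1, 2], [2, 3]))   # [1, 2, 3]
```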

AVGMV

This section contains reference documentation for the AVGMV function.

Returns the average of the values in a group.

Signature

AVGMV(colName)

Usage Examples

These examples are based on the Hybrid Quick Start.

select AVGMV(DivLongestGTimes) AS value
from airlineStats 
where arraylength(DivLongestGTimes) > 1

value

Base64

This section contains reference documentation for base64 encode and decode functions.

Encoding scheme follows java.util.Base64.Encoder

toBase64 returns Base64 encoded string of input binary data (bytes type).
fromBase64 returns binary data (represented as a Hex string) from Base64-encoded string.

Signature

toBase64(bytesCol)
fromBase64(stringCol)

Usage Examples

For better readability, the following examples convert the string hello! into BYTES using the toUtf8 function and convert the decoded BYTES back into a string using fromUtf8.

SELECT toBase64(toUtf8('hello!')) AS encoded
FROM ignoreMe

encoded

SELECT fromUtf8(fromBase64('aGVsbG8h')) AS decoded
FROM ignoreMe

decoded

Without the UTF-8 string conversion, the returned BYTES value is represented as a hex string, following Pinot's BYTES column representation, as in the example below.

SELECT fromBase64('aGVsbG8h') AS decoded
FROM ignoreMe

Note that the following query will throw a compilation error, because a string is not a valid input type for toBase64.

SELECT toBase64('hello!') AS encoded
FROM ignoreMe
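The round trip in this section can be reproduced outside Pinot with Python's standard base64 module, which uses the same standard Base64 alphabet. The helpers to_base64 and from_base64 below are hypothetical stand-ins that mirror the function names; they only illustrate the bytes-in, string-out contract and the hex rendering of raw bytes.

```python
import base64


def to_base64(data: bytes) -> str:
    """Encode raw bytes as a Base64 string (input must be bytes, not str)."""
    return base64.b64encode(data).decode("ascii")


def from_base64(text: str) -> bytes:
    """Decode a Base64 string back to raw bytes."""
    return base64.b64decode(text)


encoded = to_base64("hello!".encode("utf-8"))  # analogous to toBase64(toUtf8('hello!'))
decoded = from_base64(encoded)

print(encoded)                  # aGVsbG8h
print(decoded.decode("utf-8"))  # hello!
print(decoded.hex())            # 68656c6c6f21, the hex form of the raw bytes
```

As in the SQL examples, the text has to be converted to bytes before encoding; calling base64.b64encode on a Python str raises a TypeError.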

caseWhen

This section contains reference documentation for the caseWhen function.

Returns values depending on boolean expressions. This function can only be used in an ingestion transformation function.

Signature

caseWhen(booleanExpr1, valueIfExpr1True, booleanExpr2, valueIfExpr2True)
caseWhen(booleanExpr1, valueIfExpr1True, booleanExpr2, valueIfExpr2True, ..., valueIfFalse)

Arguments

Description

Usage Examples

The usage examples are based on extracting fields from the following JSON document:

{
  "latitude": 1.0
}

Expression

Value

This function can be used in the table config to add a northernHemisphereStr column:

{
   "tableConfig":{
      "ingestionConfig":{
         "transformConfigs":[
            {
               "columnName":"northernHemisphereStr",
               "transformFunction":"CASEWHEN(latitude > 0, 'North', 'South')"
            }
         ]
      }
   }
}
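The argument layout in the signature (condition/value pairs followed by an optional fallback) can be sketched in Python as follows. The helper case_when is hypothetical, and returning None when no condition holds and no fallback is given is an assumption, since this section does not say what happens in that case.

```python
from typing import Any


def case_when(*args: Any) -> Any:
    """Return the value paired with the first true condition.

    Arguments alternate condition, value, condition, value, ...
    An odd trailing argument is treated as the fallback value.
    """
    if len(args) % 2 == 1:
        pairs, fallback = args[:-1], args[-1]
    else:
        pairs, fallback = args, None  # assumed result when nothing matches
    for index in range(0, len(pairs), 2):
        if pairs[index]:
            return pairs[index + 1]
    return fallback


latitude = 1.0  # value from the JSON document above
print(case_when(latitude > 0, "North", "South"))  # North
```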

ceil

This section contains reference documentation for the CEIL function.

Rounds a value up to the nearest integer.

Signature

CEIL(col1)

Usage Examples

select CEIL(12.1) AS value
from ignoreMe

value

select CEIL(-12.1) AS value
from ignoreMe

value

CHR

This section contains reference documentation for the CHR function.

Returns the character corresponding to the given Unicode codepoint.

Signature

CHR(codepoint)

Usage Examples

SELECT CHR(65) AS value
FROM ignoreMe

value

codepoint

This section contains reference documentation for the CODEPOINT function.

Returns the Unicode codepoint of the first character of the string.

Signature

CODEPOINT(col)

Usage Examples

SELECT CODEPOINT('Apache Pinot') AS value
FROM ignoreMe

value
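CHR and CODEPOINT are inverses of each other for a single character. The Python sketch below models that relationship with the built-ins chr and ord; the wrapper names chr_fn and codepoint_fn are hypothetical and exist only to mirror the two signatures.

```python
def chr_fn(codepoint: int) -> str:
    """Character for a Unicode codepoint, as CHR(codepoint) is described."""
    return chr(codepoint)


def codepoint_fn(text: str) -> int:
    """Codepoint of the first character only; the rest of the string is ignored."""
    return ord(text[0])


print(chr_fn(65))                          # A
print(codepoint_fn("Apache Pinot"))        # 65
print(chr_fn(codepoint_fn("Pinot")))       # P
```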

concat

This section contains reference documentation for the concat function.

Concatenates two input strings using the separator.

Signature

CONCAT(col1, col2, separator)

Usage Examples

value

count

This section contains reference documentation for the count function.

Get the count of rows in a group

Signature

COUNT(colName)

Usage Examples

These examples are based on the Batch Quick Start.

select count(*) AS value
from baseballStats

value

COUNTMV

This section contains reference documentation for the COUNTMV function.

Get the count of rows in a group

Signature

COUNTMV(colName)

day

Signature

day(tsInMillis)
day(tsInMillis, timeZoneId)
dayOfMonth(tsInMillis)
dayOfMonth(tsInMillis, timeZoneId)

Usage Examples

day

dayOfWeek

This section contains reference documentation for the dayOfWeek function.

Returns the day of the week from the given epoch millis in UTC or the specified timezone. The value ranges from 1 (Monday) to 7 (Sunday).

Signature

dayOfWeek(tsInMillis)
dayOfWeek(tsInMillis, timeZoneId)
dow(tsInMillis)
dow(tsInMillis, timeZoneId)

Usage Examples

dayOfWeek
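The 1 (Monday) to 7 (Sunday) numbering matches Python's isoweekday, so the behavior can be sketched as below. The helper day_of_week, its tz parameter (a tzinfo object rather than a timezone ID string), and the sample timestamp are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def day_of_week(ts_millis: int, tz: tzinfo = timezone.utc) -> int:
    """Day of week for an epoch-millis timestamp: 1 = Monday ... 7 = Sunday."""
    moment = datetime.fromtimestamp(ts_millis / 1000, tz=tz)
    return moment.isoweekday()


ts = 1_700_000_000_000  # 2023-11-14 22:13:20 UTC, a Tuesday

print(day_of_week(ts))  # 2
# Fourteen hours ahead of UTC it is already Wednesday.
print(day_of_week(ts, timezone(timedelta(hours=14))))  # 3
```

A named zone such as zoneinfo.ZoneInfo("Pacific/Kiritimati") can be passed in the same way where the timezone database is available.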

dayOfYear

This section contains reference documentation for the dayOfYear function.

Returns the day of the year from the given epoch millis in UTC or specified timezone. The value ranges from 1 to 366.

Signature

dayOfYear(tsInMillis)
dayOfYear(tsInMillis, timeZoneId)
doy(tsInMillis)
doy(tsInMillis, timeZoneId)

Usage Examples

dayOfYear
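The upper bound of 366 only occurs on the last day of a leap year. A short Python sketch of the calculation, using a hypothetical helper day_of_year and illustrative timestamps:

```python
from datetime import datetime, timezone, tzinfo


def day_of_year(ts_millis: int, tz: tzinfo = timezone.utc) -> int:
    """Day of year (1 to 366) for an epoch-millis timestamp."""
    moment = datetime.fromtimestamp(ts_millis / 1000, tz=tz)
    return moment.timetuple().tm_yday


print(day_of_year(0))                  # 1   (1970-01-01 UTC)
print(day_of_year(1_609_372_800_000))  # 366 (2020-12-31 UTC, a leap year)
```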

DISTINCT

This section contains reference documentation for the DISTINCT function.

Returns the distinct row values in a group

Signature

DISTINCT(colName)

DATETIMECONVERT

inputFormat and outputFormat are defined using the following structure:

<time size>:<time unit>:<time format>:<pattern>

where:

time size - size of the time unit, e.g. 1, 10
time unit - DAYS, HOURS, MINUTES, SECONDS, MILLISECONDS, MICROSECONDS, NANOSECONDS
time format - EPOCH or SIMPLE_DATE_FORMAT
pattern - defined only for SIMPLE_DATE_FORMAT, e.g. yyyy-MM-dd. A specific timezone can be passed using tz(timezone), where the timezone is in long or short string format, e.g. Asia/Kolkata or PDT

granularity is specified in the format <time size>:<time unit>.
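To make the colon-separated structure concrete, the following Python sketch splits a format string and a granularity string into their parts. The names DateTimeFormatSpec, parse_format and parse_granularity are hypothetical, and the validation shown is an assumption about what a reasonable parser would check, not a description of Pinot's own parser.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DateTimeFormatSpec:
    size: int
    unit: str
    time_format: str
    pattern: Optional[str] = None


def parse_format(spec: str) -> DateTimeFormatSpec:
    """Split '<time size>:<time unit>:<time format>[:<pattern>]'."""
    # Split at most three times: the pattern may itself contain ':' (e.g. HH:mm).
    parts = spec.split(":", 3)
    if len(parts) < 3:
        raise ValueError(f"expected at least size:unit:format, got {spec!r}")
    size, unit, time_format = parts[:3]
    pattern = parts[3] if len(parts) == 4 else None
    if time_format == "SIMPLE_DATE_FORMAT" and pattern is None:
        raise ValueError("SIMPLE_DATE_FORMAT requires a pattern")
    return DateTimeFormatSpec(int(size), unit, time_format, pattern)


def parse_granularity(spec: str) -> Tuple[int, str]:
    """Split '<time size>:<time unit>'."""
    size, unit = spec.split(":")
    return int(size), unit


print(parse_format("1:DAYS:SIMPLE_DATE_FORMAT:yyyy-MM-dd"))
print(parse_granularity("15:MINUTES"))  # (15, 'MINUTES')
```

Limiting the split to three separators matters because patterns such as yyyy-MM-dd HH:mm contain a colon of their own.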

Usage Examples

These examples are based on the Batch JSON Quick Start.

created_at_timestamp from milliseconds since epoch to days since epoch, bucketed to 1 day granularity:

select id, 
       created_at_timestamp, 
       cast(created_at_timestamp AS long) AS timeInMs,
       DATETIMECONVERT(
         created_at_timestamp, 
         '1:MILLISECONDS:EPOCH', 
         '1:DAYS:EPOCH', 
         '1:DAYS'
       ) AS convertedTime
from githubEvents
WHERE id = 7044874134

created_at_timestamp bucketed to 15 minutes granularity:

select id, 
       created_at_timestamp, 
       cast(created_at_timestamp AS long) AS timeInMs,
       DATETIMECONVERT(
         created_at_timestamp, 
         '1:MILLISECONDS:EPOCH', 
         '1:MILLISECONDS:EPOCH', 
         '15:MINUTES'
       ) AS convertedTime
from githubEvents
WHERE id = 7044874134
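The two epoch-to-epoch conversions above come down to integer arithmetic on the millisecond value. The sketch below assumes that bucketing rounds down to the start of the bucket; the helper names, the unit table, and the sample timestamp are illustrative and are not taken from the githubEvents data.

```python
# Milliseconds per supported unit (subset, for illustration).
_UNIT_MILLIS = {
    "MILLISECONDS": 1,
    "SECONDS": 1_000,
    "MINUTES": 60_000,
    "HOURS": 3_600_000,
    "DAYS": 86_400_000,
}


def bucket_epoch_millis(ts_millis: int, size: int, unit: str) -> int:
    """Round an epoch-millis timestamp down to the start of its bucket."""
    bucket = size * _UNIT_MILLIS[unit]
    return (ts_millis // bucket) * bucket


def to_epoch_unit(ts_millis: int, unit: str) -> int:
    """Express an epoch-millis timestamp as whole `unit`s since epoch."""
    return ts_millis // _UNIT_MILLIS[unit]


ts = 1_700_000_000_000  # 2023-11-14 22:13:20 UTC, illustrative value

# '1:MILLISECONDS:EPOCH' -> '1:MILLISECONDS:EPOCH' at '15:MINUTES' granularity
print(bucket_epoch_millis(ts, 15, "MINUTES"))  # 1699999200000 (22:00:00 UTC)
# '1:MILLISECONDS:EPOCH' -> '1:DAYS:EPOCH' at '1:DAYS' granularity
print(to_epoch_unit(ts, "DAYS"))               # 19675
```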

created_at_timestamp to format yyyy-MM-dd, bucketed to 1 day granularity:

select id, 
       created_at_timestamp, 
       cast(created_at_timestamp AS long) AS timeInMs,
       DATETIMECONVERT(
         created_at_timestamp, 
         '1:MILLISECONDS:EPOCH', 
         '1:DAYS:SIMPLE_DATE_FORMAT:yyyy-MM-dd', 
         '1:DAYS'
       ) AS convertedTime
from githubEvents
WHERE id = 7044874134

created_at_timestamp to format yyyy-MM-dd HH:mm, in timezone Pacific/Kiritimati:

select id, 
       created_at_timestamp, 
       cast(created_at_timestamp AS long) AS timeInMs,
       DATETIMECONVERT(
         created_at_timestamp, 
         '1:MILLISECONDS:EPOCH', 
         '1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd HH:mm tz(Pacific/Kiritimati)', 
         '1:MILLISECONDS'
       ) AS convertedTime
from githubEvents
WHERE id = 7044874134

created_at_timestamp to format yyyy-MM-dd HH:mm, in timezone Pacific/Kiritimati and bucketed to 1 day granularity:

select id, 
       created_at_timestamp, 
       cast(created_at_timestamp AS long) AS timeInMs,
       DATETIMECONVERT(
         created_at_timestamp, 
         '1:MILLISECONDS:EPOCH', 
         '1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd HH:mm tz(Pacific/Kiritimati)', 
         '1:DAYS'
       ) AS convertedTime
from githubEvents
WHERE id = 7044874134