arrow-left

All pages
gitbookPowered by GitBook
triangle-exclamation
Couldn't generate the PDF for 149 pages, generation stopped at 100.
Extend with 50 more pages.
1 of 100

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

ABS

This section contains reference documentation for the abs function.

Absolute of a value

hashtag
Signature

ABS(col1)

hashtag
Usage Examples

value
value

ADD

This section contains reference documentation for the ADD function.

Sum of at least two values

hashtag
Signature

ADD(col1, col2, col3...)

hashtag
Usage Examples

These examples are based on the .

homeRuns
baseOnBalls
total
select ABS(-12.1) AS value
from ignoreMe

12.1

select ABS(12.1) AS value
from ignoreMe

12.1

select homeRuns, baseOnBalls, ADD(homeRuns, baseOnBalls) AS total
from baseballStats 
WHERE teamID = 'ML1' 
AND yearID = 1956 
AND playerName = 'Henry Louis'

26

37

63

Batch Quick Start

arrayConcatDouble

This section contains reference documentation for the arrayConcatDouble function.

Concatenates two arrays of doubles.

hashtag
Signature

arrayConcatDouble('colName1', 'colName2')

hashtag
Usage Examples

This example assumes the multiValueTable columns mvCol1 and mvCol2 are both of type DOUBLE with singleValueField in the table schema set to false.

arrayConcatFloat

This section contains reference documentation for the arrayConcatFloat function.

Concatenates two arrays of floats.

hashtag
Signature

arrayConcatFloat('colName1', 'colName2')

hashtag
Usage Examples

This example assumes the multiValueTable columns mvCol1 and mvCol2 are both of type FLOAT with singleValueField in the table schema set to false.

arrayConcatLong

This section contains reference documentation for the arrayConcatLong function.

Concatenates two arrays of longs.

hashtag
Signature

arrayConcatLong('colName1', 'colName2')

hashtag
Usage Examples

This example assumes the multiValueTable columns mvCol1 and mvCol2 are both of type LONG with singleValueField in the table schema set to false.

ceil

This section contains reference documentation for the CEIL function.

Rounded up to the nearest integer.

hashtag
Signature

CEIL(col1)

hashtag
Usage Examples

value
value

count

This section contains reference documentation for the count function.

Get the count of rows in a group

hashtag
Signature

COUNT(colName)

hashtag
Usage Examples

These examples are based on the .

value

COVAR_SAMP

This section contains reference documentation for the COVAR_SAMP function.

Returns the sample covariance between of 2 numerical columns.

COVAR_SAMP(col1, col2) = COVAR_POP(col1, col2) * besselCorrection

hashtag
Signatures

COVAR_SAMP(col1, col2) -> double

hashtag
Usage Examples

These examples are based on the .

covariance

exp

This section contains reference documentation for the exp function.

Euler’s number(e) raised to the power of col.

hashtag
Signature

EXP(col1)

hashtag
Usage Examples

value
value

FLOOR

This section contains reference documentation for the FLOOR function.

Rounded down to the nearest integer.

hashtag
Signature

FLOOR(col1)

hashtag
Usage Examples

value
value

length

This section contains reference documentation for the length function.

calculate length of the string

hashtag
Signature

LENGTH(col)

hashtag
Usage Examples

value

lpad

This section contains reference documentation for the LPAD function.

string padded from the left side with pad to reach final size

hashtag
Signature

LPAD(col, size, pad)

hashtag
Usage Examples

value

MAXMV

This section contains reference documentation for the MAXMV function.

Get the maximum value in a group

hashtag
Signature

MAXMV(colName)

hashtag
Usage Examples

These examples are based on the .

value

Functions

This page contains reference documentation for functions in Apache Pinot.

This page contains reference documentation for functions in Apache Pinot.

arrayConcatInt

This section contains reference documentation for the arrayConcatInt function.

Concatenates two arrays of ints.

hashtag
Signature

arrayConcatInt('colName1', 'colName2')

arrayConcatString

This section contains reference documentation for the arrayConcatString function.

Concatenates two arrays of strings.

hashtag
Signature

arrayConcatString('colName1', 'colName2')

arrayContainsString

This section contains reference documentation for the arrayContainsString function.

Checks if string value exists in array.

hashtag
Signature

arrayContainsString('colName', valueToFind)

arrayDistinctInt

This section contains reference documentation for the arrayDistinctInt function.

Returns unique values in an array of ints.

hashtag
Signature

arrayDistinctInt('colName')

arrayDistinctString

This section contains reference documentation for the arrayDistinctString function.

Returns unique values in an array of strings.

hashtag
Signature

arrayDistinctString('colName')

arrayIndexOfInt

This section contains reference documentation for the arrayIndexOfInt function.

Finds the last index of the given value in the array starting at the given index.

hashtag
Signature

arrayIndexOfInt('colName', valueToFind)

ARRAYLENGTH

This section contains reference documentation for the ARRAYLENGTH function.

Returns the length of a multi-value column

hashtag
Signature

ARRAYLENGTH('colName')

arrayRemoveString

This section contains reference documentation for the arrayRemoveString function.

Removes value from array of strings.

hashtag
Signature

arrayRemoveString('colName', value)

arrayReverseString

This section contains reference documentation for the arrayReverseString function.

Reverses array of strings.

hashtag
Signature

arrayReverseString('colName')

arraySliceString

This section contains reference documentation for the arraySliceString function.

Returns the values in the array between the start and end positions.

hashtag
Signature

arraySliceString('colName', start, end)

arrayUnionString

This section contains reference documentation for the arrayUnionString function.

Create a union of two arrays of strings.

hashtag
Signature

arrayUnionString('colName1', 'colName2')

AVGMV

This section contains reference documentation for the AVGMV function.

Get the avg of values in a group

hashtag
Signature

AVGMV(colName)

codepoint

This section contains reference documentation for the CODEPOINT function.

the Unicode codepoint of the first character of the string

hashtag
Signature

CODEPOINT(col)

day

This section contains reference documentation for the day function.

Returns the day of the month from the given epoch millis in UTC or specified timezone. The value ranges from 1 to 31.

hashtag
Signature

day(tsInMillis)

day(tsInMillis, timeZoneId)

dayOfWeek

This section contains reference documentation for the dayOfWeek function.

Returns the day of the week from the given epoch millis in UTC timezone. The value ranges from 1(Monday) to 7(Sunday).

hashtag
Signature

dayOfWeek(tsInMillis)

dayOfWeek(tsInMillis, timeZoneId)

DISTINCT

This section contains reference documentation for the DISTINCT function.

Returns the distinct row values in a group

hashtag
Signature

DISTINCT(colName)

DISTINCTCOUNT

This section contains reference documentation for the DISTINCTCOUNT function.

Returns the count of distinct row values in a group

hashtag
Signature

DISTINCTCOUNT(colName)

DISTINCTCOUNTBITMAP

This section contains reference documentation for the DISTINCTCOUNTBITMAP function.

Returns the count of distinct row values in a group. This function is accurate for INT column, but approximate for other cases where hash codes are used in distinct counting and there may be hash collisions. For accurate distinct counting on all column types, see .

hashtag
Signature

DISTINCTCOUNTBITMAP(colName)

DISTINCTCOUNTBITMAPMV

This section contains reference documentation for the DISTINCTCOUNTBITMAPMV function.

Returns the count of distinct row values in a group. This function is accurate for INT or dictionary encoded column, but approximate for other cases where hash codes are used in distinct counting and there may be hash collision.

hashtag
Signature

DISTINCTCOUNTBITMAPMV(colName)

DISTINCTCOUNTRAWHLL

This section contains reference documentation for the DISTINCTCOUNTRAWHLL function.

Returns HLL response serialized as string. The serialized HLL can be converted back into an HLL and then aggregated with other HLLs. A common use case may be to merge HLL responses from different Pinot tables, or to allow aggregation after client-side batching.

hashtag
Signature

DISTINCTCOUNTRAWHLL(colName, log2m)

DISTINCTCOUNTMV

This section contains reference documentation for the DISTINCTCOUNTMV function.

Returns the count of distinct row values in a group

hashtag
Signature

DISTINCTCOUNTMV(colName)

DISTINCTCOUNTRAWHLLMV

This section contains reference documentation for the DISTINCTCOUNTRAWHLLMV function.

Returns HLL response serialized as string. The serialized HLL can be converted back into an HLL and then aggregated with other HLLs. A common use case may be to merge HLL responses from different Pinot tables, or to allow aggregation after client-side batching.

hashtag
Signature

DISTINCTCOUNTRAWHLLMV(colName, log2m)

DIV

This section contains reference documentation for the DIV function.

Quotient of two values

hashtag
Signature

DIV(col1, col2)

FromEpoch

This section contains reference documentation for the fromEpoch functions.

Convert epoch to epoch milliseconds. The following time units are supported:

  • SECONDS

  • MINUTES

isSubnetOf

This section contains reference documentation for the isSubnetOf function.

Takes 2 arguments of type STRING. The first argument is an ipPrefix, and the second argument is a single ipAddress. This function handles both IPv4 and IPv6 arguments.

Returns a boolean value checking if ipAddress is in the subnet of ipPrefix

hashtag

lower

This section contains reference documentation for the lower function.

Converts string to lower case.

hashtag
Signature

LOWER(col)

max

This section contains reference documentation for the max function.

Get the maximum value in a group

hashtag
Signature

MAX(colName)

MD5

This section contains reference documentation for the MD5 function.

Return MD5 digest of binary column(bytes type) as hex string

hashtag
Signature

MD5(bytesCol)

millisecond

This section contains reference documentation for the millisecond function.

Returns the millisecond of the second from the given epoch millis in UTC or specified timezone. The value ranges from 0 to 999.

hashtag
Signature

millisecond(tsInMillis)

millisecond(tsInMillis, timeZoneId)

min

This section contains reference documentation for the min function.

Get the minimum value in a group

hashtag
Signature

MIN(colName)

minmaxrange

This section contains reference documentation for the minmaxrange function.

Returns the max - min value in a group

hashtag
Signature

MINMAXRANGE(colName)

now

This section contains reference documentation for the now function.

Return current time as epoch millis.

hashtag
Signature

now()

MINMV

This section contains reference documentation for the MINMV function.

Get the minimum value in a group

hashtag
Signature

MINMV(colName)

ABSchevron-right
ADDchevron-right
arrayConcatDoublechevron-right
arrayConcatFloatchevron-right
arrayConcatIntchevron-right
arrayConcatLongchevron-right
arrayConcatStringchevron-right
arrayContainsIntchevron-right
arrayContainsStringchevron-right
arrayDistinctIntchevron-right
arrayDistinctStringchevron-right
arrayIndexOfIntchevron-right
arrayIndexOfStringchevron-right
ARRAYLENGTHchevron-right
arrayRemoveIntchevron-right
arrayRemoveStringchevron-right
arrayReverseIntchevron-right
arrayReverseStringchevron-right
arraySliceIntchevron-right
arraySliceStringchevron-right
arraySortIntchevron-right
arraySortStringchevron-right
arrayUnionIntchevron-right
arrayUnionStringchevron-right
AVGMVchevron-right
Base64chevron-right
ceilchevron-right
CHRchevron-right
codepointchevron-right
concatchevron-right
countchevron-right
COUNTMVchevron-right
DATETIMECONVERTchevron-right
DATETRUNCchevron-right
daychevron-right
dayOfWeekchevron-right
dayOfYearchevron-right
DISTINCTchevron-right
DISTINCTCOUNTchevron-right
DISTINCTCOUNTBITMAPchevron-right
DISTINCTCOUNTHLLMVchevron-right
DISTINCTCOUNTHLLchevron-right
DISTINCTCOUNTBITMAPMVchevron-right
DISTINCTCOUNTMVchevron-right
DISTINCTCOUNTRAWHLLchevron-right
DISTINCTCOUNTRAWHLLMVchevron-right
DISTINCTCOUNTRAWTHETASKETCHchevron-right
DISTINCTCOUNTTHETASKETCHchevron-right
DIVchevron-right
expchevron-right
FromDateTimechevron-right
FLOORchevron-right
FromEpochchevron-right
FromEpochBucketchevron-right
Histogramchevron-right
hourchevron-right
jsonextractkeychevron-right
jsonextractscalarchevron-right
JSONFORMATchevron-right
JSONPATHchevron-right
JSONPATHARRAYchevron-right
JSONPATHARRAYDEFAULTEMPTYchevron-right
JSONPATHDOUBLEchevron-right
JSONPATHLONGchevron-right
JSONPATHSTRINGchevron-right
lengthchevron-right
lnchevron-right
lowerchevron-right
lpadchevron-right
ltrimchevron-right
maxchevron-right
MAXMVchevron-right
MD5chevron-right
millisecondchevron-right
minchevron-right
minmaxrangechevron-right
MINMAXRANGEMVchevron-right
MINMVchevron-right
minutechevron-right
MODchevron-right
modechevron-right
monthchevron-right
multchevron-right
nowchevron-right
percentilechevron-right
percentileestchevron-right
percentileestmvchevron-right
percentiletdigestchevron-right
percentiletdigestmvchevron-right
percentilemvchevron-right
quarterchevron-right
regexpExtractchevron-right
removechevron-right
replacechevron-right
reversechevron-right
roundchevron-right
rpadchevron-right
rtrimchevron-right
secondchevron-right
SEGMENTPARTITIONEDDISTINCTCOUNTchevron-right
shachevron-right
sha256chevron-right
sha512chevron-right
sqrtchevron-right
startswithchevron-right
ST_AsBinarychevron-right
ST_AsTextchevron-right
ST_Containschevron-right
ST_Distancechevron-right
ST_GeogFromTextchevron-right
ST_GeogFromWKBchevron-right
ST_GeometryTypechevron-right
ST_GeomFromTextchevron-right
ST_GeomFromWKBchevron-right
STPOINTchevron-right
ST_Polygonchevron-right
strposchevron-right
ST_Unionchevron-right
SUBchevron-right
substrchevron-right
sumchevron-right
TIMECONVERTchevron-right
https://github.com/pinot-contrib/pinot-docs/blob/latest/configuration-reference/functions/timezoneHour.mdchevron-right
https://github.com/pinot-contrib/pinot-docs/blob/latest/configuration-reference/functions/timezoneMinute.mdchevron-right
ToDateTimechevron-right
https://github.com/pinot-contrib/pinot-docs/blob/latest/configuration-reference/functions/toEpoch.mdchevron-right
https://github.com/pinot-contrib/pinot-docs/blob/latest/configuration-reference/functions/toEpochBucket.mdchevron-right
https://github.com/pinot-contrib/pinot-docs/blob/latest/configuration-reference/functions/toEpochRounded.mdchevron-right
TOJSONMAPSTRchevron-right
toSphericalGeographychevron-right
trimchevron-right
upperchevron-right
Urlchevron-right
UTF8chevron-right
VALUEINchevron-right
weekchevron-right
yearchevron-right
https://github.com/pinot-contrib/pinot-docs/blob/latest/configuration-reference/functions/yearOfWeek.mdchevron-right
select mvCol1, 
       arrayConcatDouble(mvCol1, mvCol2) AS concatDoubles
from multiValueTable
WHERE arraylength(mvCol1) >= 2
limit 5
select mvCol1, 
       arrayConcatFloat(mvCol1, mvCol2) AS concatFloats
from multiValueTable
WHERE arraylength(mvCol1) >= 2
limit 5
select mvCol1, 
       arrayConcatLong(mvCol1, mvCol2) AS concatLongs
from multiValueTable
WHERE arraylength(mvCol1) >= 2
limit 5
select CEIL(12.1) AS value
from ignoreMe

13

select CEIL(-12.1) AS value
from ignoreMe

-12

select count(*) AS value
from baseballStats 

97889

Batch Quick Start
SELECT COVAR_SAMP(numberOfGames, AtBatting) AS covariance 
FROM baseballStats

8270.973200974102

Batch Quick Start
select EXP(1) AS value
from ignoreMe

2.718281828459045

select EXP(12) AS value
from ignoreMe

162754.79141900392

select FLOOR(12.1) AS value
from ignoreMe

12

select FLOOR(-12.1) AS value
from ignoreMe

-13

SELECT length('Pinot') AS value
FROM ignoreMe

5

SELECT LPAD('Hello, World', '20', '*') AS value
FROM ignoreMe

********Hello, World

select MAXMV(DivLongestGTimes) AS value
from airlineStats 
where arraylength(DivLongestGTimes) > 1

108

Hybrid Quick Start
hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

DivWheelsOffs
concatIds

1453,1731

1453,1731,1415,1623

1908,1758

1908,1758,1339,2310

1453,1731

1453,1731,1415,1623

1908,1758

1908,1758,1339,2310

hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

DivTailNums
concatIds

N7713A,N7713A

N7713A,N7713A,N7713A,N7713A

N344AA,N344AA

N344AA,N344AA,N344AA,N344AA

N344AA,N344AA

N344AA,N344AA,N344AA,N344AA

N7713A,N7713A

N7713A,N7713A,N7713A,N7713A

hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

DivTailNums
index

N7713A,N7713A

true

N344AA,N344AA

false

N7713A,N7713A

true

select DivTailNums, 
       arrayContainsString(DivTailNums, 'N7713A') AS index
from airlineStats 
WHERE arraylength(DivTailNums) >= 2
limit 5
hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

DivAirportIDs
unique

15016,11066

15016,11066

10620,14869

10620,14869

13891,12892

13891,12892

12264,10397

12264,10397

11066,12892

11066,12892

hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

DivTailNums
unique

N7713A,N7713A

N7713A

N344AA,N344AA

N344AA

N344AA,N344AA

N344AA

N7713A,N7713A

N7713A

hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

DivAirportIDs
index

13891,12892

-1

14683,14683

0

12339,12339

-1

13487,13930

-1

13029,11292

-1

hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

length
count(*)

1

5382

37

267

33

223

17

166

22

160

circle-info

The count(*) values will increase each time we execute the query as data is constantly being ingested by the Hybrid Quick Start.

hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

DivAirportIDs
value

SEA,PSC

PSC

SEA,PSC,PHX,MSY

PSC,PHX,MSY

SEA,PSC,PHX,MSY

PSC,PHX,MSY

SEA,PSC

PSC

SEA,PSC

PSC

hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

value

18.465753424657535

select AVGMV(DivLongestGTimes) AS value
from airlineStats 
where arraylength(DivLongestGTimes) > 1
hashtag
Usage Examples
value

65

SELECT CODEPOINT('Apache Pinot') AS value
FROM ignoreMe
dayOfMonth(tsInMillis)

dayOfMonth(tsInMillis, timeZoneId)

hashtag
Usage Examples

day

12

day

13

day

12

day

13

select day(1639351800000) AS day
FROM ignoreMe
dow(tsInMillis)

dow(tsInMillis, timeZoneId)

hashtag
Usage Examples

dayOfWeek

7

dayOfWeek

1

dayOfWeek

7

dayOfWeek

1

select dayOfWeek(1639351800000) AS dayOfWeek
FROM ignoreMe
hashtag
Usage Examples

These examples are based on the Batch Quick Start.

value

NL

UA

AL

NA

PL

AA

FL

value

NL

UA

AL

NA

PL

AA

FL

hashtag
Usage Examples

These examples are based on the Batch Quick Start.

value

7

value

149

select DISTINCTCOUNT(league) AS value
from baseballStats 
select DISTINCTCOUNT(teamID) AS value
from baseballStats 
hashtag
Usage Examples

These examples are based on the Batch Quick Start.

value

7

value

148

DISTINCTCOUNT
select DISTINCTCOUNTBITMAP(league) AS value
from baseballStats 
hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

value

34

value

2

select DISTINCTCOUNTBITMAPMV(DivLongestGTimes) AS value
from airlineStats 
where arraylength(DivLongestGTimes) > 1
hashtag
Usage Examples

These examples are based on the Batch Quick Start.

value

00000008000000ac00000800000084000210000000000020001020220030042002100420002010020210000300008020040180400001300310001863024004220870800004400421040104610220080000020000040000030000800002108420000110400800000106000060000080020000082000218c0002000000020000010200100000018c0006000400022004a0000088000200800000320820021000000221842000000000025088000220080100009420

value

000000010000000400000106

select DISTINCTCOUNTRAWHLL(teamID) AS value
from baseballStats 
hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

The following query returns the documents that have a DivTailNums with more than one value:

DivTailNums

N7713A,N7713A

N344AA,N344AA

N344AA,N344AA

N7713A,N7713A

You can count the distinct number of items in these rows by running the following query:

value

2

select DivTailNums
from airlineStats 
where arraylength(DivTailNums) > 1
hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

value

00000008000000ac00000000000000000000000500000020000000000030000202108000040000010000000300010400000000000000000000000463000000000000000000010001041000200000002000000000000000000a00000000028001000000010800000000010000001008000000804000000000020000040000880000000000000000000000000000000000000000000000800000000800020004000000840000000002000000000000000000001400

value

0000000100000004000000e4

select DISTINCTCOUNTRAWHLLMV(DivAirports) AS value
from airlineStats 
where arraylength(DivAirports) > 1
hashtag
Usage Examples

These examples are based on the Batch Quick Start.

homeRuns
numberOfGames
total

26

153

0.16993464052287582

select homeRuns, numberOfGames, DIV(homeRuns, numberOfGames) AS total
from baseballStats 
WHERE teamID = 'ML1' 
AND yearID = 1956 
AND playerName = 'Henry Louis'
HOURS
  • DAYS

  • hashtag
    Signature

    FromEpoch<TIME_UNIT>(timeIn<Time_UNIT>)

    hashtag
    Usage Examples

    epochMillis

    1613472303000

    epochMillis

    1613472300000

    epochMillis

    1613469600000

    epochMillis

    1613433600000

    Signatures

    isSubnetOf(ipPrefix, ipAddress) -> boolean

    hashtag
    Usage Examples

    Please see the following sample queries where isSubnetOf is used in different parts of the query.

    SELECT isSubnetOf('192.168.0.1/24', '192.168.0.225') 
    AS result
    FROM myTable;
    ---> returns true
    
    SELECT isSubnetOf('1.2.3.128/26', '1.2.5.1') 
    AS result
    FROM myTable;
    ---> returns false
    SELECT isSubnetOf('2001:4800:7825:103::/64', '2001:4800:7825:103::2050')
    AS result
    FROM myTable;
    ---> returns true
    
    SELECT isSubnetOf('7890:db8:113::8a2e:370:7334/127', '7890:db8:113::8a2e:370:7336') 
    AS result
    FROM myTable;
    ---> returns false
    hashtag
    Usage Examples
    name

    pinot

    select LOWER('Pinot') AS name
    FROM ignoreMe
    hashtag
    Usage Examples

    These examples are based on the Batch Quick Start.

    value

    73

    select max(homeRuns) AS value
    from baseballStats 
    hashtag
    Usage Examples

    These examples are based on the Real time Quick Start.

    event_id
    location
    hash

    282776561

    80406178a3d70a3d714041d5c28f5c28f6

    92a8b787e81672261aad8afcf9de3aee

    circle-info

    The row returned will be different if you run this example as the data is ingested in real-time.

    select event_id, location, MD5(location) AS hash
    from meetupRsvp 
    limit 1
    hashtag
    Usage Examples
    millisecond

    0

    millisecond

    0

    select millisecond(1639351800000) AS millisecond
    FROM ignoreMe
    select millisecond(1639351800000, 'America/St_Johns') AS millisecond
    FROM ignoreMe
    hashtag
    Usage Examples

    These examples are based on the Batch Quick Start.

    value

    1871

    select min(yearID) AS value
    from baseballStats 
    hashtag
    Usage Examples

    These examples are based on the Batch Quick Start.

    value

    142

    select MINMAXRANGE(yearID) AS value
    from baseballStats 
    hashtag
    Usage Examples
    now

    1639150454255

    This function is typically used in predicate to filter on timestamp for recent data. e.g. filter data on recent 1 day(86400 seconds)

    select now() AS now
    FROM ignoreMe
    SELECT * 
    FROM tableName
    WHERE tsInMillis > now() - 86400000
    hashtag
    Usage Examples

    These examples are based on the Hybrid Quick Start.

    value

    2

    select MINMV(DivLongestGTimes) AS value
    from airlineStats 
    where arraylength(DivLongestGTimes) > 1
    select DivWheelsOffs, 
           arrayConcatInt(DivWheelsOffs, DivWheelsOns) AS concatIds
    from airlineStats 
    WHERE arraylength(DivWheelsOffs) >= 2
    limit 5
    select DivTailNums, 
           arrayConcatString(DivTailNums, DivTailNums) AS concatIds
    from airlineStats 
    WHERE arraylength(DivTailNums) >= 2
    limit 5
    select DivAirportIDs, 
           arrayDistinctInt(DivAirportIDs) AS unique
    from airlineStats 
    WHERE arraylength(DivAirportIDs) >= 2
    limit 5
    select DivTailNums, 
           arrayDistinctString(DivTailNums) AS unique
    from airlineStats 
    WHERE arraylength(DivTailNums) >= 2
    limit 5
    select DivAirportIDs, 
           arrayIndexOfInt(DivAirportIDs, 14683) AS index
    from airlineStats 
    WHERE arraylength(DivAirportIDs) >= 2
    limit 5
    select ARRAYLENGTH(RandomAirports) AS length, count(*) 
    from airlineStats 
    GROUP BY length
    ORDER BY count(*) DESC
    LIMIT 5
    select RandomAirports, 
           arrayRemoveString(RandomAirports, 'SEA') AS value
    from airlineStats 
    WHERE arraylength(RandomAirports) BETWEEN 2 AND 4
    limit 5
    select day(1639351800000, 'CET') AS day
    FROM ignoreMe
    select dayOfMonth(1639351800000) AS day
    FROM ignoreMe
    select dayOfMonth(1639351800000, 'CET') AS day
    FROM ignoreMe
    select dayOfWeek(1639351800000, 'CET') AS dayOfWeek
    FROM ignoreMe
    select dow(1639351800000) AS dayOfWeek
    FROM ignoreMe
    select dow(1639351800000, 'CET') AS dayOfWeek
    FROM ignoreMe
    select DISTINCT league AS value
    from baseballStats 
    select DISTINCT(league) AS value
    from baseballStats 
    select DISTINCTCOUNTBITMAP(teamID) AS value
    from baseballStats 
    select DISTINCTCOUNTBITMAPMV(DivTailNums) AS value
    from airlineStats 
    where arraylength(DivTailNums) > 1
    select DISTINCTCOUNTRAWHLL(teamID, 1) AS value
    from baseballStats 
    select DISTINCTCOUNTMV(DivTailNums) AS value
    from airlineStats 
    where arraylength(DivTailNums) > 1
    select DISTINCTCOUNTRAWHLLMV(DivAirports, 1) AS value
    from airlineStats 
    where arraylength(DivAirports) > 1
    select FromEpochSeconds(1613472303) AS epochMillis
    FROM ignoreMe
    select FromEpochMinutes(26891205) AS epochMillis
    FROM ignoreMe
    select FromEpochHours(448186) AS epochMillis
    FROM ignoreMe
    select FromEpochDays(18674) AS epochMillis
    FROM ignoreMe
    SELECT count(*) 
    FROM myTable 
    WHERE isSubnetOf('192.168.0.1/24', ipAddressCol);
    
    SELECT count(*) 
    FROM myTable 
    WHERE isSubnetOf('192.168.0.1/24', ipAddressCol) 
    OR isSubnetOf(ipPrefixCol, '7890:db8:113::8a2e:370:7336');
    SELECT 
        CASE 
            WHEN isSubnetOf('105.25.245.115/27', srcIPAddress) THEN 'case1' 
            WHEN isSubnetOf('105.25.245.115/27', dstIPAddress) THEN 'case2'
            ELSE 'case3' 
        END AS differentFlow
    FROM myTable;
    hashtag
    Usage Examples

    These examples are based on the Hybrid Quick Start.

    FlightNum
    reversedAirports
    RandomAirports

    1206

    PSC,SEA

    SEA,PSC

    5300

    PSC,SEA

    SEA,PSC

    3359

    MSY,PHX,PSC,SEA

    SEA,PSC,PHX,MSY

    hashtag
    Usage Examples

    These examples are based on the Hybrid Quick Start.

    FlightNum
    airports
    RandomAirports

    671

    SEA,PSC

    SEA,PSC,PHX,MSY

    1767

    SEA,PSC

    SEA,PSC,PHX

    2522

    SEA,PSC

    SEA,PSC

    hashtag
    Usage Examples

    These examples are based on the Hybrid Quick Start.

    DivTailNums
    DivAirports
    unionIds

    N7713A,N7713A

    IND,IND

    N7713A,IND

    N344AA,N344AA

    MCI,BOS

    N344AA,MCI,BOS

    N7713A,N7713A

    IND,IND

    N7713A,IND

    arrayContainsInt

    This section contains reference documentation for the arrayContainsInt function.

    Checks if int value exists in array.

    hashtag
    Signature

    arrayContainsInt('colName', valueToFind)

    hashtag
    Usage Examples

    These examples are based on the .

    DivAirportIDs
    containsValue

    arrayIndexOfString

    This section contains reference documentation for the arrayIndexOfString function.

    Finds the last index of the given value in the array starting at the given index.

    hashtag
    Signature

    arrayIndexOfString('colName', valueToFind)

    hashtag
    Usage Examples

    These examples are based on the .

    DivTailNums
    index

    arrayRemoveInt

    This section contains reference documentation for the arrayRemoveInt function.

    Removes value from array of ints.

    hashtag
    Signature

    arrayRemoveInt('colName', value)

    hashtag
    Usage Examples

    These examples are based on the .

    DivAirportIDs
    value

    arrayReverseInt

    This section contains reference documentation for the arrayReverseInt function.

    Reverses array of ints.

    hashtag
    Signature

    arrayReverseInt('colName')

    hashtag
    Usage Examples

    These examples are based on the .

    DivAirportIDs
    reversedIds

    arraySortInt

    This section contains reference documentation for the arraySortInt function.

    Sorts array of ints.

    hashtag
    Signature

    arraySortInt('colName')

    hashtag
    Usage Examples

    These examples are based on the .

    DivAirportIDs
    sortedIds

    arraySortString

    This section contains reference documentation for the arraySortString function.

    Sorts array of strings.

    hashtag
    Signature

    arraySortString('colName')

    hashtag
    Usage Examples

    These examples are based on the .

    FlightNum
    sortedAirports
    RandomAirports

    CHR

    This section contains reference documentation for the CHR function.

    the character corresponding to the Unicode codepoint

    hashtag
    Signature

    CHR(codepoint)

    hashtag
    Usage Examples

    value

    concat

    This section contains reference documentation for the concat function.

    Concatenate two input strings using the seperator

    hashtag
    Signature

    CONCAT(col1, col2, seperator)

    hashtag
    Usage Examples

    value
    value

    COUNTMV

    This section contains reference documentation for the COUNTMV function.

    Get the count of rows in a group

    hashtag
    Signature

    COUNTMV(colName)

    hashtag
    Usage Examples

    These examples are based on the .

    The following query returns the documents that have a DivTailNums with more than one value:

    DivTailNums

    You can count the number of items in these rows by running the following query:

    value

    COVAR_POP

    This section contains reference documentation for the COVAR_POP function.

    Returns the population covariance between of 2 numerical columns.

    COVAR_POP(col1, col2) = E[col1 * col2] - E[col1]E[col2]

    hashtag
    Signatures

    COVAR_POP(col1, col2) -> double

    hashtag
    Usage Examples

    These examples are based on the .

    covariance

    dayOfYear

    This section contains reference documentation for the dayOfYear function.

    Returns the day of the year from the given epoch millis in UTC or specified timezone. The value ranges from 1 to 366.

    hashtag
    Signature

    dayOfYear(tsInMillis)

    dayOfYear(tsInMillis, timeZoneId)

    doy(tsInMillis)

    doy(tsInMillis, timeZoneId)

    hashtag
    Usage Examples

    dayOfYear
    dayOfYear
    dayOfYear
    dayOfYear

    DISTINCTCOUNTHLLMV

    This section contains reference documentation for the DISTINCTCOUNTHLLMV function.

    Returns an approximate distinct count using HyperLogLog in a group

    hashtag
    Signature

    DISTINCTCOUNTHLLMV(colName)

    hashtag
    Usage Examples

    These examples are based on the .

    value

    DISTINCTCOUNTHLL

    This section contains reference documentation for the DISTINCTCOUNTHLL function.

    Returns an approximate distinct count using HyperLogLog. It also takes an optional second argument to configure the log2m for the HyperLogLog. For accurate distinct counting, see DISTINCTCOUNT.

    hashtag
    Signature

    DISTINCTCOUNTHLL(colName, log2m)

    hashtag
    Usage Examples

    These examples are based on the .

    value
    value

    FromEpochBucket

    This section contains reference documentation for the fromEpochBucket functions.

    Convert epoch to epoch milliseconds. e.g. 10 seconds since epoch or 5 minutes since Epoch. The following time units are supported:

    • SECONDS

    • MINUTES

    • HOURS

    • DAYS

    hashtag
    Signature

    FromEpoch<TIME_UNIT>Bucket(timeInMillis, bucketSize)

    hashtag
    Usage Examples

    bucket
    bucket
    bucket
    bucket
    bucket

    Histogram

    This section contains reference documentation for the HISTOGRAM function.

    Returns the count of data points that fall within each bin as a vector. The bins are left-inclusive and right-exclusive, i.e. [a, b), except for the last one which is inclusive on both sides [a, b].

    hashtag
    Signatures

    1. Equal length bins (better performance):

    HISTOGRAM(colName, lower, upper, numBins)

    1. Arbitrary increasing bin edges:

    HISTOGRAM(colName, ARRAY[binEdge1, binEdge2, binEdge3, ...])

    hashtag
    Usage Examples

    These examples are based on the .

    1. 10 equal-length bins [0, 20), [20, 30) ... [180, 200]

    histogram
    1. 6 bins (- ∞, 1), [1, 10), [10, 50), [50,100), [100,500), [500, 1000]

    histogram

    hour

    This section contains reference documentation for the hour function.

    Returns the hour of the day from the given epoch millis in UTC or specified timezone. The value ranges from 0 to 23.

    hashtag
    Signature

    hour(tsInMillis)

    hour(tsInMillis, timeZoneId)

    hashtag
    Usage Examples

    hour
    hour

    JSONFORMAT

    This section contains reference documentation for the JSONFORMAT function.

    Extracts the object value from jsonField based on 'jsonPath', the result type is inferred based on JSON value. This function can only be used in an ingestion transformation function.

    hashtag
    Signature

    JSONFORMAT(object)

    hashtag
    Usage Examples

    The usage examples are based on extracting fields from the following JSON document:

    Expression
    Value

    This function can be used in the to extract the meta property into the data column, as described below:

    jsonextractkey

    This section contains reference documentation for the JSONEXTRACTKEY function.

    Extracts all matched JSON field keys based on 'jsonPath' into a STRING_ARRAY.

    hashtag
    Signature

    JSONEXTRACTKEY(jsonField, 'jsonPath')

    Arguments
    Description
    circle-exclamation

    'jsonPath'` is a literal. Pinot uses single quotes to distinguish them from identifiers.

    hashtag
    Usage Examples

    The examples in this section are based on the . In particular we'll be querying the row WHERE id = 7044874109.

    id
    repo
    keys

    ln

    This section contains reference documentation for the ln function.

    Natural log of value i.e. ln(col1)

    hashtag
    Signature

    LN(col1)

    hashtag
    Usage Examples

    value
    value

    ltrim

    This section contains reference documentation for the ltrim function.

    trim spaces from left side of the string

    hashtag
    Signature

    LTRIM(col)

    hashtag
    Usage Examples

    notTrimmed
    trimmed

    MOD

    This section contains reference documentation for the MOD function.

    Modulo of two values

    hashtag
    Signature

    MOD(col1, col2)

    hashtag
    Usage Examples

    value
    value

    MINMAXRANGEMV

    This section contains reference documentation for the MINMAXRANGEMV function.

    Returns the max - min value in a group

    hashtag
    Signature

    MINMAXRANGEMV(colName)

    hashtag
    Usage Examples

    These examples are based on the .

    value

    mult

    This section contains reference documentation for the MULT function.

    Product of at least two values

    hashtag
    Signature

    MULT(col1, col2, col3...)

    hashtag
    Usage Examples

    These examples are based on the .

    homeRuns
    baseOnBalls
    total

    percentileest

    This section contains reference documentation for the percentileest function.

    Returns the Nth percentile of the group using Quantile Digestarrow-up-right algorithm.

    hashtag
    Signature

    percentileest(colName, percentile)

    hashtag
    Usage Examples

    These examples are based on the .

    value
    value
    value

    percentile

    This section contains reference documentation for the percentile function.

    Returns the max - min value in a group

    hashtag
    Signature

    percentile(colName, percentile)

    hashtag
    Usage Examples

    These examples are based on the .

    value
    value
    value

    remove

    This section contains reference documentation for the remove function.

    Removes all instances of search from string

    hashtag
    Signature

    remove(input, search)

    hashtag
    Usage Examples

    value

    reverse

    This section contains reference documentation for the reverse function.

    Reverse the string

    hashtag
    Signature

    REVERSE(col)

    hashtag
    Usage Examples

    name

    arraySliceInt

    This section contains reference documentation for the arraySliceInt function.

    Returns the values in the array between the start and end positions.

    hashtag
    Signature

    arraySliceInt('colName', start, end)

    arrayUnionInt

    This section contains reference documentation for the arrayUnionInt function.

    Create a union of two arrays of ints.

    hashtag
    Signature

    arrayUnionInt('colName1', 'colName2')

    Base64

    This section contains reference documentation for base64 encode and decode functions.

    Encoding scheme follows

    • toBase64 returns Base64 encoded string of input binary data (bytes type).

    • fromBase64

    FromDateTime

    This section contains reference documentation for the FromDateTime function.

    Converts a formatted date-time string to milliseconds, based on the provided .

    hashtag
    Signature

    FromDateTime(dateTimeString, pattern)

    minute

    This section contains reference documentation for the minute function.

    Returns the minute of the hour from the given epoch millis in UTC or specified timezone. The value ranges from 0 to 59.

    hashtag
    Signature

    minute(tsInMillis)

    minute(tsInMillis, timeZoneId)

    month

    This section contains reference documentation for the month function.

    Returns the month of the year from the given epoch millis in UTC or specified timezone. The value ranges from 1 to 12.

    hashtag
    Signature

    month(tsInMillis)

    month(tsInMillis, timeZoneId)

    percentileestmv

    This section contains reference documentation for the PERCENTILEESTMV function.

    Returns the Nth percentile of the group using algorithm.

    hashtag
    Signature

    PERCENTILEESTMV(colName, N)

    percentilemv

    This section contains reference documentation for the PERCENTILEMV function.

    Returns the Nth percentile of the group where N is a decimal number between 0 and 100 inclusive

    hashtag
    Signature

    PERCENTILEMV(colName, N)

    percentiletdigest

    This section contains reference documentation for the PERCENTILETDigest function.

    Returns the Nth percentile of the group using .

    hashtag
    Signature

    PERCENTILETDigest(colName, percentile)

    percentiletdigestmv

    This section contains reference documentation for the PERCENTILETDIGESTMV function.

    Returns the Nth percentile of the group using .

    hashtag
    Signature

    PERCENTILETDIGESTMV(colName, N)

    regexpExtract

    This section contains reference documentation for the regexpExtract function.

    Extracts values that match the provided regular expression

    hashtag
    Signature

    regexpExtract(value, regexp)

    regexpExtract(value, regexp, group)

    select FlightNum, 
           arrayReverseString(RandomAirports) AS reversedAirports, 
           RandomAirports
    from airlineStats 
    WHERE arraylength(RandomAirports) BETWEEN 2 AND 4
    limit 5
    select FlightNum, 
           arraySliceString(RandomAirports, 0, 2) AS airports, 
           RandomAirports
    from airlineStats 
    WHERE arraylength(RandomAirports) BETWEEN 2 AND 4
    limit 5
    select DivTailNums, 
           DivAirports,
           arrayUnionString(DivTailNums, DivAirports) AS unionIds
    from airlineStats 
    WHERE arraylength(DivTailNums) >= 2
    limit 5

    1023

    PHX,PSC,SEA

    SEA,PSC,PHX

    963

    MSY,PHX,PSC,SEA

    SEA,PSC,PHX,MSY

    424

    SEA,PSC

    SEA,PSC,PHX,MSY

    3162

    SEA,PSC

    SEA,PSC,PHX,MSY

    N344AA,N344AA

    MCI,BOS

    N344AA,MCI,BOS

    13891,12892

    false

    14683,14683

    true

    12339,12339

    false

    13487,13930

    false

    13029,11292

    false

    Hybrid Quick Start
    select DivTailNums, 
           arrayIndexOfString(DivTailNums, 'N7713A') AS index
    from airlineStats 
    WHERE arraylength(DivTailNums) >= 2
    limit 5

    N7713A,N7713A

    0

    N344AA,N344AA

    -1

    N7713A,N7713A

    0

    Hybrid Quick Start

    13891,12892

    13891

    13198,12892

    13198

    11066,12892

    11066

    13198,12892

    13198

    13891,12892

    13891

    Hybrid Quick Start

    13891,12892

    12892,13891

    14683,14683

    14683,14683

    12339,12339

    12339,12339

    13487,13930

    13930,13487

    13029,11292

    11292,13029

    Hybrid Quick Start

    13891,12892

    12892,13891

    14683,14683

    14683,14683

    12339,12339

    12339,12339

    13198,10721

    10721,13198

    10721,12478

    10721,12478

    Hybrid Quick Start

    PSC,SEA

    SEA,PSC

    5300

    PSC,SEA

    SEA,PSC

    3846

    PSC,SEA

    SEA,PSC

    3635

    MSY,PHX,PSC,SEA

    SEA,PSC,PHX,MSY

    429

    MSY,PHX,PSC,SEA

    SEA,PSC,PHX,MSY

    Hybrid Quick Start

    1206

    SELECT CHR(65) AS value
    FROM ignoreMe

    A

    SELECT concat('Apache', 'Pinot', ' ') AS value
    FROM ignoreMe

    Apache Pinot

    SELECT concat('real-time', 'analytics', '__') AS value
    FROM ignoreMe

    real-time__analytics

    select DivTailNums
    from airlineStats 
    where arraylength(DivTailNums) > 1

    N7713A,N7713A

    N344AA,N344AA

    N344AA,N344AA

    N7713A,N7713A

    8

    Hybrid Quick Start
    SELECT COVAR_POP(numberOfGames, hits) AS covariance 
    FROM baseballStats

    2314.249154477403

    Batch Quick Start
    select dayOfYear(1639351800000) AS dayOfYear
    FROM ignoreMe

    346

    347

    346

    347

    select DISTINCTCOUNTHLLMV(DivLongestGTimes) AS value
    from airlineStats 
    where arraylength(DivLongestGTimes) > 1

    34

    Hybrid Quick Start
    select DISTINCTCOUNTHLL(teamID) AS value
    from baseballStats 

    158

    149

    Batch Quick Start

    1613472303000

    3226944606000

    1613472000000

    1613466000000

    1613088000000

    32348,21519,11359,7587,5488,5360,6282,7361,585,0

    13520,16506,18375,12403,28591,8494

    Batch Quick Start
    select hour(1639351800000) AS hour
    FROM ignoreMe

    23

    select hour(1639351800000, 'CET') AS hour
    FROM ignoreMe

    0

    {"timestamp": "2019-10-09 21:25:25", "meta": {"age": 12}}

    JSONFORMAT(meta)

    "{\"age\":12}"

    table config

    jsonField

    An Identifier/Expression contains JSON documents.

    'jsonPath'

    Follows JsonPath Syntaxarrow-up-right to read values from JSON documents.

    7044874109

    {"id":115911530,"name":"LimeVista/Tapes","url":"https://api.github.com/repos/LimeVista/Tapes"}

    $['id'],$['name'],$['url']

    Batch JSON Quick Start
    select ln(1) AS value
    from ignoreMe

    0

    select ln(12) AS value
    from ignoreMe

    2.4849066497880004

    SELECT ' Pinot with spaces  ' AS notTrimmed,
           ltrim(' Pinot with spaces ') AS trimmed
    FROM ignoreMe

    " Pinot with spaces "

    "Pinot with spaces "

    select MOD(12, 5) AS value
    from ignoreMe

    2

    select MOD(12, 2) AS value
    from ignoreMe

    0

    select MINMAXRANGEMV(DivLongestGTimes) AS value
    from airlineStats 
    where arraylength(DivLongestGTimes) > 1

    106

    Hybrid Quick Start
    select homeRuns, baseOnBalls, MULT(homeRuns, baseOnBalls) AS total
    from baseballStats 
    WHERE teamID = 'ML1' 
    AND yearID = 1956 
    AND playerName = 'Henry Louis'

    26

    37

    962

    Batch Quick Start
    select percentileest(homeRuns, 50) AS value
    from baseballStats 

    0

    4

    46

    Batch Quick Start
    select percentile(homeRuns, 50) AS value
    from baseballStats 

    0

    4

    46

    Batch Quick Start
    select remove('foo bar foo sheep', 'foo') AS value
    from ignoreMe

    bar sheep

    select REVERSE('Pinot') AS name
    FROM ignoreMe

    toniP

    returns binary data (represented as a Hex string) from Base64-encoded string.

    hashtag
    Signature

    toBase64(bytesCol)

    fromBase64(stringCol)

    hashtag
    Usage Examples

    circle-info

    For better readability, the following examples converts string hello! into BYTES using toUtf8arrow-up-right function and converts the decoded BYTES into string using fromUtf8arrow-up-right.

    encoded

    aGVsbG8h

    decoded

    hello!

    circle-info

    Note that without UTF8 string conversion, returned BYTES will be represented as a Hex string following Pinot's BYTES column representationarrow-up-right. See the example below.

    decoded

    68656c6c6f21

    circle-exclamation

    Note that the following query will throw compilation error as string is not a valid input type for toBase64.

    java.util.Base64.Encoderarrow-up-right
    hashtag
    Usage Examples
    epochMillis

    1565136000000

    epochMillis

    1565190733000

    epochMillis

    1565190733000

    epochMillis

    1565190733000

    Joda-Time patternarrow-up-right
    hashtag
    Usage Examples
    minute

    30

    minute

    0

    select minute(1639351800000) AS minute
    FROM ignoreMe
    select minute(1639351800000, 'America/St_Johns') AS minute
    FROM ignoreMe
    hashtag
    Usage Examples
    month

    9

    month

    10

    select month(1633046399000, 'UTC') AS month
    FROM ignoreMe
    select month(1633046399000, 'CET') AS month
    FROM ignoreMe
    hashtag
    Usage Examples

    These examples are based on the Hybrid Quick Start.

    value

    10

    value

    44

    value

    108

    Quantile Digestarrow-up-right
    select PERCENTILEESTMV(DivLongestGTimes, 50) AS value
    from airlineStats 
    where arraylength(DivLongestGTimes) > 1
    hashtag
    Usage Examples

    These examples are based on the Hybrid Quick Start.

    value

    10

    value

    44

    value

    108

    select PERCENTILEMV(DivLongestGTimes, 50) AS value
    from airlineStats 
    where arraylength(DivLongestGTimes) > 1
    hashtag
    Usage Examples

    These examples are based on the Batch Quick Start.

    value

    0

    value

    3.6571905392487856

    value

    46.26787306220119

    T-digest algorithmarrow-up-right
    select PERCENTILETDigest(homeRuns, 50) AS value
    from baseballStats 
    hashtag
    Usage Examples

    These examples are based on the Hybrid Quick Start.

    value

    10

    value

    44

    value

    108

    T-digest algorithmarrow-up-right
    select PERCENTILETDIGESTMV(DivLongestGTimes, 50) AS value
    from airlineStats 
    where arraylength(DivLongestGTimes) > 1
    regexpExtract(value, regexp, group, defaultValue)

    hashtag
    Usage Examples

    value

    foo

    value

    123

    value

    foo

    value

    foo bar

    value

    bar

    value

    defaultValue

    select DivAirportIDs, 
           arrayContainsInt(DivAirportIDs, 14683) AS containsValue
    from airlineStats 
    WHERE arraylength(DivAirportIDs) >= 2
    limit 5
    select DivAirportIDs, 
           arrayRemoveInt(DivAirportIDs, 12892) AS value
    from airlineStats 
    WHERE arraylength(DivAirportIDs) >= 2
    AND arrayContainsInt(DivAirportIDs, 12892) = 1
    limit 5
    select DivAirportIDs, 
           arrayReverseInt(DivAirportIDs) AS reversedIds
    from airlineStats 
    WHERE arraylength(DivAirportIDs) >= 2
    limit 5
    select DivAirportIDs, 
           arraySortInt(DivAirportIDs) AS sortedIds
    from airlineStats 
    WHERE arraylength(DivAirportIDs) >= 2
    limit 5
    select FlightNum, 
           arraySortString(RandomAirports) AS sortedAirports, 
           RandomAirports
    from airlineStats 
    WHERE arraylength(RandomAirports) BETWEEN 2 AND 4
    limit 5
    select COUNTMV(DivTailNums) AS value
    from airlineStats 
    where arraylength(DivTailNums) > 1
    select dayOfYear(1639351800000, 'CET') AS dayOfYear
    FROM ignoreMe
    select doy(1639351800000) AS dayOfYear
    FROM ignoreMe
    select doy(1639351800000, 'CET') AS dayOfYear
    FROM ignoreMe
    select DISTINCTCOUNTHLL(teamID, 12) AS value
    from baseballStats 
    select FromEpochSecondsBucket(1613472303, 1) AS bucket
    FROM ignoreMe
    select FromEpochSecondsBucket(1613472303, 2) AS bucket
    FROM ignoreMe
    select FromEpochMinutesBucket(2689120, 10) AS bucket
    FROM ignoreMe
    select FromEpochHoursBucket(89637, 5) AS bucket
    FROM ignoreMe
    select FromEpochDaysBucket(1867, 10) AS bucket
    FROM ignoreMe
    SELECT HISTOGRAM(numberOfGames, 0, 200, 10) AS histogram
    FROM baseballStats 
    select HISTOGRAM(AtBatting, Array['-Infinity', 1, 10, 50, 100, 500, 1000]) AS histogram
    from baseballStats
    {
       "tableConfig":{
          "ingestionConfig":{
             "transformConfigs":[
                {
                   "columnName":"data",
                   "transformFunction":"JSONFORMAT(meta)"
                }
             ]
          }
       }
    }
    select id, repo, JSONEXTRACTKEY(repo, '$.*') AS keys
    from githubEvents 
    WHERE id = 7044874109
    select percentileest(homeRuns, 80) AS value
    from baseballStats 
    select percentileest(homeRuns, 99.9) AS value
    from baseballStats 
    select percentile(homeRuns, 80) AS value
    from baseballStats 
    select percentile(homeRuns, 99.9) AS value
    from baseballStats 
    SELECT toBase64(toUtf8('hello!')) AS encoded
    FROM ignoreMe
    SELECT fromUtf8(fromBase64('aGVsbG8h')) AS decoded
    FROM ignoreMe
    SELECT fromBase64('aGVsbG8h') AS decoded
    FROM ignoreMe
    SELECT toBase64('hello!') AS encoded
    FROM ignoreMe
    SELECT FromDateTime('2019-08-07', 'yyyy-MM-dd') AS epochMillis
    FROM ignoreMe
    SELECT FromDateTime(
        '2019-08-07 3:12:13 PM', 
        'yyyy-MM-dd hh:mm:ss a'
        ) AS epochMillis
    FROM ignoreMe
    SELECT FromDateTime(
        '2019-08-07T15:12:13', 
        'yyyy-MM-dd''T''HH:mm:ss'
        ) AS epochMillis
    FROM ignoreMe
    SELECT FromDateTime(
        '2019-08-07T07:12:13-0800', 
        'yyyy-MM-dd''T''HH:mm:ssZ'
        ) AS epochMillis
    FROM ignoreMe
    select PERCENTILEESTMV(DivLongestGTimes, 90) AS value
    from airlineStats 
    where arraylength(DivLongestGTimes) > 1
    select PERCENTILEESTMV(DivLongestGTimes, 99.9) AS value
    from airlineStats 
    where arraylength(DivLongestGTimes) > 1
    select PERCENTILEMV(DivLongestGTimes, 90) AS value
    from airlineStats 
    where arraylength(DivLongestGTimes) > 1
    select PERCENTILEMV(DivLongestGTimes, 99.9) AS value
    from airlineStats 
    where arraylength(DivLongestGTimes) > 1
    select PERCENTILETDigest(homeRuns, 80) AS value
    from baseballStats 
    select PERCENTILETDigest(homeRuns, 99.9) AS value
    from baseballStats 
    select PERCENTILETDIGESTMV(DivLongestGTimes, 90) AS value
    from airlineStats 
    where arraylength(DivLongestGTimes) > 1
    select PERCENTILETDIGESTMV(DivLongestGTimes, 99.9) AS value
    from airlineStats 
    where arraylength(DivLongestGTimes) > 1
    select regexpExtract('foo', '.*') AS value
    from ignoreMe
    select regexpExtract('foo123', '[0-9]+') AS value
    from ignoreMe
    select regexpExtract('foo123', '[^0-9]+') AS value
    from ignoreMe
    select regexpExtract('foo bar baz', '(\w+) (\w+)', 0) AS value
    from ignoreMe
    select regexpExtract('foo bar baz', '(\w+) (\w+)', 2) AS value
    from ignoreMe
    select regexpExtract('foo123', 'bar', 0, 'defaultValue') AS value
    from ignoreMe
    hashtag
    Usage Examples

    These examples are based on the Hybrid Quick Start.

    FlightNum
    airports
    DivAirportIDs

    1531

    13891

    13891,12892

    19

    14683

    14683,14683

    829

    12339

    12339,12339

    hashtag
    Usage Examples

    These examples are based on the Hybrid Quick Start.

    DivWheelsOffs
    DivWheelsOns
    unionIds

    1453,1731

    1415,1623

    1453,1731,1415,1623

    1908,1758

    1339,2310

    1908,1758,1339,2310

    1453,1731

    1415,1623

    1453,1731,1415,1623

    DISTINCTCOUNTTHETASKETCH

    This section contains reference documentation for the DISTINCTCOUNTTHETASKETCH function.

    The Theta Sketcharrow-up-right framework enables set operations over a stream of data, and can also be used for cardinality estimation. Pinot leverages the Sketch Classarrow-up-right and its extensions from the library org.apache.datasketches:datasketches-java:1.2.0-incubating to perform distinct counting as well as evaluating set operations.

    hashtag
    Signature

    DistinctCountThetaSketch(<thetaSketchColumn>, <thetaSketchParams>, predicate1, predicate2..., postAggregationExpressionToEvaluate) -> Long

    • thetaSketchColumn (required): Name of the column to aggregate on.

    • thetaSketchParams (required): Parameters for constructing the intermediate theta-sketches.

    hashtag
    Usage Examples

    These examples are based on the .

    value
    value

    We can also provide predicates and a post aggregation expression to compute more complicated cardinalities. For example, we could can find the intersection of the following queries:

    yearID
    yearID

    (the yearId 1986 is the only one in common)

    By running the following query:

    value

    DATETRUNC

    This section contains reference documentation for the DATETRUNC function.

    (Presto) SQL compatible date truncation, equivalent to the Presto function date_trunc.

    Converts the value into a specified output granularity seconds since UTC epoch that is bucketed on a unit in a specified timezone.

    hashtag
    Signature

    DATETRUNC(unit, timeValue)

    DATETRUNC(unit, timeValue, inputTimeUnitStr)

    DATETRUNC(unit, timeValue, inputTimeUnitStr, timeZone)

    DATETRUNC(unit, timeValue, inputTimeUnitStr, timeZone, outputTimeUnitStr)

    unit supports the following values:

    • millisecond

    • second

    • minute

    inputTimeUnitStr and outputTimeUnitStr support the following values:

    • NANOSECONDS

    • MICROSECONDS

    • MILLISECONDS

    hashtag
    Usage Examples

    Truncates an epoch in milliseconds at WEEK (where a Week starts at Monday UTC midnight):

    or

    ts

    Truncates an epoch in milliseconds at WEEK (where a Week starts at Monday UTC midnight) in the UTC time zone, returning a result in epoch in seconds in UTC timezone:

    ts

    Truncates an epoch in milliseconds at WEEK (where a Week starts at Monday UTC midnight) in the CET time zone, returning a result in epoch in seconds in UTC timezone:

    ts

    Truncates an epoch in milliseconds at QUARTER in the Los Angeles time zone (where a Quarter begins on Jan 1st, April 1st, July 1st, October 1st in Los Angeles timezone), returning a result in hours since UTC epoch:

    ts

    JSONPATHARRAY

    This section contains reference documentation for the JSONPATHARRAY function.

    Extracts an array from jsonField based on 'jsonPath', the result type is inferred based on JSON value. This function can only be used in an ingestion transformation function.

    hashtag
    Signature

    JSONPATHARRAY(jsonField, 'jsonPath')

    Arguments
    Description
    circle-exclamation

    'jsonPath'` is a literal. Pinot uses single quotes to distinguish them from identifiers. You can use the to test JSON expressions before you import any data.

    hashtag
    Usage Examples

    The usage examples are based on extracting fields from the following JSON document:

    Expression
    Value

    This function can be used in the to extract the name, score, and second value of homework_grades into their respective columns , as described below:

    JSONPATHDOUBLE

    This section contains reference documentation for the JSONPATHDOUBLE function.

    Extracts the Double value from jsonField based on 'jsonPath', use optional defaultValuefor null or parsing error. This function can only be used in an ingestion transformation function.

    hashtag
    Signature

    JSONPATHDOUBLE(jsonField, 'jsonPath', [defaultValue])

    Arguments
    Description
    circle-exclamation

    'jsonPath'` is a literal. Pinot uses single quotes to distinguish them from identifiers. You can use the to test JSON expressions before you import any data.

    hashtag
    Usage Examples

    The usage examples are based on extracting fields from the following JSON document:

    Expression
    Value

    This function can be used in the to extract the age property into the age column, as described below:

    JSONPATHLONG

    This section contains reference documentation for the JSONPATHLONG function.

    Extracts the Long value from jsonField based on 'jsonPath', use optional defaultValuefor null or parsing error. This function can only be used in an ingestion transformation function.

    hashtag
    Signature

    JSONPATHLONG(jsonField, 'jsonPath', [defaultValue])

    Arguments
    Description
    circle-exclamation

    'jsonPath'` is a literal. Pinot uses single quotes to distinguish them from identifiers. You can use the to test JSON expressions before you import any data.

    hashtag
    Usage Examples

    The usage examples are based on extracting fields from the following JSON document:

    Expression
    Value

    This function can be used in the to extract the age property into the age column, as described below:

    mode

    This section contains reference documentation for the mode function.

    Get the most frequent value in a group. When multiple modes are present it gives the minimum of all the modes. This behavior can be overridden to get the maximum or the average mode.

    hashtag
    Signature

    MODE(colName, [reducerType])

    hashtag
    Usage Examples

    These examples are based on the .

    value
    value
    value
    value

    quarter

    This section contains reference documentation for the quarter function.

    Returns the quarter of the year from the given epoch millis in UTC or specified timezone. The value ranges from 1 to 4

    hashtag
    Signature

    quarter(tsInMillis)

    quarter(tsInMillis, timeZoneId)

    hashtag
    Usage Examples

    quarter
    quarter
    quarter

    replace

    This section contains reference documentation for the REPLACE function.

    replace all instances of find with replace in input

    hashtag
    Signature

    REPLACE(col, find, replace)

    hashtag
    Usage Examples

    value
    value

    DISTINCTCOUNTRAWTHETASKETCH

    This section contains reference documentation for the DISTINCTCOUNTRAWTHETASKETCH function.

    The framework enables set operations over a stream of data, and can also be used for cardinality estimation. Pinot leverages the and its extensions from the library org.apache.datasketches:datasketches-java:1.2.0-incubating to perform distinct counting as well as evaluating set operations.

    hashtag
    Signature

    JSONPATHARRAYDEFAULTEMPTY

    This section contains reference documentation for the JSONPATHARRAYDEFAULTEMPTY function.

    Extracts an array from jsonField based on 'jsonPath', the result type is inferred based on JSON value. Returns empty array for null or parsing error. This function can only be used in an .

    hashtag
    Signature

    JSONPATHSTRING

    This section contains reference documentation for the JSONPATHSTRING function.

    Extracts the String value from jsonField based on 'jsonPath', use optional defaultValuefor null or parsing error. This function can only be used in an .

    hashtag
    Signature

    jsonextractscalar

    This section contains reference documentation for the JSONEXTRACTSCALAR function.

    Evaluates the 'jsonPath' on jsonField, returns the result as the type 'resultsType', use optional defaultValuefor null or parsing error.

    hashtag
    Signature

    JSONEXTRACTSCALAR(jsonField, 'jsonPath', 'resultsType', [defaultValue])

    select FlightNum, 
           arraySliceInt(DivAirportIDs, 0, 1) AS airports, 
    	     DivAirportIDs
    from airlineStats 
    WHERE arraylength(DivAirportIDs) >= 2
    limit 5
    select DivWheelsOffs, 
           DivWheelsOns,
           arrayUnionInt(DivWheelsOffs, DivWheelsOns) AS unionIds
    from airlineStats 
    WHERE arraylength(DivWheelsOffs) >= 2
    limit 5

    24

    13198

    13198,10721

    548

    10721

    10721,12478

    1908,1758

    1339,2310

    1908,1758,1339,2310

    Currently, the only supported parameter is nominalEntries (defaults to 4096).
  • predicates (optional)_: _ These are individual predicates of form lhs <op> rhs which are applied on rows selected by the where clause. During intermediate sketch aggregation, sketches from the thetaSketchColumn that satisfies these predicates are unionized individually. For example, all filtered rows that match country=USA are unionized into a single sketch. Complex predicates that are created by combining (AND/OR) of individual predicates is supported.

  • postAggregationExpressionToEvaluate (required): The set operation to perform on the individual intermediate sketches for each of the predicates. Currently supported operations are SET_DIFF, SET_UNION, SET_INTERSECT , where DIFF requires two arguments and the UNION/INTERSECT allow more than two arguments.

  • 149

    146

    1986

    1985

    1937

    2003

    1979

    1900

    1986

    1978

    2012

    1

    Batch Quick Start
    hour
  • day

  • week

  • month

  • quarter

  • year

  • SECONDS
  • MINUTES

  • HOURS

  • DAYS

  • 1639353600000

    1639353600

    1639350000

    453631

    jsonField

    An Identifier/Expression contains JSON documents.

    'jsonPath'

    Follows JsonPath Syntaxarrow-up-right to read values from JSON documents.

    JSONPATHARRAY(myJsonRecord, '$.subjects[*].name')

    ["maths", "english"]

    JSONPATHARRAY(myJsonRecord, '$.subjects[*].score')

    [90, 70]

    JSONPATHARRAY(myJsonRecord, '$.subjects[*].homework_grades[1]')

    [85, 65]

    Jayway JsonPath Evaluator Toolarrow-up-right
    table config

    jsonField

    An Identifier/Expression contains JSON documents.

    'jsonPath'

    Follows JsonPath Syntaxarrow-up-right to read values from JSON documents.

    JSONPATHDOUBLE(data, '$.age')

    24.0

    Jayway JsonPath Evaluator Toolarrow-up-right
    table config

    jsonField

    An Identifier/Expression contains JSON documents.

    'jsonPath'

    Follows JsonPath Syntaxarrow-up-right to read values from JSON documents.

    JSONPATHLONG(data, '$.age')

    24

    Jayway JsonPath Evaluator Toolarrow-up-right
    table config
    select mode(yearID) AS value
    from baseballStats 
    WHERE AtBatting != 0 AND yearID > 2001

    2008

    2010

    2008

    2012

    Batch Quick Start
    select quarter(1633046399000) AS quarter
    FROM ignoreMe

    3

    3

    4

    SELECT REPLACE('Hello, World', 'Hello', 'Goodbye') AS value
    FROM ignoreMe

    Goodbye, World

    SELECT REPLACE('Hello, World', 'Hellow', 'Goodbye') AS value
    FROM ignoreMe

    Hello, World

    DISTINCTCOUNTRAWTHETASKETCH(
    <thetaSketchColumn>, <thetaSketchParams>, predicate1, predicate2..., postAggregationExpressionToEvaluate
    ) -> HexEncoded
    • thetaSketchColumn (required): Name of the column to aggregate on.

    • thetaSketchParams (required): Parameters for constructing the intermediate theta-sketches.

      • Currently, the only supported parameter is nominalEntries (defaults to 4096).

    • predicates (optional)_: _ These are individual predicates of form lhs <op> rhs which are applied on rows selected by the where clause. During intermediate sketch aggregation, sketches from the thetaSketchColumn that satisfies these predicates are unionized individually. For example, all filtered rows that match country=USA are unionized into a single sketch. Complex predicates that are created by combining (AND/OR) of individual predicates is supported.

    • postAggregationExpressionToEvaluate (required): The set operation to perform on the individual intermediate sketches for each of the predicates. Currently supported operations are SET_DIFF, SET_UNION, SET_INTERSECT , where DIFF requires two arguments and the UNION/INTERSECT allow more than two arguments.

    hashtag
    Usage Examples

    These examples are based on the Batch Quick Start.

    value

    AgMDAAAKzJOVAAAAAACAPwDAATjfLK5fBJQy2rIU1GYLOK5a09G+XQ1UHWt00/NwFTC4EwzexhE3CHBSU+YIUzkM0goIADEeFViAmzCRcx5FeHrMHfGsU/qrFvMP+Q87UYRC7LFzZ0FV3PIfAF1FMFsM+E9XRwZRYoR79VdK7z1jAD/WClziDmb4Cosm3ctidcRl9VxfNTR47OUFqFP4dYQkZwXIEZtEhngdkGfqkQCKZPX85HITAZrwVDpI4TY6paDTZwLQNiemHFCUlEZCKcOMpkXuYypOxjzXi1ES+07IIH7EqrQeKcssHvOUh2gpzIDajYdQ4UTS6IBoXPB6AtbomPBiMalFURDzh+xppzrg5HcUTMW4Iuzgv5Mz/xIm73yOe7seghzwmH+zXUfda/mkaBqU6XQEAQFagTkndhYHHcjLb0OeQg4BGDAHtRIDD8EqsonkilQT6TZq2uM3CRXJQTlaYewzFvHsKivVomgcQRojVnPKBh0d0GgYeF4eIEXtD1bZTw43eVR1Dk6sBj3pjleOW21dRsUCRmyEDGdIfWQVJXouaUnZqaC9gi1oSrG7GT8HO2xXeb32OzfiHVx5s9+5bGpFXoXTu1n7g2Jone8JMyGuam2x7Bt55a1JdtFCFxhZ2Gd7IajHY4lNBH2lDfUoJed4f7kGUEXmlU6BCfwOkJ1CIoWBTQY+NToDhpmmmPY+rVOH5coybBHlH4vpfPBbbQsOjl0YBSC9uEmZ3WubqnV0KZ1p5d7wq/F0p7Wgo8y4JVXAobKCB+hsVckBNIA4XrYMzdWVSWeQsXHSuR+mWmJPftadyrMlfvoy2mVr8R4Dih7k3XNhXZwjBeuNJQA5Dtci6w0uIUczvEL+nY+9CSHEPQhuT//aluJ2De4Fk94cfWgaxqhYyh10TTIWZFmsDxJeOMaPT1BCwVRF6taOjftNbVDC5Fy1BtVzVIIUOGeBcj5VbhHtqowIB1qGEDIJy9ZBXD73iFBN5kVgvicaFGSKHGQqeIVsgOFdcFKITQTuV2d0pkljkPXKUIc68M0KPpU6iZYuaBA4+hGR9nri0tVnbJZOM1Z/fi01ou5YLYCoHTqkImozpJMYXLCqKtTBm2o7sc5oQATXUBC9dqM8xQoGL8OmltUWc1cX35rtD2D2zHL2IncEKMzsN/c6S31W74VTBtcbJfP9rHENp7yO453qYhA7m++jl2MKFzdvtkHqGDUcs9FKisV9Hx+ruhaGsLkdISszkZ3sYykjx3NH6BbbaCZf9jTswuxHKheTbaEDmSgrx7BfK+Z2My4jdMqCrEtKMSuJqEJ22AM5U8MNFVkCPTobkCEdJx0ZQJu+Tk73t1v3nqLUQH4PbFJzcUrr9yZFZ0u+1mzNNQ5o0w+v1dSRLGsXsPyRqGkQchuz/DKyrjJzf9Vb8HY4Ni63XiaXwgJrjq9rgAp6EmWV2xXUOI9CWZa7HsuRWO95m58nIq9K8VCkO+T/rWwrPqZ/tCgEtkshqecNhszQiki0d5Kf26o/YcATx4ZkJ655y4PTVr+kY0Xbb/UwEo2pPd3Hyd4hVz1I5N9TpYaJk2Lok1+7N+3LG+3Lj3KZtd5/+j8RujEmogI=

    value

    AwMDAAAKzJMQAAAAAACAP4vpfPBbbQsO5N1zYV2cIwWFgU0GPjU6A4Z4HZBn6pEAyQE0gDhetgyKZPX85HITAQ4BGDAHtRIDEDub76OXYwoxK4moQnbYA9LogGhc8HoCE+k2atrjNwlVbhHtqowIBzd5VHUOTqwG+aRoGpTpdAT6PxG6MSaiAnshqMdjiU0EHEEaI1ZzygY=

    We can also provide predicates and a post aggregation expression to compute more complicated cardinalities:

    value

    AQMDAAA6zJN8QPYIsvHMNQ==

    Theta Sketcharrow-up-right
    Sketch Classarrow-up-right
    select distinctCountThetaSketch(teamID) AS value
    from baseballStats 
    select distinctCountThetaSketch(teamID, 'nominalEntries=10') AS value
    from baseballStats
    select yearID
    from baseballStats
    where teamID = 'SFN' AND numberOfGames = 28 AND homeRuns = 1
    select yearID
    from baseballStats
    where teamID = 'CHN' AND numberOfGames = 28 AND homeRuns = 1
    select distinctCountThetaSketch(
      yearID, 
      'nominalEntries=4096', 
      'teamID = ''SFN'' AND numberOfGames=28 AND homeRuns=1',
      'teamID = ''CHN'' AND numberOfGames=28 AND homeRuns=1',
      'SET_INTERSECT($1, $2)'
    ) AS value
    from baseballStats 
    select dateTrunc('week', 1639480981746) AS ts
    FROM ignoreMe
    select dateTrunc('week', 1639480981746, 'MILLISECONDS') AS ts
    FROM ignoreMe
    select dateTrunc(
      'week', 
      1639480981746, 
      'MILLISECONDS', 
      'UTC', 
      'SECONDS'
    ) AS ts
    FROM ignoreMe
    select dateTrunc(
      'week', 
      1639480981746, 
      'MILLISECONDS', 
      'CET', 
      'SECONDS'
    ) AS ts
    FROM ignoreMe
    select dateTrunc(
      'quarter', 
      1639480981746, 
      'MILLISECONDS', 
      'America/Los_Angeles', 
      'HOURS'
    ) AS ts
    FROM ignoreMe
    {
      "data": {
        "name": "Pete",
        "age": 24,
        "subjects": [
          {
            "name": "maths",
            "homework_grades": [80, 85, 90, 95, 100],
            "grade": "A",
            "score": 90
          },
          {
            "name": "english",
            "homework_grades": [60, 65, 70, 85, 90],
            "grade": "B",
            "score": 70
          }
        ]
      }
    }
    {
       "tableConfig":{
          "ingestionConfig":{
             "transformConfigs":[
                {
                   "columnName":"names",
                   "transformFunction":"JSONPATHARRAY(data, '$.subjects[*].name')"
                },
                {
                   "columnName":"ages",
                   "transformFunction":"JSONPATHARRAY(data, '$.subjects[*].score')"
                },
                {
                   "columnName":"homeworkGrades",
                   "transformFunction":"JSONPATHARRAY(data, '$.subjects[*].homework_grades[1]')"
                }
             ]
          }
       }
    }
    {
      "data": {
        "name": "Pete",
        "age": 24,
        "subjects": [
          {
            "name": "maths",
            "homework_grades": [80, 85, 90, 95, 100],
            "grade": "A",
            "score": 90
          },
          {
            "name": "english",
            "homework_grades": [60, 65, 70, 85, 90],
            "grade": "B",
            "score": 70
          }
        ]
      }
    }
    {
       "tableConfig":{
          "ingestionConfig":{
             "transformConfigs":[
                {
                   "columnName":"age",
                   "transformFunction":"JSONPATHDOUBLE(data, '$.age')"
                }
             ]
          }
       }
    }
    {
      "data": {
        "name": "Pete",
        "age": 24,
        "subjects": [
          {
            "name": "maths",
            "homework_grades": [80, 85, 90, 95, 100],
            "grade": "A",
            "score": 90
          },
          {
            "name": "english",
            "homework_grades": [60, 65, 70, 85, 90],
            "grade": "B",
            "score": 70
          }
        ]
      }
    }
    {
       "tableConfig":{
          "ingestionConfig":{
             "transformConfigs":[
                {
                   "columnName":"age",
                   "transformFunction":"JSONPATHLONG(data, '$.age')"
                }
             ]
          }
       }
    }
    select mode(yearID, 'AVG') AS value
    from baseballStats 
    WHERE AtBatting != 0 AND yearID > 2001
    select mode(yearID, 'MIN') AS value
    from baseballStats 
    WHERE AtBatting != 0 AND yearID > 2001
    select mode(yearID, 'MAX') AS value
    from baseballStats 
    WHERE AtBatting != 0 AND yearID > 2001
    select quarter(1633046399000, 'UTC') AS quarter
    FROM ignoreMe
    select quarter(1633046399000, 'CET') AS quarter
    FROM ignoreMe
    select distinctCountRawThetaSketch(teamID) AS value
    from baseballStats 
    select distinctCountRawThetaSketch(teamID, 'nominalEntries=10') AS value
    from baseballStats
    select distinctCountRawThetaSketch(
      yearID, 
      'nominalEntries=4096', 
      'teamID = ''SFN'' AND numberOfGames=28 AND homeRuns=1',
      'teamID = ''CHN'' AND numberOfGames=28 AND homeRuns=1',
      'SET_INTERSECT($1, $2)'
    ) AS value
    from baseballStats 
    JSONPATHARRAYDEFAULTEMPTY(jsonField, 'jsonPath')
    Arguments
    Description

    jsonField

    An Identifier/Expression contains JSON documents.

    'jsonPath'

    Follows to read values from JSON documents.

    circle-exclamation

    'jsonPath'` is a literal. Pinot uses single quotes to distinguish them from identifiers. You can use the Jayway JsonPath Evaluator Toolarrow-up-right to test JSON expressions before you import any data.

    hashtag
    Usage Examples

    The usage examples are based on extracting fields from the following JSON document:

    Expression
    Value

    JSONPATHARRAYDEFAULTEMPTY(myJsonRecord, '$.subjects[*].name')

    ["maths", "english"]

    JSONPATHARRAYDEFAULTEMPTY(myJsonRecord, '$.subjects[*].score')

    [90, 70]

    JSONPATHARRAYDEFAULTEMPTY(myJsonRecord, '$.subjects[*].homework_grades[1]')

    [85, 65]

    JSONPATHARRAYDEFAULTEMPTY(myJsonRecord, '$.subjects[*].homework_grades[7]')

    []

    This function can be used in the table config to extract the name, score, and second value of homework_grades into their respective columns , as described below:

    ingestion transformation function
    JSONPATHSTRING(jsonField, 'jsonPath', [defaultValue])
    Arguments
    Description

    jsonField

    An Identifier/Expression contains JSON documents.

    'jsonPath'

    Follows to read values from JSON documents.

    circle-exclamation

    'jsonPath'` is a literal. Pinot uses single quotes to distinguish them from identifiers. You can use the Jayway JsonPath Evaluator Toolarrow-up-right to test JSON expressions before you import any data.

    hashtag
    Usage Examples

    The usage examples are based on extracting fields from the following JSON document:

    Expression
    Value

    JSONPATHSTRING(data, '$.age')

    "24"

    JSONPATHSTRING(data, '$.name["nick.name"]')

    "Pete"

    This function can be used in the table config to extract the age property into the age column, as described below:

    ingestion transformation function
    Arguments
    Description

    jsonField

    An Identifier/Expression contains JSON documents.

    'jsonPath'

    Follows to read values from JSON documents.

    'results_type'

    One of the Pinot supported data types:INT, LONG, FLOAT, DOUBLE, BOOLEAN, TIMESTAMP, STRING,

    INT_ARRAY, LONG_ARRAY, FLOAT_ARRAY, DOUBLE_ARRAY, STRING_ARRAY.

    circle-exclamation

    'jsonPath'and`` ``'results_type'are literals. Pinot uses single quotes to distinguish them from identifiers.

    hashtag
    Usage Examples

    The examples in this section are based on the Batch JSON Quick Start. In particular we'll be querying the row WHERE id = 7044874109:

    repo

    {"id":115911530,"name":"LimeVista/Tapes","url":"https://api.github.com/repos/LimeVista/Tapes"}

    The following examples show how to use the JSONEXTRACTSCALAR function:

    id
    name

    7044874109

    LimeVista/Tapes

    id
    name

    7044874109

    dummyValue

    JSONPATH

    This section contains reference documentation for the JSONPATH function.

    Extracts the object value from jsonField based on 'jsonPath', the result type is inferred based on JSON value. This function can only be used in an ingestion transformation function.

    hashtag
    Signature

    JSONPATH(jsonField, 'jsonPath')

    Arguments
    Description
    circle-exclamation

    'jsonPath'` is a literal. Pinot uses single quotes to distinguish them from identifiers. You can use the to test JSON expressions before you import any data.

    hashtag
    Usage Examples

    The usage examples are based on extracting fields from the following JSON document:

    Expression
    Value

    This function can be used in the to extract the name property into the name column and age property into the age column, as described below:

    {
      "data": {
        "name": "Pete",
        "age": 24,
        "subjects": [
          {
            "name": "maths",
            "homework_grades": [80, 85, 90, 95, 100],
            "grade": "A",
            "score": 90
          },
          {
            "name": "english",
            "homework_grades": [60, 65, 70, 85, 90],
            "grade": "B",
            "score": 70
          }
        ]
      }
    }
    {
       "tableConfig":{
          "ingestionConfig":{
             "transformConfigs":[
                {
                   "columnName":"names",
                   "transformFunction":"JSONPATHARRAYDEFAULTEMPTY(data, '$.subjects[*].name')"
                },
                {
                   "columnName":"ages",
                   "transformFunction":"JSONPATHARRAYDEFAULTEMPTY(data, '$.subjects[*].score')"
                },
                {
                   "columnName":"homeworkGrades",
                   "transformFunction":"JSONPATHARRAYDEFAULTEMPTY(data, '$.subjects[*].homework_grades[1]')"
                }
             ]
          }
       }
    }
    {
      "data": {
        "name": {"full.name": "Peter", "nick.name": "Pete"},
        "age": 24,
        "subjects": [
          {
            "name": "maths",
            "homework_grades": [80, 85, 90, 95, 100],
            "grade": "A",
            "score": 90
          },
          {
            "name": "english",
            "homework_grades": [60, 65, 70, 85, 90],
            "grade": "B",
            "score": 70
          }
        ]
      }
    }
    {
       "tableConfig":{
          "ingestionConfig":{
             "transformConfigs":[
                {
                   "columnName":"age",
                   "transformFunction":"JSONPATHSTRING(data, '$.age')"
                },
                {
                   "columnName":"nickName",
                   "transformFunction":"JSONPATHSTRING(data, '$.name[\"nick.name\"]')"
                }
             ]
          }
       }
    }
    select repo
    from githubEvents 
    WHERE id = 7044874109
    select id, jsonextractscalar(repo, '$.name', 'STRING') AS name
    from githubEvents 
    WHERE id = 7044874109
    select id, jsonextractscalar(repo, '$.foo', 'STRING') AS name
    from githubEvents 
    WHERE id = 7044874109
    [
      {
        "message": "QueryExecutionError:\njava.lang.RuntimeException: Illegal Json Path: [$.foo], when reading [{\"id\":115911530,\"name\":\"LimeVista/Tapes\",\"url\":\"https://api.github.com/repos/LimeVista/Tapes\"}]\n\tat org.apache.pinot.core.operator.transform.function.JsonExtractScalarTransformFunction.transformToStringValuesSV(JsonExtractScalarTransformFunction.java:254)\n\tat org.apache.pinot.core.operator.docvalsets.TransformBlockValSet.getStringValuesSV(TransformBlockValSet.java:90)\n\tat org.apache.pinot.core.common.RowBasedBlockValueFetcher.createFetcher(RowBasedBlockValueFetcher.java:64)\n\tat org.apache.pinot.core.common.RowBasedBlockValueFetcher.<init>(RowBasedBlockValueFetcher.java:32)",
        "errorCode": 200
      }
    ]
    select id, jsonextractscalar(repo, '$.foo', 'STRING', 'dummyValue') AS name
    from githubEvents 
    WHERE id = 7044874109
    JsonPath Syntaxarrow-up-right
    JsonPath Syntaxarrow-up-right
    JsonPath Syntaxarrow-up-right

    jsonField

    An Identifier/Expression contains JSON documents.

    'jsonPath'

    Follows JsonPath Syntaxarrow-up-right to read values from JSON documents.

    JSONPATH(data, '$.name')

    "Pete"

    JSONPATH(data, '$.age')

    24

    Jayway JsonPath Evaluator Toolarrow-up-right
    table config
    {
      "data": {
        "name": "Pete",
        "age": 24,
        "subjects": [
          {
            "name": "maths",
            "homework_grades": [80, 85, 90, 95, 100],
            "grade": "A",
            "score": 90
          },
          {
            "name": "english",
            "homework_grades": [60, 65, 70, 85, 90],
            "grade": "B",
            "score": 70
          }
        ]
      }
    }
    {
       "tableConfig":{
          "ingestionConfig":{
             "transformConfigs":[
                {
                   "columnName":"name",
                   "transformFunction":"JSONPATHSTRING(data, '$.name')"
                },
                {
                   "columnName":"age",
                   "transformFunction":"JSONPATHSTRING(data, '$.age')"
                }
             ]
          }
       }
    }

    DATETIMECONVERT

    This section contains reference documentation for the DATETIMECONVERT function.

    Converts the value from a column that contains an epoch timestamp into another time unit and buckets based on the given time granularity.

    hashtag
    Signature

    DATETIMECONVERT(columnName, inputFormat, outputFormat, outputGranularity)

    inputFormat and outputFormat are defined using the following structure:

    <time size>:<time unit>:<time format>:<pattern>

    where:

    • time size - size of the time unit eg: 1, 10

    • time unit - DAYS, HOURS, MINUTES, SECONDS, MILLISECONDS

    granularity is specified in the format <time size>:<time unit>.

    hashtag
    Usage Examples

    These examples are based on the .

    created_at_timestamp from milliseconds since epoch to days since epoch, bucketed to 1 day granularity:

    id
    created_at_timestamp
    timeInMs
    convertedTime

    created_at_timestamp bucketed to 15 minutes granularity:

    id
    created_at_timestamp
    timeInMs
    convertedTime

    created_at_timestamp to format yyyy-MM-dd, bucketed to 1 days granularity:

    id
    created_at_timestamp
    timeInMs
    convertedTime

    created_at_timestamp to format yyyy-MM-dd HH:mm, in timezone Pacific/Kiritimati:

    id
    created_at_timestamp
    timeInMs
    convertedTime

    created_at_timestamp to format yyyy-MM-dd, in timezone Pacific/Kiritimati and bucketed to 1 day granularity:

    id
    created_at_timestamp
    timeInMs
    convertedTime

    regexpReplace

    This section contains reference documentation for the regexpReplace function

    Find and replace a string or regexp pattern with a target string or regexp pattern. If matchStr is not found, inputStr will be returned. By default, all occurrences of match pattern in the input string will be replaced. Default matching mode is case sensitive.

    hashtag
    Signature

    regexpReplace(inputStr, matchRegexp, replaceRegexp)

    ,
    MICROSECONDS
    ,
    NANOSECONDS
  • time format

    • EPOCH

    • SIMPLE_DATE_FORMAT pattern - defined in case of SIMPLE_DATE_FORMAT e.g. yyyy-MM-dd. A specific timezone can be passed using tz(timezone). Timezone can be long or short string format timezone. e.g. Asia/Kolkata or PDT

  • 7044874109

    2018-01-01 11:00:00.0

    1514804402000

    17532

    7044874109

    2018-01-01 11:00:00.0

    1514804402000

    1514804400000

    7044874109

    2018-01-01 11:00:00.0

    1514804402000

    2018-01-01

    7044874109

    2018-01-01 11:00:00.0

    1514804402000

    2018-01-02 01:00

    7044874109

    2018-01-01 11:00:00.0

    1514804402000

    2018-01-02 00:00

    Batch JSON Quick Start
    regexpReplace(inputStr, matchRegexp, replaceRegexp, matchStartPos)

    regexpReplace(inputStr, matchRegexp, replaceRegexp, matchStartPos, occurrence)

    regexpReplace(inputStr, matchRegexp, replaceRegexp, matchStartPos, occurrence, flag)

    hashtag
    inputStr

    The input string or the column name on which regexpReplace function should be applied.

    hashtag
    matchRegexp

    The regular expression or string used to match against the input string or column value.

    hashtag
    replaceRegexp

    The regular expression or string to replace if a match is found.

    hashtag
    matchStartPos

    Index of inputStr from where matching should start. Counting starts and 0. Default value is 0 if not specified.

    hashtag
    occurrence

    Controls which occurence of the matched pattern must be replaced. Counting starts at 0. Default value is -1 if not specified

    hashtag
    flag

    Single character flag that controls how the regex finds matches in inputStr. If an incorrect flag is specified, the function applies default case sensitive match. Only one flag can be specified. Supported flags are:

    • i -> case insensitive match

    hashtag
    Usage Examples

    hashtag
    Example 1

    In the example below, shows a simple string find and replace example where all occurrences of the matched string o is replaced with string x.

    value

    fxx

    hashtag
    Example 2

    The example below shows how a regexp pattern containing consecutive digits is found and replaced with a simple string bar.

    value

    foobar

    hashtag
    Example 3

    The example below shows how a regexp pattern containing consecutive non-digits is found and replaced with a simple string bar.

    value

    bar123

    hashtag
    Example 4

    The following example demonstrates how replaceStr can contain backreferences to substrings captured by the matchStr regular expression. Backreferences are indicated by $n where n can range from 0-9. In the example below, every character in the input is replaced by the character appended with a space.

    value

    f o o

    hashtag
    Example 5

    This example shows how regexpReplace can be used to remove extra whitespaces between words in an input string.

    value

    Pinot is blazing fast

    hashtag
    Example 6

    This example shows the power of backreferencing can be used in regexpReplace to format phone numbers.

    value

    1-(123) 456-7898

    hashtag
    Example 7

    This example shows how the matchStartPos parameter can be used. Since the matchStartPos is set to 4, pattern matching against the inputStr begins at index 4 there by leading to the string healthy not being replaced.

    value

    healthy, something, something and wise

    hashtag
    Example 8

    This example shows how the occurence parameter can be used. In the example below, the matchStr regular expression matches against three instances in the input - healthy, wealthy and stealthy. As the occurence is specified to 2, the second occurence (counting from zero) stealthy is replaced with something

    value

    healthy, wealthy, something and wise

    hashtag
    Example 9

    The example below shows the usage of the flag parameter. Here the case insensitive flag i is specified.

    value

    something, wealthy, stealthy and wise

    hashtag
    Example 10

    The examples below show some sample queries using regexpReplace in there WHERE clause of a query.

    select id, 
           created_at_timestamp, 
           cast(created_at_timestamp AS long) AS timeInMs,
           DATETIMECONVERT(
             created_at_timestamp, 
             '1:MILLISECONDS:EPOCH', 
             '1:DAYS:EPOCH', 
             '1:DAYS'
           ) AS convertedTime
    from githubEvents
    WHERE id = 7044874134
    select id, 
           created_at_timestamp, 
           cast(created_at_timestamp AS long) AS timeInMs,
           DATETIMECONVERT(
             created_at_timestamp, 
             '1:MILLISECONDS:EPOCH', 
             '1:MILLISECONDS:EPOCH', 
             '15:MINUTES'
           ) AS convertedTime
    from githubEvents
    WHERE id = 7044874134
    select id, 
           created_at_timestamp, 
           cast(created_at_timestamp AS long) AS timeInMs,
           DATETIMECONVERT(
             created_at_timestamp, 
             '1:MILLISECONDS:EPOCH', 
             '1:DAYS:SIMPLE_DATE_FORMAT:yyyy-MM-dd', 
             '1:DAYS'
           ) AS convertedTime
    from githubEvents
    WHERE id = 7044874134
    select id, 
           created_at_timestamp, 
           cast(created_at_timestamp AS long) AS timeInMs,
           DATETIMECONVERT(
             created_at_timestamp, 
             '1:MILLISECONDS:EPOCH', 
             '1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd HH:mm tz(Pacific/Kiritimati)', 
             '1:MILLISECONDS'
           ) AS convertedTime
    from githubEvents
    WHERE id = 7044874134
    select id, 
           created_at_timestamp, 
           cast(created_at_timestamp AS long) AS timeInMs,
           DATETIMECONVERT(
             created_at_timestamp, 
             '1:MILLISECONDS:EPOCH', 
             '1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd HH:mm tz(Pacific/Kiritimati)', 
             '1:DAYS'
           ) AS convertedTime
    from githubEvents
    WHERE id = 7044874134
    select regexpReplace('foo', 'o', 'x') AS value
    from myTable
    select regexpReplace('foo123', '[0-9]+', 'bar') AS value
    from myTable
    select regexpReplace('foo123', '[^0-9]+', 'bar') AS value
    from myTable
    select regexpReplace('foo', '(.)', '$1 ') AS value
    from myTable
    select regexpReplace('Pinot is  blazing  fast', '( ){2,}', ' ') AS value
    from myTable
    select regexpReplace('11234567898','(\\d)(\\d{3})(\\d{3})(\\d{4})', '$1-($2) $3-$4') AS value
    from myTable
    select regexpReplace('healthy, wealthy, stealthy and wise','\\w+thy', 'something', 4)  AS value
    from myTable
    select regexpReplace('healthy, wealthy, stealthy and wise','\\w+thy', 'something', 0, 2)  AS value
    from myTable
    select regexpReplace('healthy, wealthy, stealthy and wise','\\w+THY', 'something', 0, 0, 'i')  AS value
    from myTable
    SELECT col1, col2
    FROM myTable
    WHERE regexpReplace(stateCode, '[VC]A', 'TEST') = 'TEST'
    SELECT count(*)
    FROM myTable
    WHERE contains(regexpReplace(stateCode, '(C)(A)', '$1TEST$2'), 'CTESTA')