arrow-left

All pages
gitbookPowered by GitBook
triangle-exclamation
Couldn't generate the PDF for 134 pages, generation stopped at 100.
Extend with 50 more pages.
1 of 100

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

ABS

This section contains reference documentation for the abs function.

Absolute of a value

hashtag
Signature

ABS(col1)

hashtag
Usage Examples

value
value

ADD

This section contains reference documentation for the ADD function.

Sum of at least two values

hashtag
Signature

ADD(col1, col2, col3...)

hashtag
Usage Examples

These examples are based on the .

homeRuns
baseOnBalls
total

arrayConcatInt

This section contains reference documentation for the arrayConcatInt function.

Concatenates two arrays of ints.

hashtag
Signature

arrayConcatInt('colName1', 'colName2')

hashtag
Usage Examples

These examples are based on the .

DivWheelsOffs
concatIds

codepoint

This section contains reference documentation for the CODEPOINT function.

the Unicode codepoint of the first character of the string

hashtag
Signature

CODEPOINT(col)

select ABS(-12.1) AS value
from ignoreMe

12.1

select ABS(12.1) AS value
from ignoreMe

12.1

select homeRuns, baseOnBalls, ADD(homeRuns, baseOnBalls) AS total
from baseballStats 
WHERE teamID = 'ML1' 
AND yearID = 1956 
AND playerName = 'Henry Louis'

26

37

63

Batch Quick Start

1453,1731

1453,1731,1415,1623

1908,1758

1908,1758,1339,2310

1453,1731

1453,1731,1415,1623

1908,1758

1908,1758,1339,2310

Hybrid Quick Start
hashtag
Usage Examples
value

65

SELECT CODEPOINT('Apache Pinot') AS value
FROM ignoreMe
select DivWheelsOffs, 
       arrayConcatInt(DivWheelsOffs, DivWheelsOns) AS concatIds
from airlineStats 
WHERE arraylength(DivWheelsOffs) >= 2
limit 5

arrayContainsInt

This section contains reference documentation for the arrayContainsInt function.

Checks if int value exists in array.

hashtag
Signature

arrayContainsInt('colName', valueToFind)

hashtag
Usage Examples

These examples are based on the .

DivAirportIDs
containsValue

arrayContainsString

This section contains reference documentation for the arrayContainsString function.

Checks if string value exists in array.

hashtag
Signature

arrayContainsString('colName', valueToFind)

hashtag
Usage Examples

These examples are based on the .

DivTailNums
index

arrayDistinctInt

This section contains reference documentation for the arrayDistinctInt function.

Returns unique values in an array of ints.

hashtag
Signature

arrayDistinctInt('colName')

hashtag
Usage Examples

These examples are based on the .

DivAirportIDs
unique

arrayIndexOfString

This section contains reference documentation for the arrayIndexOfString function.

Finds the last index of the given value in the array starting at the given index.

hashtag
Signature

arrayIndexOfString('colName', valueToFind)

hashtag
Usage Examples

These examples are based on the .

DivTailNums
index

arrayIndexOfInt

This section contains reference documentation for the arrayIndexOfInt function.

Finds the last index of the given value in the array starting at the given index.

hashtag
Signature

arrayIndexOfInt('colName', valueToFind)

hashtag
Usage Examples

These examples are based on the .

DivAirportIDs
index

arrayReverseInt

This section contains reference documentation for the arrayReverseInt function.

Reverses array of ints.

hashtag
Signature

arrayReverseInt('colName')

hashtag
Usage Examples

These examples are based on the .

DivAirportIDs
reversedIds

arraySliceInt

This section contains reference documentation for the arraySliceInt function.

Returns the values in the array between the start and end positions.

hashtag
Signature

arraySliceInt('colName', start, end)

hashtag
Usage Examples

These examples are based on the .

FlightNum
airports
DivAirportIDs

arrayUnionInt

This section contains reference documentation for the arrayUnionInt function.

Create a union of two arrays of ints.

hashtag
Signature

arrayUnionInt('colName1', 'colName2')

hashtag
Usage Examples

These examples are based on the .

DivWheelsOffs
DivWheelsOns
unionIds

AVGMV

This section contains reference documentation for the AVGMV function.

Get the avg of values in a group

hashtag
Signature

AVGMV(colName)

hashtag
Usage Examples

These examples are based on the .

value

ceil

This section contains reference documentation for the CEIL function.

Rounded up to the nearest integer.

hashtag
Signature

CEIL(col1)

hashtag
Usage Examples

value
value

CHR

This section contains reference documentation for the CHR function.

the character corresponding to the Unicode codepoint

hashtag
Signature

CHR(codepoint)

hashtag
Usage Examples

value

COUNTMV

This section contains reference documentation for the COUNTMV function.

Get the count of rows in a group

hashtag
Signature

COUNTMV(colName)

hashtag
Usage Examples

These examples are based on the .

The following query returns the documents that have a DivTailNums with more than one value:

DivTailNums

You can count the number of items in these rows by running the following query:

value

day

This section contains reference documentation for the day function.

Returns the day of the month from the given epoch millis in UTC or specified timezone. The value ranges from 1 to 31.

hashtag
Signature

day(tsInMillis)

day(tsInMillis, timeZoneId)

dayOfMonth(tsInMillis)

dayOfMonth(tsInMillis, timeZoneId)

hashtag
Usage Examples

day
day
day
day

DISTINCT

This section contains reference documentation for the DISTINCT function.

Returns the distinct row values in a group

hashtag
Signature

DISTINCT(colName)

hashtag
Usage Examples

These examples are based on the .

value
value

DISTINCTCOUNTBITMAPMV

This section contains reference documentation for the DISTINCTCOUNTHLLMV function.

Returns an approximate distinct count using HyperLogLog in a group

hashtag
Signature

DISTINCTCOUNTHLLMV(colName)

hashtag
Usage Examples

These examples are based on the .

value

DISTINCTCOUNTHLL

This section contains reference documentation for the DISTINCTCOUNTHLL function.

Returns an approximate distinct count using HyperLogLog. It also takes an optional second argument to configure the log2m for the HyperLogLog. For accurate distinct counting, see DISTINCTCOUNT.

hashtag
Signature

DISTINCTCOUNTHLL(colName, log2m)

hashtag
Usage Examples

These examples are based on the .

value
value

DISTINCTCOUNTHLLMV

This section contains reference documentation for the DISTINCTCOUNTBITMAPMV function.

Returns the count of distinct row values in a group. This function is accurate for INT or dictionary encoded column, but approximate for other cases where hash codes are used in distinct counting and there may be hash collision.

hashtag
Signature

DISTINCTCOUNTBITMAPMV(colName)

hashtag
Usage Examples

These examples are based on the .

value
value

DISTINCTCOUNTMV

This section contains reference documentation for the DISTINCTCOUNTMV function.

Returns the count of distinct row values in a group

hashtag
Signature

DISTINCTCOUNTMV(colName)

hashtag
Usage Examples

These examples are based on the .

The following query returns the documents that have a DivTailNums with more than one value:

DivTailNums

You can count the distinct number of items in these rows by running the following query:

value

DIV

This section contains reference documentation for the DIV function.

Quotient of two values

hashtag
Signature

DIV(col1, col2)

hashtag
Usage Examples

These examples are based on the .

homeRuns
numberOfGames
total

FLOOR

This section contains reference documentation for the FLOOR function.

Rounded down to the nearest integer.

hashtag
Signature

FLOOR(col1)

hashtag
Usage Examples

value
value

FromEpochBucket

This section contains reference documentation for the fromEpochBucket functions.

Convert epoch to epoch milliseconds. e.g. 10 seconds since epoch or 5 minutes since Epoch. The following time units are supported:

  • SECONDS

  • MINUTES

  • HOURS

  • DAYS

hashtag
Signature

FromEpoch<TIME_UNIT>Bucket(timeInMillis, bucketSize)

hashtag
Usage Examples

bucket
bucket
bucket
bucket
bucket

hour

This section contains reference documentation for the hour function.

Returns the hour of the day from the given epoch millis in UTC or specified timezone. The value ranges from 0 to 23.

hashtag
Signature

hour(tsInMillis)

hour(tsInMillis, timeZoneId)

hashtag
Usage Examples

hour
hour

jsonextractkey

This section contains reference documentation for the JSONEXTRACTKEY function.

Extracts all matched JSON field keys based on 'jsonPath' into a STRING_ARRAY.

hashtag
Signature

JSONEXTRACTKEY(jsonField, 'jsonPath')

Arguments
Description
circle-exclamation

'jsonPath'` is a literal. Pinot uses single quotes to distinguish them from identifiers.

hashtag
Usage Examples

The examples in this section are based on the . In particular we'll be querying the row WHERE id = 7044874109.

id
repo
keys

minmaxrange

This section contains reference documentation for the minmaxrange function.

Returns the max - min value in a group

hashtag
Signature

MINMAXRANGE(colName)

hashtag
Usage Examples

These examples are based on the .

value

count

This section contains reference documentation for the count function.

Get the count of rows in a group

hashtag
Signature

COUNT(colName)

hashtag
Usage Examples

These examples are based on the .

value

lpad

This section contains reference documentation for the LPAD function.

string padded from the left side with pad to reach final size

hashtag
Signature

LPAD(col, size, pad)

hashtag
Usage Examples

value

MINMV

This section contains reference documentation for the MINMV function.

Get the minimum value in a group

hashtag
Signature

MINMV(colName)

hashtag
Usage Examples

These examples are based on the .

value

MOD

This section contains reference documentation for the MOD function.

Modulo of two values

hashtag
Signature

MOD(col1, col2)

hashtag
Usage Examples

value
value

ln

This section contains reference documentation for the ln function.

Natural log of value i.e. ln(col1)

hashtag
Signature

LN(col1)

hashtag
Usage Examples

value
value

remove

This section contains reference documentation for the remove function.

Removes all instances of search from string

hashtag
Signature

remove(input, search)

hashtag
Usage Examples

value

ltrim

This section contains reference documentation for the ltrim function.

trim spaces from left side of the string

hashtag
Signature

LTRIM(col)

hashtag
Usage Examples

notTrimmed
trimmed

reverse

This section contains reference documentation for the reverse function.

Reverse the string

hashtag
Signature

REVERSE(col)

hashtag
Usage Examples

name

exp

This section contains reference documentation for the exp function.

Euler’s number(e) raised to the power of col.

hashtag
Signature

EXP(col1)

hashtag
Usage Examples

value
value

rtrim

This section contains reference documentation for the rtrim function.

rtrim spaces from right side of the string

hashtag
Signature

RTRIM(col)

hashtag
Usage Examples

notTrimmed
trimmed

rpad

This section contains reference documentation for the RPAD function.

string padded from the right side with pad to reach final size

hashtag
Signature

RPAD(col, size, pad)

hashtag
Usage Examples

value

replace

This section contains reference documentation for the REPLACE function.

replace all instances of find with replace in input

hashtag
Signature

REPLACE(col, find, replace)

hashtag
Usage Examples

value
value

MINMAXRANGEMV

This section contains reference documentation for the MINMAXRANGEMV function.

Returns the max - min value in a group

hashtag
Signature

MINMAXRANGEMV(colName)

hashtag
Usage Examples

These examples are based on the .

value

arrayDistinctString

This section contains reference documentation for the arrayDistinctString function.

Returns unique values in an array of strings.

hashtag
Signature

arrayDistinctString('colName')

arrayRemoveString

This section contains reference documentation for the arrayRemoveString function.

Removes value from array of strings.

hashtag
Signature

arrayRemoveString('colName', value)

arrayReverseString

This section contains reference documentation for the arrayReverseString function.

Reverses array of strings.

hashtag
Signature

arrayReverseString('colName')

arraySortInt

This section contains reference documentation for the arraySortInt function.

Sorts array of ints.

hashtag
Signature

arraySortInt('colName')

arraySortString

This section contains reference documentation for the arraySortString function.

Sorts array of strings.

hashtag
Signature

arraySortString('colName')

arrayUnionString

This section contains reference documentation for the arrayUnionString function.

Create a union of two arrays of strings.

hashtag
Signature

arrayUnionString('colName1', 'colName2')

dayOfYear

This section contains reference documentation for the dayOfYear function.

Returns the day of the year from the given epoch millis in UTC or specified timezone. The value ranges from 1 to 366.

hashtag
Signature

dayOfYear(tsInMillis)

dayOfYear(tsInMillis, timeZoneId)

dayOfWeek

This section contains reference documentation for the dayOfWeek function.

Returns the day of the week from the given epoch millis in UTC timezone. The value ranges from 1(Monday) to 7(Sunday).

hashtag
Signature

dayOfWeek(tsInMillis)

dayOfWeek(tsInMillis, timeZoneId)

DISTINCTCOUNT

This section contains reference documentation for the DISTINCTCOUNT function.

Returns the count of distinct row values in a group

hashtag
Signature

DISTINCTCOUNT(colName)

DISTINCTCOUNTBITMAP

This section contains reference documentation for the DISTINCTCOUNTBITMAP function.

Returns the count of distinct row values in a group. This function is accurate for INT column, but approximate for other cases where hash codes are used in distinct counting and there may be hash collisions. For accurate distinct counting on all column types, see .

hashtag
Signature

DISTINCTCOUNTBITMAP(colName)

DISTINCTCOUNTRAWHLL

This section contains reference documentation for the DISTINCTCOUNTRAWHLL function.

Returns HLL response serialized as string. The serialized HLL can be converted back into an HLL and then aggregated with other HLLs. A common use case may be to merge HLL responses from different Pinot tables, or to allow aggregation after client-side batching.

hashtag
Signature

DISTINCTCOUNTRAWHLL(colName, log2m)

lower

This section contains reference documentation for the lower function.

Converts string to lower case.

hashtag
Signature

LOWER(col)

MD5

This section contains reference documentation for the MD5 function.

Return MD5 digest of binary column(bytes type) as hex string

hashtag
Signature

MD5(bytesCol)

minute

This section contains reference documentation for the minute function.

Returns the minute of the hour from the given epoch millis in UTC or specified timezone. The value ranges from 0 to 59.

hashtag
Signature

minute(tsInMillis)

minute(tsInMillis, timeZoneId)

MAXMV

This section contains reference documentation for the MAXMV function.

Get the maximum value in a group

hashtag
Signature

MAXMV(colName)

DISTINCTCOUNTRAWHLLMV

This section contains reference documentation for the DISTINCTCOUNTRAWHLLMV function.

Returns HLL response serialized as string. The serialized HLL can be converted back into an HLL and then aggregated with other HLLs. A common use case may be to merge HLL responses from different Pinot tables, or to allow aggregation after client-side batching.

hashtag
Signature

DISTINCTCOUNTRAWHLLMV(colName, log2m)

month

This section contains reference documentation for the month function.

Returns the month of the year from the given epoch millis in UTC or specified timezone. The value ranges from 1 to 12.

hashtag
Signature

month(tsInMillis)

month(tsInMillis, timeZoneId)

now

This section contains reference documentation for the now function.

Return current time as epoch millis.

hashtag
Signature

now()

JSONFORMAT

This section contains reference documentation for the JSONFORMAT function.

Extracts the object value from jsonField based on 'jsonPath', the result type is inferred based on JSON value. This function can only be used in an .

hashtag
Signature

JSONFORMAT(object)

length

This section contains reference documentation for the length function.

calculate length of the string

hashtag
Signature

LENGTH(col)

min

This section contains reference documentation for the min function.

Get the minimum value in a group

hashtag
Signature

MIN(colName)

sha

This section contains reference documentation for the SHA function.

Return SHA-1 digest of binary column(bytes type) as hex string

hashtag
Signature

SHA(bytesCol)

FromDateTime

This section contains reference documentation for the FromDateTime function.

Converts a formatted date-time string to milliseconds, based on the provided .

hashtag
Signature

FromDateTime(dateTimeString, pattern)

percentile

This section contains reference documentation for the percentile function.

Returns the max - min value in a group

hashtag
Signature

percentile(colName, percentile)

percentileest

This section contains reference documentation for the percentileest function.

Returns the Nth percentile of the group using algorithm.

hashtag
Signature

percentileest(colName, percentile)

quarter

This section contains reference documentation for the quarter function.

Returns the quarter of the year from the given epoch millis in UTC or specified timezone. The value ranges from 1 to 4

hashtag
Signature

quarter(tsInMillis)

quarter(tsInMillis, timeZoneId)

round

This section contains reference documentation for the round function.

Round the given time value to nearest bucket start value.

hashtag
Signature

round(timeValue, bucketSize)

arrayRemoveInt

This section contains reference documentation for the arrayRemoveInt function.

Removes value from array of ints.

hashtag
Signature

arrayRemoveInt('colName', value)

concat

This section contains reference documentation for the concat function.

Concatenate two input strings using the seperator

hashtag
Signature

CONCAT(col1, col2, seperator)

13891,12892

false

14683,14683

true

12339,12339

false

13487,13930

false

13029,11292

false

Hybrid Quick Start
select DivTailNums, 
       arrayContainsString(DivTailNums, 'N7713A') AS index
from airlineStats 
WHERE arraylength(DivTailNums) >= 2
limit 5

N7713A,N7713A

true

N344AA,N344AA

false

N7713A,N7713A

true

Hybrid Quick Start

15016,11066

15016,11066

10620,14869

10620,14869

13891,12892

13891,12892

12264,10397

12264,10397

11066,12892

11066,12892

Hybrid Quick Start
select DivTailNums, 
       arrayIndexOfString(DivTailNums, 'N7713A') AS index
from airlineStats 
WHERE arraylength(DivTailNums) >= 2
limit 5

N7713A,N7713A

0

N344AA,N344AA

-1

N7713A,N7713A

0

Hybrid Quick Start

13891,12892

-1

14683,14683

0

12339,12339

-1

13487,13930

-1

13029,11292

-1

Hybrid Quick Start

13891,12892

12892,13891

14683,14683

14683,14683

12339,12339

12339,12339

13487,13930

13930,13487

13029,11292

11292,13029

Hybrid Quick Start

13198

13198,10721

548

10721

10721,12478

1531

13891

13891,12892

19

14683

14683,14683

829

12339

12339,12339

Hybrid Quick Start

24

1339,2310

1908,1758,1339,2310

1453,1731

1415,1623

1453,1731,1415,1623

1908,1758

1339,2310

1908,1758,1339,2310

1453,1731

1415,1623

1453,1731,1415,1623

Hybrid Quick Start

1908,1758

select AVGMV(DivLongestGTimes) AS value
from airlineStats 
where arraylength(DivLongestGTimes) > 1

18.465753424657535

Hybrid Quick Start
select CEIL(12.1) AS value
from ignoreMe

13

select CEIL(-12.1) AS value
from ignoreMe

-12

SELECT CHR(65) AS value
FROM ignoreMe

A

select DivTailNums
from airlineStats 
where arraylength(DivTailNums) > 1

N7713A,N7713A

N344AA,N344AA

N344AA,N344AA

N7713A,N7713A

8

Hybrid Quick Start
select day(1639351800000) AS day
FROM ignoreMe

12

13

12

13

NL

UA

AL

NA

PL

AA

FL

NL

UA

AL

NA

PL

AA

FL

Batch Quick Start
select DISTINCTCOUNTHLLMV(DivLongestGTimes) AS value
from airlineStats 
where arraylength(DivLongestGTimes) > 1

34

Hybrid Quick Start
select DISTINCTCOUNTHLL(teamID) AS value
from baseballStats 

158

149

Batch Quick Start
select DISTINCTCOUNTBITMAPMV(DivLongestGTimes) AS value
from airlineStats 
where arraylength(DivLongestGTimes) > 1

34

2

Hybrid Quick Start
select DivTailNums
from airlineStats 
where arraylength(DivTailNums) > 1

N7713A,N7713A

N344AA,N344AA

N344AA,N344AA

N7713A,N7713A

2

Hybrid Quick Start
select homeRuns, numberOfGames, DIV(homeRuns, numberOfGames) AS total
from baseballStats 
WHERE teamID = 'ML1' 
AND yearID = 1956 
AND playerName = 'Henry Louis'

26

153

0.16993464052287582

Batch Quick Start
select FLOOR(12.1) AS value
from ignoreMe

12

select FLOOR(-12.1) AS value
from ignoreMe

-13

1613472303000

3226944606000

1613472000000

1613466000000

1613088000000

select hour(1639351800000) AS hour
FROM ignoreMe

23

select hour(1639351800000, 'CET') AS hour
FROM ignoreMe

0

jsonField

An Identifier/Expression contains JSON documents.

'jsonPath'

Follows JsonPath Syntaxarrow-up-right to read values from JSON documents.

7044874109

{"id":115911530,"name":"LimeVista/Tapes","url":"https://api.github.com/repos/LimeVista/Tapes"}

$['id'],$['name'],$['url']

Batch JSON Quick Start
select MINMAXRANGE(yearID) AS value
from baseballStats 

142

Batch Quick Start
select count(*) AS value
from baseballStats 

97889

Batch Quick Start
SELECT LPAD('Hello, World', '20', '*') AS value
FROM ignoreMe

********Hello, World

select MINMV(DivLongestGTimes) AS value
from airlineStats 
where arraylength(DivLongestGTimes) > 1

2

Hybrid Quick Start
select MOD(12, 5) AS value
from ignoreMe

2

select MOD(12, 2) AS value
from ignoreMe

0

select ln(1) AS value
from ignoreMe

0

select ln(12) AS value
from ignoreMe

2.4849066497880004

select remove('foo bar foo sheep', 'foo') AS value
from ignoreMe

bar sheep

SELECT ' Pinot with spaces  ' AS notTrimmed,
       ltrim(' Pinot with spaces ') AS trimmed
FROM ignoreMe

" Pinot with spaces "

"Pinot with spaces "

select REVERSE('Pinot') AS name
FROM ignoreMe

toniP

select EXP(1) AS value
from ignoreMe

2.718281828459045

select EXP(12) AS value
from ignoreMe

162754.79141900392

SELECT ' Pinot with spaces  ' AS notTrimmed,
       rtrim(' Pinot with spaces ') AS trimmed
FROM ignoreMe

" Pinot with spaces "

" Pinot with spaces"

SELECT RPAD('Hello, World', '20', '*') AS value
FROM ignoreMe

Hello, World********

SELECT REPLACE('Hello, World', 'Hello', 'Goodbye') AS value
FROM ignoreMe

Goodbye, World

SELECT REPLACE('Hello, World', 'Hellow', 'Goodbye') AS value
FROM ignoreMe

Hello, World

select MINMAXRANGEMV(DivLongestGTimes) AS value
from airlineStats 
where arraylength(DivLongestGTimes) > 1

106

Hybrid Quick Start
hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

DivTailNums
unique

N7713A,N7713A

N7713A

N344AA,N344AA

N344AA

N344AA,N344AA

N344AA

N7713A,N7713A

N7713A

hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

DivAirportIDs
value

SEA,PSC

PSC

SEA,PSC,PHX,MSY

PSC,PHX,MSY

SEA,PSC,PHX,MSY

PSC,PHX,MSY

SEA,PSC

PSC

SEA,PSC

PSC

hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

DivAirportIDs
sortedIds

13891,12892

12892,13891

14683,14683

14683,14683

12339,12339

12339,12339

13198,10721

10721,13198

10721,12478

10721,12478

doy(tsInMillis)

doy(tsInMillis, timeZoneId)

hashtag
Usage Examples

dayOfYear

346

dayOfYear

347

dayOfYear

346

dayOfYear

347

select dayOfYear(1639351800000) AS dayOfYear
FROM ignoreMe
dow(tsInMillis)

dow(tsInMillis, timeZoneId)

hashtag
Usage Examples

dayOfWeek

7

dayOfWeek

1

dayOfWeek

7

dayOfWeek

1

select dayOfWeek(1639351800000) AS dayOfWeek
FROM ignoreMe
hashtag
Usage Examples

These examples are based on the Batch Quick Start.

value

7

value

149

select DISTINCTCOUNT(league) AS value
from baseballStats 
select DISTINCTCOUNT(teamID) AS value
from baseballStats 
hashtag
Usage Examples

These examples are based on the Batch Quick Start.

value

7

value

148

DISTINCTCOUNT
select DISTINCTCOUNTBITMAP(league) AS value
from baseballStats 
hashtag
Usage Examples

These examples are based on the Batch Quick Start.

value

00000008000000ac00000800000084000210000000000020001020220030042002100420002010020210000300008020040180400001300310001863024004220870800004400421040104610220080000020000040000030000800002108420000110400800000106000060000080020000082000218c0002000000020000010200100000018c0006000400022004a0000088000200800000320820021000000221842000000000025088000220080100009420

value

000000010000000400000106

select DISTINCTCOUNTHLL(teamID) AS value
from baseballStats 
hashtag
Usage Examples
name

pinot

select LOWER('Pinot') AS name
FROM ignoreMe
hashtag
Usage Examples

These examples are based on the Real time Quick Start.

event_id
location
hash

282776561

80406178a3d70a3d714041d5c28f5c28f6

92a8b787e81672261aad8afcf9de3aee

circle-info

The row returned will be different if you run this example as the data is ingested in real-time.

select event_id, location, MD5(location) AS hash
from meetupRsvp 
limit 1
hashtag
Usage Examples
minute

30

minute

0

select minute(1639351800000) AS minute
FROM ignoreMe
select minute(1639351800000, 'America/St_Johns') AS minute
FROM ignoreMe
hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

value

108

select MAXMV(DivLongestGTimes) AS value
from airlineStats 
where arraylength(DivLongestGTimes) > 1
hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

value

00000008000000ac00000000000000000000000500000020000000000030000202108000040000010000000300010400000000000000000000000463000000000000000000010001041000200000002000000000000000000a00000000028001000000010800000000010000001008000000804000000000020000040000880000000000000000000000000000000000000000000000800000000800020004000000840000000002000000000000000000001400

value

0000000100000004000000e4

select DISTINCTCOUNTRAWHLLMV(DivAirports) AS value
from airlineStats 
where arraylength(DivAirports) > 1
hashtag
Usage Examples
month

9

month

10

select month(1633046399000, 'UTC') AS month
FROM ignoreMe
select month(1633046399000, 'CET') AS month
FROM ignoreMe
hashtag
Usage Examples
now

1639150454255

This function is typically used in predicate to filter on timestamp for recent data. e.g. filter data on recent 1 day(86400 seconds)

select now() AS now
FROM ignoreMe
SELECT * 
FROM tableName
WHERE tsInMillis > now() - 86400000
hashtag
Usage Examples

The usage examples are based on extracting fields from the following JSON document:

Expression
Value

JSONFORMAT(meta)

"{\"age\":12}"

This function can be used in the table config to extract the meta property into the data column, as described below:

ingestion transformation function
{"timestamp": "2019-10-09 21:25:25", "meta": {"age": 12}}
hashtag
Usage Examples
value

5

SELECT length('Pinot') AS value
FROM ignoreMe
hashtag
Usage Examples

These examples are based on the Batch Quick Start.

value

1871

select min(yearID) AS value
from baseballStats 
hashtag
Usage Examples

These examples are based on the Real time Quick Start.

event_id
location
hash

282776561

80406178a3d70a3d714041d5c28f5c28f6

b914583284ac29d2bd3c9d097245b031d99d687d

circle-info

The row returned will be different if you run this example as the data is ingested in real-time.

select event_id, location, SHA(location) AS hash
from meetupRsvp 
limit 1
hashtag
Usage Examples
epochMillis

1565136000000

epochMillis

1565190733000

epochMillis

1565190733000

epochMillis

1565190733000

Joda-Time patternarrow-up-right
hashtag
Usage Examples

These examples are based on the Batch Quick Start.

value

0

value

4

value

46

select percentile(homeRuns, 50) AS value
from baseballStats 
hashtag
Usage Examples

These examples are based on the Batch Quick Start.

value

0

value

4

value

46

Quantile Digestarrow-up-right
select percentileest(homeRuns, 50) AS value
from baseballStats 
hashtag
Usage Examples
quarter

3

quarter

3

quarter

4

select quarter(1633046399000) AS quarter
FROM ignoreMe
hashtag
Usage Examples

Round seconds epoch value to the start value of the 30 seconds bucket to which it belongs.

rounded

1639144260

Round milliseconds epoch value to the start value of the 5,000 milliseconds bucket to which it belongs.

rounded

1639144270000

select round(1639144274, 30) AS rounded
FROM ignoreMe
select round(1639144274000, 5000) AS rounded
FROM ignoreMe
hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

DivAirportIDs
value

13891,12892

13891

13198,12892

13198

11066,12892

11066

13198,12892

13198

13891,12892

13891

hashtag
Usage Examples
value

Apache Pinot

value

real-time__analytics

SELECT concat('Apache', 'Pinot', ' ') AS value
FROM ignoreMe
SELECT concat('real-time', 'analytics', '__') AS value
FROM ignoreMe
select DivAirportIDs, 
       arrayContainsInt(DivAirportIDs, 14683) AS containsValue
from airlineStats 
WHERE arraylength(DivAirportIDs) >= 2
limit 5
select DivAirportIDs, 
       arrayDistinctInt(DivAirportIDs) AS unique
from airlineStats 
WHERE arraylength(DivAirportIDs) >= 2
limit 5
select DivAirportIDs, 
       arrayIndexOfInt(DivAirportIDs, 14683) AS index
from airlineStats 
WHERE arraylength(DivAirportIDs) >= 2
limit 5
select DivAirportIDs, 
       arrayReverseInt(DivAirportIDs) AS reversedIds
from airlineStats 
WHERE arraylength(DivAirportIDs) >= 2
limit 5
select FlightNum, 
       arraySliceInt(DivAirportIDs, 0, 1) AS airports, 
	     DivAirportIDs
from airlineStats 
WHERE arraylength(DivAirportIDs) >= 2
limit 5
select DivWheelsOffs, 
       DivWheelsOns,
       arrayUnionInt(DivWheelsOffs, DivWheelsOns) AS unionIds
from airlineStats 
WHERE arraylength(DivWheelsOffs) >= 2
limit 5
select COUNTMV(DivTailNums) AS value
from airlineStats 
where arraylength(DivTailNums) > 1
select day(1639351800000, 'CET') AS day
FROM ignoreMe
select dayOfMonth(1639351800000) AS day
FROM ignoreMe
select dayOfMonth(1639351800000, 'CET') AS day
FROM ignoreMe
select DISTINCT league AS value
from baseballStats 
select DISTINCT(league) AS value
from baseballStats 
select DISTINCTCOUNTHLL(teamID, 12) AS value
from baseballStats 
select DISTINCTCOUNTBITMAPMV(DivTailNums) AS value
from airlineStats 
where arraylength(DivTailNums) > 1
select DISTINCTCOUNTMV(DivTailNums) AS value
from airlineStats 
where arraylength(DivTailNums) > 1
select FromEpochSecondsBucket(1613472303, 1) AS bucket
FROM ignoreMe
select FromEpochSecondsBucket(1613472303, 2) AS bucket
FROM ignoreMe
select FromEpochMinutesBucket(2689120, 10) AS bucket
FROM ignoreMe
select FromEpochHoursBucket(89637, 5) AS bucket
FROM ignoreMe
select FromEpochDaysBucket(1867, 10) AS bucket
FROM ignoreMe
select id, repo, JSONEXTRACTKEY(repo, '$.*') AS keys
from githubEvents 
WHERE id = 7044874109
select DivTailNums, 
       arrayDistinctString(DivTailNums) AS unique
from airlineStats 
WHERE arraylength(DivTailNums) >= 2
limit 5
select RandomAirports, 
       arrayRemoveString(RandomAirports, 'SEA') AS value
from airlineStats 
WHERE arraylength(RandomAirports) BETWEEN 2 AND 4
limit 5
select DivAirportIDs, 
       arraySortInt(DivAirportIDs) AS sortedIds
from airlineStats 
WHERE arraylength(DivAirportIDs) >= 2
limit 5
select dayOfYear(1639351800000, 'CET') AS dayOfYear
FROM ignoreMe
select doy(1639351800000) AS dayOfYear
FROM ignoreMe
select doy(1639351800000, 'CET') AS dayOfYear
FROM ignoreMe
select dayOfWeek(1639351800000, 'CET') AS dayOfWeek
FROM ignoreMe
select dow(1639351800000) AS dayOfWeek
FROM ignoreMe
select dow(1639351800000, 'CET') AS dayOfWeek
FROM ignoreMe
select DISTINCTCOUNTBITMAP(teamID) AS value
from baseballStats 
select DISTINCTCOUNTRAWHLL(teamID, 1) AS value
from baseballStats 
select DISTINCTCOUNTRAWHLLMV(DivAirports, 1) AS value
from airlineStats 
where arraylength(DivAirports) > 1
{
   "tableConfig":{
      "ingestionConfig":{
         "transformConfigs":[
            {
               "columnName":"data",
               "transformFunction":"JSONFORMAT(meta)"
            }
         ]
      }
   }
}
SELECT FromDateTime('2019-08-07', 'yyyy-MM-dd') AS epochMillis
FROM ignoreMe
SELECT FromDateTime(
    '2019-08-07 3:12:13 PM', 
    'yyyy-MM-dd hh:mm:ss a'
    ) AS epochMillis
FROM ignoreMe
SELECT FromDateTime(
    '2019-08-07T15:12:13', 
    'yyyy-MM-dd''T''HH:mm:ss'
    ) AS epochMillis
FROM ignoreMe
SELECT FromDateTime(
    '2019-08-07T07:12:13-0800', 
    'yyyy-MM-dd''T''HH:mm:ssZ'
    ) AS epochMillis
FROM ignoreMe
select percentile(homeRuns, 80) AS value
from baseballStats 
select percentile(homeRuns, 99.9) AS value
from baseballStats 
select percentileest(homeRuns, 80) AS value
from baseballStats 
select percentileest(homeRuns, 99.9) AS value
from baseballStats 
select quarter(1633046399000, 'UTC') AS quarter
FROM ignoreMe
select quarter(1633046399000, 'CET') AS quarter
FROM ignoreMe
select DivAirportIDs, 
       arrayRemoveInt(DivAirportIDs, 12892) AS value
from airlineStats 
WHERE arraylength(DivAirportIDs) >= 2
AND arrayContainsInt(DivAirportIDs, 12892) = 1
limit 5
hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

FlightNum
reversedAirports
RandomAirports

1206

PSC,SEA

SEA,PSC

5300

PSC,SEA

SEA,PSC

3359

MSY,PHX,PSC,SEA

SEA,PSC,PHX,MSY

hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

FlightNum
sortedAirports
RandomAirports

3846

PSC,SEA

SEA,PSC

3635

MSY,PHX,PSC,SEA

SEA,PSC,PHX,MSY

429

MSY,PHX,PSC,SEA

SEA,PSC,PHX,MSY

hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

DivTailNums
DivAirports
unionIds

N7713A,N7713A

IND,IND

N7713A,IND

N344AA,N344AA

MCI,BOS

N344AA,MCI,BOS

N7713A,N7713A

IND,IND

N7713A,IND

arraySliceString

This section contains reference documentation for the arraySliceString function.

Returns the values in the array between the start and end positions.

hashtag
Signature

arraySliceString('colName', start, end)

hashtag
Usage Examples

These examples are based on the .

FlightNum
airports
RandomAirports

mult

This section contains reference documentation for the MULT function.

Product of at least two values

hashtag
Signature

MULT(col1, col2, col3...)

hashtag
Usage Examples

These examples are based on the .

homeRuns
baseOnBalls
total

max

This section contains reference documentation for the max function.

Get the maximum value in a group

hashtag
Signature

MAX(colName)

hashtag
Usage Examples

These examples are based on the .

value

percentilemv

This section contains reference documentation for the PERCENTILEMV function.

Returns the Nth percentile of the group where N is a decimal number between 0 and 100 inclusive

hashtag
Signature

PERCENTILEMV(colName, N)

hashtag
Usage Examples

These examples are based on the .

value
value
value

arrayConcatString

This section contains reference documentation for the arrayConcatString function.

Concatenates two arrays of strings.

hashtag
Signature

arrayConcatString('colName1', 'colName2')

hashtag
Usage Examples

These examples are based on the .

DivTailNums
concatIds

percentiletdigest

This section contains reference documentation for the PERCENTILETDigest function.

Returns the Nth percentile of the group using T-digest algorithmarrow-up-right.

hashtag
Signature

PERCENTILETDigest(colName, percentile)

hashtag
Usage Examples

These examples are based on the .

value
value
value

millisecond

This section contains reference documentation for the millisecond function.

Returns the millisecond of the second from the given epoch millis in UTC or specified timezone. The value ranges from 0 to 999.

hashtag
Signature

millisecond(tsInMillis)

millisecond(tsInMillis, timeZoneId)

hashtag
Usage Examples

millisecond
millisecond

sqrt

This section contains reference documentation for the sqrt function.

Square root of a value

hashtag
Signature

SQRT(col1)

hashtag
Usage Examples

value

SEGMENTPARTITIONEDDISTINCTCOUNT

This section contains reference documentation for the SEGMENTPARTITIONEDDISTINCTCOUNT function.

Returns the count of distinct values of a column when the column is pre-partitioned for each segment, where there is no common value within different segments. This function calculates the exact count of distinct values within the segment, then simply sums up the results from different segments to get the final result.

circle-exclamation

This function relies on the expression values being partitioned for each segment, where there are no common values within different segments.

hashtag
Signature

SEGMENTPARTITIONEDDISTINCTCOUNT(colName)

hashtag
Usage Examples

These examples are based on the .

value

second

This section contains reference documentation for the second function.

Returns the second of the minute from the given epoch millis in UTC or specified timezone. The value ranges from 0 to 59.

hashtag
Signature

second(tsInMillis)

second(tsInMillis, timeZoneId)

hashtag
Usage Examples

second
second

sha512

This section contains reference documentation for the SHA512 function.

Return SHA-512 digest of binary column(bytes type) as hex string

hashtag
Signature

SHA512(bytesCol)

hashtag
Usage Examples

These examples are based on the .

event_id
location
hash
circle-info

The row returned will be different if you run this example as the data is ingested in real-time.

mode

This section contains reference documentation for the mode function.

Get the most frequent value in a group. When multiple modes are present it gives the minimum of all the modes. This behavior can be overridden to get the maximum or the average mode.

hashtag
Signature

MODE(colName, [reducerType])

JSONPATHLONG

This section contains reference documentation for the JSONPATHLONG function.

Extracts the Long value from jsonField based on 'jsonPath', use optional defaultValuefor null or parsing error. This function can only be used in an .

hashtag
Signature

percentileestmv

This section contains reference documentation for the PERCENTILEESTMV function.

Returns the Nth percentile of the group using algorithm.

hashtag
Signature

PERCENTILEESTMV(colName, N)

percentiletdigestmv

This section contains reference documentation for the PERCENTILETDIGESTMV function.

Returns the Nth percentile of the group using .

hashtag
Signature

PERCENTILETDIGESTMV(colName, N)

ARRAYLENGTH

This section contains reference documentation for the ARRAYLENGTH function.

Returns the length of a multi-value column

hashtag
Signature

ARRAYLENGTH('colName')

FromEpoch

This section contains reference documentation for the fromEpoch functions.

Convert epoch to epoch milliseconds. The following time units are supported:

  • SECONDS

  • MINUTES

sha256

This section contains reference documentation for the SHA256 function.

Return SHA-256 digest of binary column(bytes type) as hex string

hashtag
Signature

SHA256(bytesCol)

select FlightNum, 
       arrayReverseString(RandomAirports) AS reversedAirports, 
       RandomAirports
from airlineStats 
WHERE arraylength(RandomAirports) BETWEEN 2 AND 4
limit 5
select FlightNum, 
       arraySortString(RandomAirports) AS sortedAirports, 
       RandomAirports
from airlineStats 
WHERE arraylength(RandomAirports) BETWEEN 2 AND 4
limit 5
select DivTailNums, 
       DivAirports,
       arrayUnionString(DivTailNums, DivAirports) AS unionIds
from airlineStats 
WHERE arraylength(DivTailNums) >= 2
limit 5

1023

PHX,PSC,SEA

SEA,PSC,PHX

963

MSY,PHX,PSC,SEA

SEA,PSC,PHX,MSY

1206

PSC,SEA

SEA,PSC

5300

PSC,SEA

SEA,PSC

N344AA,N344AA

MCI,BOS

N344AA,MCI,BOS

Functions

This page contains reference documentation for functions in Apache Pinot.

This page contains reference documentation for functions in Apache Pinot.

ABSchevron-rightADDchevron-rightarrayContainsIntchevron-rightarrayContainsStringchevron-rightarrayDistinctIntchevron-rightarrayDistinctStringchevron-rightarrayIndexOfIntchevron-rightarrayIndexOfStringchevron-rightARRAYLENGTHchevron-rightarrayRemoveIntchevron-rightarrayRemoveStringchevron-rightarrayReverseIntchevron-rightarrayReverseStringchevron-rightarraySliceIntchevron-rightarraySliceStringchevron-rightarraySortIntchevron-rightarraySortStringchevron-rightarrayUnionIntchevron-rightarrayUnionStringchevron-rightAVGMVchevron-rightceilchevron-rightCHRchevron-rightcodepointchevron-rightconcatchevron-rightcountchevron-rightCOUNTMVchevron-rightDATETIMECONVERTchevron-rightDATETRUNCchevron-rightdaychevron-rightdayOfWeekchevron-rightdayOfYearchevron-rightDISTINCTchevron-rightDISTINCTCOUNTchevron-rightDISTINCTCOUNTBITMAPchevron-rightDISTINCTCOUNTBITMAPMVchevron-rightDISTINCTCOUNTHLLchevron-rightDISTINCTCOUNTHLLMVchevron-rightDISTINCTCOUNTMVchevron-rightDISTINCTCOUNTRAWHLLchevron-rightDISTINCTCOUNTRAWHLLMVchevron-rightDISTINCTCOUNTRAWTHETASKETCHchevron-rightDISTINCTCOUNTTHETASKETCHchevron-rightDIVchevron-rightexpchevron-rightFromDateTimechevron-rightFLOORchevron-rightFromEpochchevron-rightFromEpochBucketchevron-righthourchevron-rightjsonextractkeychevron-rightjsonextractscalarchevron-rightJSONFORMATchevron-rightJSONPATHchevron-rightJSONPATHARRAYchevron-rightJSONPATHARRAYDEFAULTEMPTYchevron-rightJSONPATHDOUBLEchevron-rightJSONPATHLONGchevron-rightJSONPATHSTRINGchevron-rightlengthchevron-rightlnchevron-rightlowerchevron-rightlpadchevron-rightltrimchevron-rightmaxchevron-rightMAXMVchevron-rightMD5chevron-rightmillisecondchevron-rightminchevron-rightminmaxrangechevron-rightMINMAXRANGEMVchevron-rightMINMVchevron-rightminutechevron-rightMODchevron-rightmodechevron-rightmonthchevron-rightmultchevron-rightnowchevron-rightpercentilechevron-rightpercentileestchevron-rightpercentileestmvchevron-rightpercentiletdigestchevron-rightpercentiletdigestmvchevron-rightpercentilemvchevron-rightquarterchevron-rightregexpExtractchevron-rightremovechevron-rightreplacechevron-rightreversechevron-rightroundchevron-rightrpadchevron-rightrtrimchevron-rightsecondchevron-rightSEGMENTPARTITIONEDDISTINCTCOUNTchevron-rightshachevron-rightsha256chevron-rightsha512chevron-rightsqrtchevron-rightstartswithchevron-rightST_AsBinarychevron-rightST_AsTextchevron-rightST_Containschevron-rightST_Distancechevron-rightST_GeogFromTextchevron-rightST_GeogFromWKBchevron-rightST_GeometryTypechevron-rightST_GeomFromTextchevron-rightST_GeomFromWKBchevron-rightSTPOINTchevron-rightST_Polygonchevron-rightstrposchevron-rightST_Unionchevron-rightSUBchevron-rightsubstrchevron-rightsumchevron-rightTIMECONVERTchevron-righttimezoneHourchevron-righttimezoneMinutechevron-rightToDateTimechevron-rightToEpochchevron-rightToEpochBucketchevron-rightToEpochRoundedchevron-rightTOJSONMAPSTRchevron-righttoSphericalGeographychevron-righttrimchevron-rightupperchevron-rightVALUEINchevron-rightweekchevron-rightyearchevron-rightyearOfWeekchevron-right

SEA,PSC

SEA,PSC,PHX,MSY

3162

SEA,PSC

SEA,PSC,PHX,MSY

671

SEA,PSC

SEA,PSC,PHX,MSY

1767

SEA,PSC

SEA,PSC,PHX

2522

SEA,PSC

SEA,PSC

Hybrid Quick Start

424

select homeRuns, baseOnBalls, MULT(homeRuns, baseOnBalls) AS total
from baseballStats 
WHERE teamID = 'ML1' 
AND yearID = 1956 
AND playerName = 'Henry Louis'

26

37

962

Batch Quick Start
select max(homeRuns) AS value
from baseballStats 

73

Batch Quick Start
select PERCENTILEMV(DivLongestGTimes, 50) AS value
from airlineStats 
where arraylength(DivLongestGTimes) > 1

10

44

108

Hybrid Quick Start

N7713A,N7713A

N7713A,N7713A,N7713A,N7713A

N344AA,N344AA

N344AA,N344AA,N344AA,N344AA

N344AA,N344AA

N344AA,N344AA,N344AA,N344AA

N7713A,N7713A

N7713A,N7713A,N7713A,N7713A

Hybrid Quick Start
select PERCENTILETDigest(homeRuns, 50) AS value
from baseballStats 

0

3.6571905392487856

46.26787306220119

Batch Quick Start
select millisecond(1639351800000) AS millisecond
FROM ignoreMe

0

select millisecond(1639351800000, 'America/St_Johns') AS millisecond
FROM ignoreMe

0

select sqrt(25) AS value
from ignoreMe

5

select SEGMENTPARTITIONEDDISTINCTCOUNT(teamID) AS value
from baseballStats 

149

Batch Quick Start
select second(1639351800000) AS second
FROM ignoreMe

0

select second(1639351800000, 'America/St_Johns') AS second
FROM ignoreMe

0

select event_id, location, SHA512(location) AS hash
from meetupRsvp 
limit 1

282776561

80406178a3d70a3d714041d5c28f5c28f6

06cc4532755995fc1661f4195f3c67440471eba809a321635cda988a09e8bb66fd040713b1a88320bb70d6bd24443e5128527a178503e6c21d2c70438f02d103

Real time Quick Start
hashtag
Usage Examples

These examples are based on the Batch Quick Start.

value

2008

value

2010

value

2008

value

2012

select mode(yearID) AS value
from baseballStats 
WHERE AtBatting != 0 AND yearID > 2001
hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

value

10

value

44

value

108

Quantile Digestarrow-up-right
select PERCENTILEESTMV(DivLongestGTimes, 50) AS value
from airlineStats 
where arraylength(DivLongestGTimes) > 1
hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

value

10

value

44

value

108

T-digest algorithmarrow-up-right
select PERCENTILETDIGESTMV(DivLongestGTimes, 50) AS value
from airlineStats 
where arraylength(DivLongestGTimes) > 1
hashtag
Usage Examples

These examples are based on the Hybrid Quick Start.

length
count(*)

1

5382

37

267

33

223

17

166

22

160

circle-info

The count(*) values will increase each time we execute the query as data is constantly being ingested by the Hybrid Quick Start.

HOURS
  • DAYS

  • hashtag
    Signature

    FromEpoch<TIME_UNIT>(timeIn<Time_UNIT>)

    hashtag
    Usage Examples

    epochMillis

    1613472303000

    epochMillis

    1613472300000

    epochMillis

    1613469600000

    epochMillis

    1613433600000

    hashtag
    Usage Examples

    These examples are based on the Real time Quick Start.

    event_id
    location
    hash

    282776561

    80406178a3d70a3d714041d5c28f5c28f6

    e3cdf4be84d2c7e442693b0e2f98c39b80c862a9eaf0fd444fee2bd56c1d461b

    circle-info

    The row returned will be different if you run this example as the data is ingested in real-time.

    select event_id, location, SHA256(location) AS hash
    from meetupRsvp 
    limit 1
    select FlightNum, 
           arraySliceString(RandomAirports, 0, 2) AS airports, 
           RandomAirports
    from airlineStats 
    WHERE arraylength(RandomAirports) BETWEEN 2 AND 4
    limit 5
    select PERCENTILEMV(DivLongestGTimes, 90) AS value
    from airlineStats 
    where arraylength(DivLongestGTimes) > 1
    select PERCENTILEMV(DivLongestGTimes, 99.9) AS value
    from airlineStats 
    where arraylength(DivLongestGTimes) > 1
    select DivTailNums, 
           arrayConcatString(DivTailNums, DivTailNums) AS concatIds
    from airlineStats 
    WHERE arraylength(DivTailNums) >= 2
    limit 5
    select PERCENTILETDigest(homeRuns, 80) AS value
    from baseballStats 
    select PERCENTILETDigest(homeRuns, 99.9) AS value
    from baseballStats 
    select mode(yearID, 'AVG') AS value
    from baseballStats 
    WHERE AtBatting != 0 AND yearID > 2001
    select mode(yearID, 'MIN') AS value
    from baseballStats 
    WHERE AtBatting != 0 AND yearID > 2001
    select mode(yearID, 'MAX') AS value
    from baseballStats 
    WHERE AtBatting != 0 AND yearID > 2001
    select PERCENTILEESTMV(DivLongestGTimes, 90) AS value
    from airlineStats 
    where arraylength(DivLongestGTimes) > 1
    select PERCENTILEESTMV(DivLongestGTimes, 99.9) AS value
    from airlineStats 
    where arraylength(DivLongestGTimes) > 1
    select PERCENTILETDIGESTMV(DivLongestGTimes, 90) AS value
    from airlineStats 
    where arraylength(DivLongestGTimes) > 1
    select PERCENTILETDIGESTMV(DivLongestGTimes, 99.9) AS value
    from airlineStats 
    where arraylength(DivLongestGTimes) > 1
    select ARRAYLENGTH(RandomAirports) AS length, count(*) 
    from airlineStats 
    GROUP BY length
    ORDER BY count(*) DESC
    LIMIT 5
    select FromEpochSeconds(1613472303) AS epochMillis
    FROM ignoreMe
    select FromEpochMinutes(26891205) AS epochMillis
    FROM ignoreMe
    select FromEpochHours(448186) AS epochMillis
    FROM ignoreMe
    select FromEpochDays(18674) AS epochMillis
    FROM ignoreMe
    JSONPATHLONG(jsonField, 'jsonPath', [defaultValue])
    Arguments
    Description

    jsonField

    An Identifier/Expression contains JSON documents.

    'jsonPath'

    Follows to read values from JSON documents.

    circle-exclamation

    'jsonPath'` is a literal. Pinot uses single quotes to distinguish them from identifiers.

    hashtag
    Usage Examples

    The usage examples are based on extracting fields from the following JSON document:

    Expression
    Value

    JSONPATHLONG(data, '$.age')

    24

    This function can be used in the table config to extract the age property into the age column, as described below:

    ingestion transformation function

    DISTINCTCOUNTRAWTHETASKETCH

    This section contains reference documentation for the DISTINCTCOUNTRAWTHETASKETCH function.

    The Theta Sketcharrow-up-right framework enables set operations over a stream of data, and can also be used for cardinality estimation. Pinot leverages the Sketch Classarrow-up-right and its extensions from the library org.apache.datasketches:datasketches-java:1.2.0-incubating to perform distinct counting as well as evaluating set operations.

    hashtag
    Signature

    DISTINCTCOUNTRAWTHETASKETCH(<thetaSketchColumn>, <thetaSketchParams>, predicate1, predicate2..., postAggregationExpressionToEvaluate) -> HexEncoded

    • thetaSketchColumn (required): Name of the column to aggregate on.

    • thetaSketchParams (required): Parameters for constructing the intermediate theta-sketches.

    hashtag
    Usage Examples

    These examples are based on the .

    value
    value

    We can also provide predicates and a post aggregation expression to compute more complicated cardinalities:

    value

    JSONPATH

    This section contains reference documentation for the JSONPATH function.

    Extracts the object value from jsonField based on 'jsonPath', the result type is inferred based on JSON value. This function can only be used in an ingestion transformation function.

    hashtag
    Signature

    JSONPATH(jsonField, 'jsonPath')

    Arguments
    Description
    circle-exclamation

    'jsonPath'` is a literal. Pinot uses single quotes to distinguish them from identifiers.

    hashtag
    Usage Examples

    The usage examples are based on extracting fields from the following JSON document:

    Expression
    Value

    This function can be used in the to extract the name property into the name column and age property into the age column, as described below:

    DATETRUNC

    This section contains reference documentation for the DATETRUNC function.

    (Presto) SQL compatible date truncation, equivalent to the Presto function date_trunc.

    Converts the value into a specified output granularity seconds since UTC epoch that is bucketed on a unit in a specified timezone.

    hashtag
    Signature

    DATETRUNC(unit, timeValue)

    DATETRUNC(unit, timeValue, inputTimeUnitStr)

    DATETRUNC(unit, timeValue, inputTimeUnitStr, timeZone)

    DATETRUNC(unit, timeValue, inputTimeUnitStr, timeZone, outputTimeUnitStr)

    unit supports the following values:

    • millisecond

    • second

    • minute

    inputTimeUnitStr and outputTimeUnitStr support the following values:

    • NANOSECONDS

    • MICROSECONDS

    • MILLISECONDS

    hashtag
    Usage Examples

    Truncates an epoch in milliseconds at WEEK (where a Week starts at Monday UTC midnight):

    or

    ts

    Truncates an epoch in milliseconds at WEEK (where a Week starts at Monday UTC midnight) in the UTC time zone, returning a result in epoch in seconds in UTC timezone:

    ts

    Truncates an epoch in milliseconds at WEEK (where a Week starts at Monday UTC midnight) in the CET time zone, returning a result in epoch in seconds in UTC timezone:

    ts

    Truncates an epoch in milliseconds at QUARTER in the Los Angeles time zone (where a Quarter begins on Jan 1st, April 1st, July 1st, October 1st in Los Angeles timezone), returning a result in hours since UTC epoch:

    ts

    JSONPATHARRAYDEFAULTEMPTY

    This section contains reference documentation for the JSONPATHARRAYDEFAULTEMPTY function.

    Extracts an array from jsonField based on 'jsonPath', the result type is inferred based on JSON value. Returns empty array for null or parsing error. This function can only be used in an ingestion transformation function.

    hashtag
    Signature

    JSONPATHARRAYDEFAULTEMPTY(jsonField, 'jsonPath')

    Arguments
    Description
    circle-exclamation

    'jsonPath'` is a literal. Pinot uses single quotes to distinguish them from identifiers.

    hashtag
    Usage Examples

    The usage examples are based on extracting fields from the following JSON document:

    Expression
    Value

    This function can be used in the to extract the name, score, and second value of homework_grades into their respective columns , as described below:

    JSONPATHDOUBLE

    This section contains reference documentation for the JSONPATHDOUBLE function.

    Extracts the Double value from jsonField based on 'jsonPath', use optional defaultValuefor null or parsing error. This function can only be used in an ingestion transformation function.

    hashtag
    Signature

    JSONPATHDOUBLE(jsonField, 'jsonPath', [defaultValue])

    Arguments
    Description
    circle-exclamation

    'jsonPath'` is a literal. Pinot uses single quotes to distinguish them from identifiers.

    hashtag
    Usage Examples

    The usage examples are based on extracting fields from the following JSON document:

    Expression
    Value

    This function can be used in the to extract the age property into the age column, as described below:

    JSONPATHARRAY

    This section contains reference documentation for the JSONPATHARRAY function.

    Extracts an array from jsonField based on 'jsonPath', the result type is inferred based on JSON value. This function can only be used in an ingestion transformation function.

    hashtag
    Signature

    JSONPATHARRAY(jsonField, 'jsonPath')

    Arguments
    Description
    circle-exclamation

    'jsonPath'` is a literal. Pinot uses single quotes to distinguish them from identifiers.

    hashtag
    Usage Examples

    The usage examples are based on extracting fields from the following JSON document:

    Expression
    Value

    This function can be used in the to extract the name, score, and second value of homework_grades into their respective columns , as described below:

    regexpExtract

    This section contains reference documentation for the regexpExtract function.

    Extracts values that match the provided regular expression

    hashtag
    Signature

    regexpExtract(value, regexp)

    regexpExtract(value, regexp, group)

    regexpExtract(value, regexp, group, defaultValue)

    hashtag
    Usage Examples

    value
    value
    value
    value
    value
    value

    jsonextractscalar

    This section contains reference documentation for the JSONEXTRACTSCALAR function.

    Evaluates the 'jsonPath' on jsonField, returns the result as the type 'resultsType', use optional defaultValuefor null or parsing error.

    hashtag
    Signature

    JSONEXTRACTSCALAR(jsonField, 'jsonPath', 'resultsType', [defaultValue])

    JSONPATHSTRING

    This section contains reference documentation for the JSONPATHSTRING function.

    Extracts the String value from jsonField based on 'jsonPath', use optional defaultValuefor null or parsing error. This function can only be used in an .

    hashtag
    Signature

    DISTINCTCOUNTTHETASKETCH

    This section contains reference documentation for the DISTINCTCOUNTTHETASKETCH function.

    The framework enables set operations over a stream of data, and can also be used for cardinality estimation. Pinot leverages the and its extensions from the library org.apache.datasketches:datasketches-java:1.2.0-incubating to perform distinct counting as well as evaluating set operations.

    hashtag
    Signature

    {
      "data": {
        "name": "Pete",
        "age": 24,
        "subjects": [
          {
            "name": "maths",
            "homework_grades": [80, 85, 90, 95, 100],
            "grade": "A",
            "score": 90
          },
          {
            "name": "english",
            "homework_grades": [60, 65, 70, 85, 90],
            "grade": "B",
            "score": 70
          }
        ]
      }
    }
    {
       "tableConfig":{
          "ingestionConfig":{
             "transformConfigs":[
                {
                   "columnName":"age",
                   "transformFunction":"JSONPATHLONG(data, '$.age')"
                }
             ]
          }
       }
    }
    JsonPath Syntaxarrow-up-right
    Currently, the only supported parameter is nominalEntries (defaults to 4096).
  • predicates (optional)_: _ These are individual predicates of form lhs <op> rhs which are applied on rows selected by the where clause. During intermediate sketch aggregation, sketches from the thetaSketchColumn that satisfies these predicates are unionized individually. For example, all filtered rows that match country=USA are unionized into a single sketch. Complex predicates that are created by combining (AND/OR) of individual predicates is supported.

  • postAggregationExpressionToEvaluate (required): The set operation to perform on the individual intermediate sketches for each of the predicates. Currently supported operations are SET_DIFF, SET_UNION, SET_INTERSECT , where DIFF requires two arguments and the UNION/INTERSECT allow more than two arguments.

  • AgMDAAAKzJOVAAAAAACAPwDAATjfLK5fBJQy2rIU1GYLOK5a09G+XQ1UHWt00/NwFTC4EwzexhE3CHBSU+YIUzkM0goIADEeFViAmzCRcx5FeHrMHfGsU/qrFvMP+Q87UYRC7LFzZ0FV3PIfAF1FMFsM+E9XRwZRYoR79VdK7z1jAD/WClziDmb4Cosm3ctidcRl9VxfNTR47OUFqFP4dYQkZwXIEZtEhngdkGfqkQCKZPX85HITAZrwVDpI4TY6paDTZwLQNiemHFCUlEZCKcOMpkXuYypOxjzXi1ES+07IIH7EqrQeKcssHvOUh2gpzIDajYdQ4UTS6IBoXPB6AtbomPBiMalFURDzh+xppzrg5HcUTMW4Iuzgv5Mz/xIm73yOe7seghzwmH+zXUfda/mkaBqU6XQEAQFagTkndhYHHcjLb0OeQg4BGDAHtRIDD8EqsonkilQT6TZq2uM3CRXJQTlaYewzFvHsKivVomgcQRojVnPKBh0d0GgYeF4eIEXtD1bZTw43eVR1Dk6sBj3pjleOW21dRsUCRmyEDGdIfWQVJXouaUnZqaC9gi1oSrG7GT8HO2xXeb32OzfiHVx5s9+5bGpFXoXTu1n7g2Jone8JMyGuam2x7Bt55a1JdtFCFxhZ2Gd7IajHY4lNBH2lDfUoJed4f7kGUEXmlU6BCfwOkJ1CIoWBTQY+NToDhpmmmPY+rVOH5coybBHlH4vpfPBbbQsOjl0YBSC9uEmZ3WubqnV0KZ1p5d7wq/F0p7Wgo8y4JVXAobKCB+hsVckBNIA4XrYMzdWVSWeQsXHSuR+mWmJPftadyrMlfvoy2mVr8R4Dih7k3XNhXZwjBeuNJQA5Dtci6w0uIUczvEL+nY+9CSHEPQhuT//aluJ2De4Fk94cfWgaxqhYyh10TTIWZFmsDxJeOMaPT1BCwVRF6taOjftNbVDC5Fy1BtVzVIIUOGeBcj5VbhHtqowIB1qGEDIJy9ZBXD73iFBN5kVgvicaFGSKHGQqeIVsgOFdcFKITQTuV2d0pkljkPXKUIc68M0KPpU6iZYuaBA4+hGR9nri0tVnbJZOM1Z/fi01ou5YLYCoHTqkImozpJMYXLCqKtTBm2o7sc5oQATXUBC9dqM8xQoGL8OmltUWc1cX35rtD2D2zHL2IncEKMzsN/c6S31W74VTBtcbJfP9rHENp7yO453qYhA7m++jl2MKFzdvtkHqGDUcs9FKisV9Hx+ruhaGsLkdISszkZ3sYykjx3NH6BbbaCZf9jTswuxHKheTbaEDmSgrx7BfK+Z2My4jdMqCrEtKMSuJqEJ22AM5U8MNFVkCPTobkCEdJx0ZQJu+Tk73t1v3nqLUQH4PbFJzcUrr9yZFZ0u+1mzNNQ5o0w+v1dSRLGsXsPyRqGkQchuz/DKyrjJzf9Vb8HY4Ni63XiaXwgJrjq9rgAp6EmWV2xXUOI9CWZa7HsuRWO95m58nIq9K8VCkO+T/rWwrPqZ/tCgEtkshqecNhszQiki0d5Kf26o/YcATx4ZkJ655y4PTVr+kY0Xbb/UwEo2pPd3Hyd4hVz1I5N9TpYaJk2Lok1+7N+3LG+3Lj3KZtd5/+j8RujEmogI=

    AwMDAAAKzJMQAAAAAACAP4vpfPBbbQsO5N1zYV2cIwWFgU0GPjU6A4Z4HZBn6pEAyQE0gDhetgyKZPX85HITAQ4BGDAHtRIDEDub76OXYwoxK4moQnbYA9LogGhc8HoCE+k2atrjNwlVbhHtqowIBzd5VHUOTqwG+aRoGpTpdAT6PxG6MSaiAnshqMdjiU0EHEEaI1ZzygY=

    AQMDAAA6zJN8QPYIsvHMNQ==

    Batch Quick Start

    jsonField

    An Identifier/Expression contains JSON documents.

    'jsonPath'

    Follows JsonPath Syntaxarrow-up-right to read values from JSON documents.

    JSONPATH(data, '$.name')

    "Pete"

    JSONPATH(data, '$.age')

    24

    table config
    hour
  • day

  • week

  • month

  • quarter

  • year

  • SECONDS
  • MINUTES

  • HOURS

  • DAYS

  • 1639353600000

    1639353600

    1639350000

    453631

    jsonField

    An Identifier/Expression contains JSON documents.

    'jsonPath'

    Follows JsonPath Syntaxarrow-up-right to read values from JSON documents.

    JSONPATHARRAYDEFAULTEMPTY(myJsonRecord, '$.subjects[*].name')

    ["maths", "english"]

    JSONPATHARRAYDEFAULTEMPTY(myJsonRecord, '$.subjects[*].score')

    [90, 70]

    JSONPATHARRAYDEFAULTEMPTY(myJsonRecord, '$.subjects[*].homework_grades[1]')

    [85, 65]

    JSONPATHARRAYDEFAULTEMPTY(myJsonRecord, '$.subjects[*].homework_grades[7]')

    []

    table config

    jsonField

    An Identifier/Expression contains JSON documents.

    'jsonPath'

    Follows JsonPath Syntaxarrow-up-right to read values from JSON documents.

    JSONPATHDOUBLE(data, '$.age')

    24.0

    table config

    jsonField

    An Identifier/Expression contains JSON documents.

    'jsonPath'

    Follows JsonPath Syntaxarrow-up-right to read values from JSON documents.

    JSONPATHARRAY(myJsonRecord, '$.subjects[*].name')

    ["maths", "english"]

    JSONPATHARRAY(myJsonRecord, '$.subjects[*].score')

    [90, 70]

    JSONPATHARRAY(myJsonRecord, '$.subjects[*].homework_grades[1]')

    [85, 65]

    table config

    foo

    123

    foo

    foo bar

    bar

    defaultValue

    DistinctCountThetaSketch(
    <thetaSketchColumn>, <thetaSketchParams>, predicate1, predicate2..., postAggregationExpressionToEvaluate
    ) -> Long
    • thetaSketchColumn (required): Name of the column to aggregate on.

    • thetaSketchParams (required): Parameters for constructing the intermediate theta-sketches.

      • Currently, the only supported parameter is nominalEntries (defaults to 4096).

    • predicates (optional)_: _ These are individual predicates of form lhs <op> rhs which are applied on rows selected by the where clause. During intermediate sketch aggregation, sketches from the thetaSketchColumn that satisfies these predicates are unionized individually. For example, all filtered rows that match country=USA are unionized into a single sketch. Complex predicates that are created by combining (AND/OR) of individual predicates is supported.

    • postAggregationExpressionToEvaluate (required): The set operation to perform on the individual intermediate sketches for each of the predicates. Currently supported operations are SET_DIFF, SET_UNION, SET_INTERSECT , where DIFF requires two arguments and the UNION/INTERSECT allow more than two arguments.

    hashtag
    Usage Examples

    These examples are based on the Batch Quick Start.

    value

    149

    value

    146

    We can also provide predicates and a post aggregation expression to compute more complicated cardinalities. For example, we could can find the intersection of the following queries:

    yearID

    1986

    1985

    yearID

    1937

    2003

    1979

    1900

    1986

    1978

    2012

    (the yearId 1986 is the only one in common)

    By running the following query:

    value

    1

    Theta Sketcharrow-up-right
    Sketch Classarrow-up-right
    select distinctCountRawThetaSketch(teamID) AS value
    from baseballStats 
    select distinctCountRawThetaSketch(teamID, 'nominalEntries=10') AS value
    from baseballStats
    select distinctCountRawThetaSketch(
      yearID, 
      'nominalEntries=4096', 
      'teamID = ''SFN'' AND numberOfGames=28 AND homeRuns=1',
      'teamID = ''CHN'' AND numberOfGames=28 AND homeRuns=1',
      'SET_INTERSECT($1, $2)'
    ) AS value
    from baseballStats 
    {
      "data": {
        "name": "Pete",
        "age": 24,
        "subjects": [
          {
            "name": "maths",
            "homework_grades": [80, 85, 90, 95, 100],
            "grade": "A",
            "score": 90
          },
          {
            "name": "english",
            "homework_grades": [60, 65, 70, 85, 90],
            "grade": "B",
            "score": 70
          }
        ]
      }
    }
    {
       "tableConfig":{
          "ingestionConfig":{
             "transformConfigs":[
                {
                   "columnName":"name",
                   "transformFunction":"JSONPATHSTRING(data, '$.name')"
                },
                {
                   "columnName":"age",
                   "transformFunction":"JSONPATHSTRING(data, '$.age')"
                }
             ]
          }
       }
    }
    select dateTrunc('week', 1639480981746) AS ts
    FROM ignoreMe
    select dateTrunc('week', 1639480981746, 'MILLISECONDS') AS ts
    FROM ignoreMe
    select dateTrunc(
      'week', 
      1639480981746, 
      'MILLISECONDS', 
      'UTC', 
      'SECONDS'
    ) AS ts
    FROM ignoreMe
    select dateTrunc(
      'week', 
      1639480981746, 
      'MILLISECONDS', 
      'CET', 
      'SECONDS'
    ) AS ts
    FROM ignoreMe
    select dateTrunc(
      'quarter', 
      1639480981746, 
      'MILLISECONDS', 
      'America/Los_Angeles', 
      'HOURS'
    ) AS ts
    FROM ignoreMe
    {
      "data": {
        "name": "Pete",
        "age": 24,
        "subjects": [
          {
            "name": "maths",
            "homework_grades": [80, 85, 90, 95, 100],
            "grade": "A",
            "score": 90
          },
          {
            "name": "english",
            "homework_grades": [60, 65, 70, 85, 90],
            "grade": "B",
            "score": 70
          }
        ]
      }
    }
    {
       "tableConfig":{
          "ingestionConfig":{
             "transformConfigs":[
                {
                   "columnName":"names",
                   "transformFunction":"JSONPATHARRAYDEFAULTEMPTY(data, '$.subjects[*].name')"
                },
                {
                   "columnName":"ages",
                   "transformFunction":"JSONPATHARRAYDEFAULTEMPTY(data, '$.subjects[*].score')"
                },
                {
                   "columnName":"homeworkGrades",
                   "transformFunction":"JSONPATHARRAYDEFAULTEMPTY(data, '$.subjects[*].homework_grades[1]')"
                }
             ]
          }
       }
    }
    {
      "data": {
        "name": "Pete",
        "age": 24,
        "subjects": [
          {
            "name": "maths",
            "homework_grades": [80, 85, 90, 95, 100],
            "grade": "A",
            "score": 90
          },
          {
            "name": "english",
            "homework_grades": [60, 65, 70, 85, 90],
            "grade": "B",
            "score": 70
          }
        ]
      }
    }
    {
       "tableConfig":{
          "ingestionConfig":{
             "transformConfigs":[
                {
                   "columnName":"age",
                   "transformFunction":"JSONPATHDOUBLE(data, '$.age')"
                }
             ]
          }
       }
    }
    {
      "data": {
        "name": "Pete",
        "age": 24,
        "subjects": [
          {
            "name": "maths",
            "homework_grades": [80, 85, 90, 95, 100],
            "grade": "A",
            "score": 90
          },
          {
            "name": "english",
            "homework_grades": [60, 65, 70, 85, 90],
            "grade": "B",
            "score": 70
          }
        ]
      }
    }
    {
       "tableConfig":{
          "ingestionConfig":{
             "transformConfigs":[
                {
                   "columnName":"names",
                   "transformFunction":"JSONPATHARRAY(data, '$.subjects[*].name')"
                },
                {
                   "columnName":"ages",
                   "transformFunction":"JSONPATHARRAY(data, '$.subjects[*].score')"
                },
                {
                   "columnName":"homeworkGrades",
                   "transformFunction":"JSONPATHARRAY(data, '$.subjects[*].homework_grades[1]')"
                }
             ]
          }
       }
    }
    select regexpExtract('foo', '.*') AS value
    from ignoreMe
    select regexpExtract('foo123', '[0-9]+') AS value
    from ignoreMe
    select regexpExtract('foo123', '[^0-9]+') AS value
    from ignoreMe
    select regexpExtract('foo bar baz', '(\w+) (\w+)', 0) AS value
    from ignoreMe
    select regexpExtract('foo bar baz', '(\w+) (\w+)', 2) AS value
    from ignoreMe
    select regexpExtract('foo123', 'bar', 0, 'defaultValue') AS value
    from ignoreMe
    select distinctCountThetaSketch(teamID) AS value
    from baseballStats 
    select distinctCountThetaSketch(teamID, 'nominalEntries=10') AS value
    from baseballStats
    select yearID
    from baseballStats
    where teamID = 'SFN' AND numberOfGames = 28 AND homeRuns = 1
    select yearID
    from baseballStats
    where teamID = 'CHN' AND numberOfGames = 28 AND homeRuns = 1
    select distinctCountThetaSketch(
      yearID, 
      'nominalEntries=4096', 
      'teamID = ''SFN'' AND numberOfGames=28 AND homeRuns=1',
      'teamID = ''CHN'' AND numberOfGames=28 AND homeRuns=1',
      'SET_INTERSECT($1, $2)'
    ) AS value
    from baseballStats 
    Arguments
    Description

    jsonField

    An Identifier/Expression contains JSON documents.

    'jsonPath'

    Follows to read values from JSON documents.

    'results_type'

    One of the Pinot supported data types:INT, LONG, FLOAT, DOUBLE, BOOLEAN, TIMESTAMP, STRING,

    INT_ARRAY, LONG_ARRAY, FLOAT_ARRAY, DOUBLE_ARRAY, STRING_ARRAY.

    circle-exclamation

    'jsonPath'and`` ``'results_type'are literals. Pinot uses single quotes to distinguish them from identifiers.

    hashtag
    Usage Examples

    The examples in this section are based on the Batch JSON Quick Start. In particular we'll be querying the row WHERE id = 7044874109:

    repo

    {"id":115911530,"name":"LimeVista/Tapes","url":"https://api.github.com/repos/LimeVista/Tapes"}

    The following examples show how to use the JSONEXTRACTSCALAR function:

    id
    name

    7044874109

    LimeVista/Tapes

    id
    name

    7044874109

    dummyValue

    JSONPATHSTRING(jsonField, 'jsonPath', [defaultValue])
    Arguments
    Description

    jsonField

    An Identifier/Expression contains JSON documents.

    'jsonPath'

    Follows to read values from JSON documents.

    circle-exclamation

    'jsonPath'` is a literal. Pinot uses single quotes to distinguish them from identifiers.

    hashtag
    Usage Examples

    The usage examples are based on extracting fields from the following JSON document:

    Expression
    Value

    JSONPATHSTRING(data, '$.age')

    "24"

    This function can be used in the table config to extract the age property into the age column, as described below:

    ingestion transformation function

    DATETIMECONVERT

    This section contains reference documentation for the DATETIMECONVERT function.

    Converts the value from a column that contains an epoch timestamp into another time unit and buckets based on the given time granularity.

    hashtag
    Signature

    DATETIMECONVERT(columnName, inputFormat, outputFormat, outputGranularity)

    inputFormat and outputFormat are defined using the following structure:

    <time size>:<time unit>:<time format>:<pattern>

    where:

    • time size - size of the time unit eg: 1, 10

    • time unit - DAYS, HOURS, MINUTES, SECONDS, MILLISECONDS

    granularity is specified in the format <time size>:<time unit>.

    hashtag
    Usage Examples

    These examples are based on the .

    created_at_timestamp from milliseconds since epoch to days since epoch, bucketed to 1 day granularity:

    id
    created_at_timestamp
    timeInMs
    convertedTime

    created_at_timestamp bucketed to 15 minutes granularity:

    id
    created_at_timestamp
    timeInMs
    convertedTime

    created_at_timestamp to format yyyy-MM-dd, bucketed to 1 days granularity:

    id
    created_at_timestamp
    timeInMs
    convertedTime

    created_at_timestamp to format yyyy-MM-dd HH:mm, in timezone Pacific/Kiritimati:

    id
    created_at_timestamp
    timeInMs
    convertedTime

    created_at_timestamp to format yyyy-MM-dd, in timezone Pacific/Kiritimati and bucketed to 1 day granularity:

    id
    created_at_timestamp
    timeInMs
    convertedTime
    select repo
    from githubEvents 
    WHERE id = 7044874109
    select id, jsonextractscalar(repo, '$.name', 'STRING') AS name
    from githubEvents 
    WHERE id = 7044874109
    select id, jsonextractscalar(repo, '$.foo', 'STRING') AS name
    from githubEvents 
    WHERE id = 7044874109
    [
      {
        "message": "QueryExecutionError:\njava.lang.RuntimeException: Illegal Json Path: [$.foo], when reading [{\"id\":115911530,\"name\":\"LimeVista/Tapes\",\"url\":\"https://api.github.com/repos/LimeVista/Tapes\"}]\n\tat org.apache.pinot.core.operator.transform.function.JsonExtractScalarTransformFunction.transformToStringValuesSV(JsonExtractScalarTransformFunction.java:254)\n\tat org.apache.pinot.core.operator.docvalsets.TransformBlockValSet.getStringValuesSV(TransformBlockValSet.java:90)\n\tat org.apache.pinot.core.common.RowBasedBlockValueFetcher.createFetcher(RowBasedBlockValueFetcher.java:64)\n\tat org.apache.pinot.core.common.RowBasedBlockValueFetcher.<init>(RowBasedBlockValueFetcher.java:32)",
        "errorCode": 200
      }
    ]
    select id, jsonextractscalar(repo, '$.foo', 'STRING', 'dummyValue') AS name
    from githubEvents 
    WHERE id = 7044874109
    {
      "data": {
        "name": "Pete",
        "age": 24,
        "subjects": [
          {
            "name": "maths",
            "homework_grades": [80, 85, 90, 95, 100],
            "grade": "A",
            "score": 90
          },
          {
            "name": "english",
            "homework_grades": [60, 65, 70, 85, 90],
            "grade": "B",
            "score": 70
          }
        ]
      }
    }
    {
       "tableConfig":{
          "ingestionConfig":{
             "transformConfigs":[
                {
                   "columnName":"age",
                   "transformFunction":"JSONPATHSTRING(data, '$.age')"
                }
             ]
          }
       }
    }
    JsonPath Syntaxarrow-up-right
    JsonPath Syntaxarrow-up-right
    ,
    MICROSECONDS
    ,
    NANOSECONDS
  • time format

    • EPOCH

    • SIMPLE_DATE_FORMAT pattern - defined in case of SIMPLE_DATE_FORMAT e.g. yyyy-MM-dd. A specific timezone can be passed using tz(timezone). Timezone can be long or short string format timezone. e.g. Asia/Kolkata or PDT

  • 7044874109

    2018-01-01 11:00:00.0

    1514804402000

    17532

    7044874109

    2018-01-01 11:00:00.0

    1514804402000

    1514804400000

    7044874109

    2018-01-01 11:00:00.0

    1514804402000

    2018-01-01

    7044874109

    2018-01-01 11:00:00.0

    1514804402000

    2018-01-02 01:00

    7044874109

    2018-01-01 11:00:00.0

    1514804402000

    2018-01-02 00:00

    Batch JSON Quick Start
    select id, 
           created_at_timestamp, 
           cast(created_at_timestamp AS long) AS timeInMs,
           DATETIMECONVERT(
             created_at_timestamp, 
             '1:MILLISECONDS:EPOCH', 
             '1:DAYS:EPOCH', 
             '1:DAYS'
           ) AS convertedTime
    from githubEvents
    WHERE id = 7044874134
    select id, 
           created_at_timestamp, 
           cast(created_at_timestamp AS long) AS timeInMs,
           DATETIMECONVERT(
             created_at_timestamp, 
             '1:MILLISECONDS:EPOCH', 
             '1:MILLISECONDS:EPOCH', 
             '15:MINUTES'
           ) AS convertedTime
    from githubEvents
    WHERE id = 7044874134
    select id, 
           created_at_timestamp, 
           cast(created_at_timestamp AS long) AS timeInMs,
           DATETIMECONVERT(
             created_at_timestamp, 
             '1:MILLISECONDS:EPOCH', 
             '1:DAYS:SIMPLE_DATE_FORMAT:yyyy-MM-dd', 
             '1:DAYS'
           ) AS convertedTime
    from githubEvents
    WHERE id = 7044874134
    select id, 
           created_at_timestamp, 
           cast(created_at_timestamp AS long) AS timeInMs,
           DATETIMECONVERT(
             created_at_timestamp, 
             '1:MILLISECONDS:EPOCH', 
             '1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd HH:mm tz(Pacific/Kiritimati)', 
             '1:MILLISECONDS'
           ) AS convertedTime
    from githubEvents
    WHERE id = 7044874134
    select id, 
           created_at_timestamp, 
           cast(created_at_timestamp AS long) AS timeInMs,
           DATETIMECONVERT(
             created_at_timestamp, 
             '1:MILLISECONDS:EPOCH', 
             '1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd HH:mm tz(Pacific/Kiritimati)', 
             '1:DAYS'
           ) AS convertedTime
    from githubEvents
    WHERE id = 7044874134