arrow-left

All pages
gitbookPowered by GitBook
1 of 1

Loading...

Window functions

Use window functions to compute averages, sort, rank, or count items, calculate sums, and find minimum or maximum values.

circle-info

Important: To query using Windows functions, you must use Pinot's multi-stage query engine (v2).

Use window functions to do the following:

hashtag
Window functions overview

  • Window functions consist of a supported function and an .

  • To learn more about syntax, see , and read about the .

  • To see Windows functions you can use in Pinot, see .

hashtag
Window function query syntax

For information about the Window function query syntax, see .

hashtag
Example query layout

The following query shows the complete components of the window function. Note, PARTITION BY and ORDER BY are optional.

hashtag
OVER clause

hashtag
Partition by clause

  • If a PARTITION BY clause is specified, the intermediate results will be grouped into different partitions based on the values of the columns appearing in the PARTITION BY clause.

  • If the PARTITION BY clause isn’t specified, the whole result will be regarded as one big partition, i.e. there is only one partition in the result set.

hashtag
Order by clause

  • If an ORDER BY clause is specified, all the rows within the same partition will be sorted based on the values of the columns appearing in the window ORDER BY clause. The ORDER BY clause decides the order in which the rows within a partition are to be processed.

  • If no ORDER BY clause is specified while a PARTITION BY clause is specified, the order of the rows is undefined. If output ordering is desired a global ORDER BY clause should be used in the query.

hashtag
Frame clause

  • {RANGE|ROWS} frame_start OR

  • {RANGE|ROWS} BETWEEN frame_start AND frame_end; frame_start and frame_end can be any of:

    • UNBOUNDED PRECEDING: expression PRECEDING. May only be allowed in ROWS mode [depends on DB, some support some don’t]

If there is no FRAME, no PARTITION BY, and no ORDER BY clause specified in the OVER clause (empty OVER), the whole result set is regarded as one partition, and there's one frame in the window.

The OVER clause applies a specified to compute values over a group of rows, and return a single result for each row. The OVER clause specifies how the rows are arranged and how the aggregation is done on those rows.

Inside the over clause, there are three optional components, i.e. PARTITION BY clause, ORDER BY clause, and FRAME clause.

hashtag
Supported Pinot window functions

Function
Description
Example
Default Value When No Record Selected

hashtag
Examples of windows functions

hashtag
Sum transactions by customer ID

Calculate the rolling sum transaction amount ordered by the payment date for each customer ID (note, the default frame here is UNBOUNDED PRECEDING and CURRENT ROW).

customer_id
payment_date
amount
sum

hashtag
Find the minimum or maximum transaction by customer ID

Calculate the least (use MIN()) or most expensive (use MAX()) transaction made by each customer comparing all transactions made by the customer (default frame here is UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING). The following query shows how to find the least expensive transaction.

customer_id
payment_date
amount
min

hashtag
Find the average transaction amount by customer ID

Calculate a customer’s average transaction amount for all transactions they’ve made (default frame here is UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING).

customer_id
payment_date
amount
avg

hashtag
Rank year-to-date sales for a sales team

Use ROW_NUMBER() to rank team members by their year-to-date sales (default frame here is UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING).

Row
FirstName
LastName
Total sales YTD

hashtag
Count the number of transactions by customer ID

Count the number of transactions made by each customer (default frame here is UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING).

customer_id
payment_date
amount
count

To query with Windows functions in Pinot, see examples for supported Windows functions.

CURRENT ROW expression FOLLOWING. May only be allowed in ROWS mode [depends on DB, some support some don’t]

  • UNBOUNDED FOLLOWING:

    • If no FRAME clause is specified, then the default frame behavior depends on whether ORDER BY is present or not.

    • If an ORDER BY clause is specified, the default behavior is to calculate the aggregation from the beginning of the partition to the current row or UNBOUNDED PRECEDING to CURRENT ROW.

    • If only a PARTITION BY clause is present, the default frame behavior is to calculate the aggregation from UNBOUNDED PRECEDING to CURRENT ROW.

  • Returns true if at least one input value is true, otherwise false

    Returns the count of the records as Long

    COUNT(*)

    0

    Returns the minimum value of a numeric column as Double

    MIN(playerScore)

    Double.POSITIVE_INFINITY

    Returns the maximum value of a numeric column as Double

    MAX(playerScore)

    Double.NEGATIVE_INFINITY

    Assigns a unique row number to all the rows in a specified table.

    ROW_NUMBER()

    0

    Returns the sum of the values for a numeric column as Double

    SUM(playerScore)

    0

    Rank year-to-date sales for a sales team
  • Count the number of transactions by customer ID

  • 9.99

    16.97

    1

    2023-02-16 13:47:23.996577

    4.99

    21.96

    2

    2023-02-17 19:23:24.996577

    2.99

    2.99

    2

    2023-02-17 19:23:24.996577

    0.99

    3.98

    3

    2023-02-16 00:02:31.996577

    8.99

    8.99

    3

    2023-02-16 13:47:36.996577

    6.99

    15.98

    3

    2023-02-17 03:43:41.996577

    6.99

    22.97

    4

    2023-02-15 07:59:54.996577

    4.99

    4.99

    4

    2023-02-16 06:37:06.996577

    0.99

    5.98

    9.99

    0.99

    2

    2023-04-30 04:34:36.996577

    4.99

    4.99

    2

    2023-04-30 12:16:09.996577

    10.99

    4.99

    3

    2023-03-23 05:38:40.996577

    2.99

    2.99

    3

    2023-04-07 08:51:51.996577

    3.99

    2.99

    3

    3 | 2023-04-08 11:15:37.996577

    4.99

    2.99

    9.99

    5.66

    2

    2023-04-30 04:34:36.996577

    4.99

    7.99

    2

    2023-04-30 12:16:09.996577

    10.99

    7.99

    3

    2023-03-23 05:38:40.996577

    2.99

    3.99

    3

    2023-04-07 08:51:51.996577

    3.99

    3.99

    3

    2023-04-08 11:15:37.996577

    4.99

    3.99

    Jones

    1551363.54

    4

    Dane

    Scott

    1251358.72

    23.50

    3

    2

    2023-04-07 08:51:51.996577

    12.35

    3

    2

    2023-04-08 11:15:37.996577

    8.29

    3

    AVGarrow-up-right

    Returns the average of the values for a numeric column as aDouble over the specified number of rows or partition (if applicable).

    AVG(playerScore)

    Double.NEGATIVE_INFINITY

    BOOL_AND

    Returns true if all input values are true, otherwise false

    1

    2023-02-14 23:22:38.996577

    5.99

    5.99

    1

    2023-02-15 16:31:19.996577

    0.99

    6.98

    1

    1

    2023-02-14 23:22:38.996577

    5.99

    0.99

    1

    2023-02-15 16:31:19.996577

    0.99

    0.99

    1

    1

    2023-02-14 23:22:38.996577

    5.99

    5.66

    1

    2023-02-15 16:31:19.996577

    0.99

    5.66

    1

    1

    Joe

    Smith

    2251368.34

    2

    Alice

    Davis

    2151341.64

    3

    1

    2023-02-14 23:22:38.99657

    10.99

    2

    1

    2023-02-15 16:31:19.996577

    8.99

    2

    2

    Compute averages
    Rank items
    Calculate sums
    Find minimum or maximum values
    OVER clause
    Windows function query syntax
    OVER clause
    Supported Pinot window functions
    Calcite documentationarrow-up-right
    supported Windows function
    Sum transactions by customer ID
    Find the minimum or maximum transaction by customer ID
    Find the average transaction amount by customer ID

    BOOL_OR

    2023-02-15 19:37:12.996577

    2023-02-15 19:37:12.996577

    2023-02-15 19:37:12.996577

    James

    2023-04-30 04:34:36.996577

    SELECT FUNC(column1) OVER (PARTITION BY column2 ORDER BY column3 ROWS 2 PRECEDING)
        FROM tableName
        WHERE filter_clause  
    SELECT customer_id, payment_date, amount, SUM(amount) OVER(PARTITION BY customer_id ORDER BY payment_date) from payment;
    SELECT customer_id, payment_date, amount, MIN(amount) OVER(PARTITION BY customer_id) from payment;
    SELECT customer_id, payment_date, amount, AVG(amount) OVER(PARTITION BY customer_id) from payment;
    SELECT ROW_NUMBER() OVER(ORDER BY SalesYTD DESC) AS Row,   
        FirstName, LastName AS "Total sales YTD"   
    FROM Sales.vSalesPerson;  
    SELECT customer_id, payment_date, amount, count(amount) OVER(PARTITION BY customer_id) from payment;
    COUNT
    MIN
    MAX
    ROW_NUMBER
    SUM