EXPR_MIN / EXPR_MAX
This section contains reference documentation for the EXPR_MIN and EXPR_MAX function.
This function scans the given dataset to identify the maximum and minimum values in the specified measuring columns. Once these extreme values (the maxima and minima) are found, the function locates the corresponding entries in the projection column. These entries are associated with the rows where the extreme values were found in the measuring columns. The function then returns these projection column values, providing a way to link the extreme measurements with their corresponding data in another part of the dataset.
Signature
EXPR_MIN (projectionCol, measuringCol1, measuringCol2, measuringCol3)
EXPR_MAX (projectionCol, measuringCol1, measuringCol2, measuringCol3)
Usage Examples
Find the user with maximum activity. If there are multiple users, break the tie with their last_activity_date. If still a tie, break with user_id. And project user_id.
More useful is that this multiple such aggregation function can be used with GROUP BY
Note:
In cases where multiple rows share the same extreme values in the measuring columns, all such rows will be returned by the function.
If the goal is to project multiple different columns that correspond to the same set of measuring columns, you can achieve this by invoking the function multiple times, each time specifying a different projection column.
This impl does not work with AS clause (e.g.
SELECT exprmin(longCol, doubleCol) AS exprmin
won't work)Putting
exprmin/exprmax
column inside order by clause (e.g.SELECT intCol, exprmin(longCol, doubleCol) FROM table GROUP BY intCol ORDER BY exprmin(longCol, doubleCol)
) is not supported as semantically ordering multi-column multi-rowexprmin/exprmax
results doesn't make senseCurrently projecting MV bytes column doesn't work for now due to an issue
For more detailed examples, see: https://github.com/apache/pinot/pull/10636