Transform
Describes the transform relation operator in the multi-stage query engine.
The transform operator applies a transformation to the input data. It may filter out columns or add new ones by applying functions to the existing columns. The multi-stage query engine generates this operator when you use a SELECT clause in a query, but it can also be used to implement other transformations.
Implementation details
Transform operators apply transformation functions to the input data received from upstream. The cost of the transformation usually depends on the complexity of the functions applied but, compared to other operators, it is usually not very high.
Blocking nature
The transform operator is a streaming operator. It emits the blocks of rows as soon as they are received from the upstream operator.
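The streaming behaviour described above can be illustrated with a minimal Python sketch. This is not the engine's implementation: the `transform_blocks` function, the block layout (a list of rows) and the projection callables are all hypothetical names chosen for illustration. The only point is that each input block is transformed and emitted before the next one is requested.

```python
from typing import Callable, Iterable, Iterator, List, Sequence

Row = Sequence[object]
Block = List[Row]


def transform_blocks(
    upstream: Iterable[Block],
    projections: Sequence[Callable[[Row], object]],
) -> Iterator[Block]:
    """Hypothetical streaming transform: one output block per input block."""
    for block in upstream:
        # The block is transformed and yielded right away;
        # nothing is buffered across blocks.
        yield [[fn(row) for fn in projections] for row in block]


# Hypothetical upstream producing two blocks of one row each.
upstream_blocks = iter([[("a", 1)], [("b", 2)]])
out = transform_blocks(upstream_blocks, [lambda r: r[0].upper(), lambda r: r[1] * 10])
first_block = next(out)   # produced after reading only the first upstream block
second_block = next(out)
```

Because the function is a generator, a downstream consumer can start working on `first_block` while the second upstream block has not been read yet.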
Hints
None
Stats
executionTimeMs
Type: Long
The sum of the time spent by all threads executing the operator. This means that the wall time spent in the operation may be smaller than this value if the parallelism is larger than 1.
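As a rough illustration of why the reported value can exceed the wall time, consider the following sketch. The per-thread timings are made up, and it assumes the threads run fully concurrently, which a real execution will not necessarily do.

```python
# Hypothetical time (in ms) that each of three threads spent in the operator.
per_thread_ms = [40, 35, 45]

# The stat adds up the time spent by every thread.
execution_time_ms = sum(per_thread_ms)

# Assumption: all threads run at the same time, so the wall time is
# bounded by the slowest thread rather than by the sum.
wall_time_ms = max(per_thread_ms)

print(execution_time_ms, wall_time_ms)
```

With a parallelism of 1 both values coincide, since there is a single thread to sum over.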
emittedRows
Type: Long
The number of rows emitted by the operator.
Explain attributes
The transform operator is represented in the explain plan as a LogicalProject explain node.
This explain node has a list of attributes that represent the transformations applied to the input data. Each attribute has a name and a value, which is the expression used to generate the column.
For example, a LogicalProject node with the attributes userUUID=[$6], deviceOS=[$4] and EXPR$2=[SUBSTRING($4, 0, 2)] says that the output of the operator has three columns:
userUUID is the 7th column in the virtual row projected by LogicalTableScan, which corresponds to the userUUID column in the table.
deviceOS is the 5th column in the virtual row projected by LogicalTableScan, which corresponds to the deviceOS column in the table.
EXPR$2 is the result of the SUBSTRING($4, 0, 2) expression applied to the 5th column in the virtual row projected by LogicalTableScan. Given we know that the 5th column is deviceOS, we can infer that EXPR$2 is the first two characters of the deviceOS column.
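The way these attributes map input positions to output columns can be sketched in Python. The row contents are invented, and the lambdas only mimic the three expressions. In particular, the slice used for EXPR$2 follows the reading given above (the first two characters) and is not a reimplementation of the engine's SUBSTRING function.

```python
# Hypothetical virtual row as projected by LogicalTableScan.
# Only positions 4 (deviceOS) and 6 (userUUID) matter for this example.
scan_row = ["c0", "c1", "c2", "c3", "windows", "c5", "user-123"]

# $N is a zero-based index into the input row:
# $4 is the 5th column and $6 is the 7th column.
projections = {
    "userUUID": lambda row: row[6],
    "deviceOS": lambda row: row[4],
    # SUBSTRING($4, 0, 2), read here as "the first two characters".
    "EXPR$2": lambda row: row[4][:2],
}

# Apply every projection to the input row to build the output row.
output_row = {name: fn(scan_row) for name, fn in projections.items()}
print(output_row)
```

Each attribute name becomes an output column, and its value is computed only from positions of the input row, which is why the output can have fewer columns than the table scan.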
Tips and tricks
None