# 0.5.0

## Summary

This release includes many new features on Pinot ingestion and connectors (e.g., support for filtering during ingestion which is configurable in table config; support for json during ingestion; proto buf input format support and a new Pinot JDBC client), query capability (e.g., a new GROOVY transform function UDF) and admin functions (a revamped Cluster Manager UI & Query Console UI). It also contains many key bug fixes. See details below.

The release was cut from the following commit:\
[d1b4586](https://github.com/apache/pinot/commit/d1b458644e505f07c919e7eb983feb9dafcd0061)\
and the following cherry-picks:

* [63a4fd4](https://github.com/apache/pinot/commit/63a4fd4)
* [a7f7f46](https://github.com/apache/pinot/commit/a7f7f46)
* [dafbef1](https://github.com/apache/pinot/commit/dafbef1)
* [ced3a70](https://github.com/apache/pinot/commit/ced3a70)
* [d902c1a](https://github.com/apache/pinot/commit/d902c1a)

## Notable New Features

* Allowing update on an existing instance config: PUT /instances/{instanceName} with Instance object as the pay-load ([#PR4952](https://github.com/apache/pinot/pull/4952))
* Add PinotServiceManager to start Pinot components ([#PR5266](https://github.com/apache/pinot/pull/5266))
* Support for protocol buffers input format. ([#PR5293](https://github.com/apache/pinot/pull/5293))
* Add GenericTransformFunction wrapper for simple ScalarFunctions ([PR#5440](https://github.com/apache/pinot/pull/5440)) — Adding support to invoke any scalar function via GenericTransformFunction
* Add Support for SQL CASE Statement ([PR#5461](https://github.com/apache/pinot/pull/5461))
* Support distinctCountRawThetaSketch aggregation that returns serialized sketch. ([PR#5465](https://github.com/apache/pinot/pull/5465))
* Add multi-value support to SegmentDumpTool ([PR#5487](https://github.com/apache/pinot/pull/5487)) — add segment dump tool as part of the pinot-tool.sh script
* Add json\_format function to convert json object to string during ingestion. ([PR#5492](https://github.com/apache/pinot/pull/5492)) — Can be used to store complex objects as a json string (which can later be queries using jsonExtractScalar)
* Support escaping single quote for SQL literal ([PR#5501](https://github.com/apache/pinot/pull/5501)) — This is especially useful for DistinctCountThetaSketch because it stores expression as literal E.g. DistinctCountThetaSketch(..., 'foo=''bar''', ...)
* Support expression as the left-hand side for BETWEEN and IN clause ([PR#5502](https://github.com/apache/pinot/pull/5502))
* Add a new field IngestionConfig in TableConfig — FilterConfig: ingestion level filtering of records, based on filter function. ([PR#5597](https://github.com/apache/pinot/pull/5597)) — TransformConfig: ingestion level column transformations. This was previously introduced in Schema (FieldSpec#transformFunction), and has now been moved to TableConfig. It continues to remain under schema, but we recommend users to set it in the TableConfig starting this release ([PR#5681](https://github.com/apache/pinot/pull/5681)).
* Allow star-tree creation during segment load ([#PR5641](https://github.com/apache/pinot/pull/5641)) — Introduced a new boolean config enableDynamicStarTreeCreation in IndexingConfig to enable/disable star-tree creation during segment load.
* Support for Pinot clients using JDBC connection ([#PR5602](https://github.com/apache/pinot/pull/5602))
* Support customized accuracy for distinctCountHLL, distinctCountHLLMV functions by adding log2m value as the second parameter in the function. ([#PR5564](https://github.com/apache/pinot/pull/5564)) —Adding cluster config: default.hyperloglog.log2m to allow user set default log2m value.
* Add segment encryption on Controller based on table config ([PR#5617](https://github.com/apache/pinot/pull/5617))
* Add a constraint to the message queue for all instances in Helix, with a large default value of 100000. ([PR#5631](https://github.com/apache/pinot/pull/5631))
* Support order-by aggregations not present in SELECT ([PR#5637](https://github.com/apache/pinot/pull/5637)) — Example: "select subject from transcript group by subject order by count(*) desc" This is equivalent to the following query but the return response should not contain count(*). "select subject, count(*) from transcript group by subject order by count(*) desc"
* Add geo support for Pinot queries ([PR#5654](https://github.com/apache/pinot/pull/5654)) — Added geo-spatial data model and geospatial functions
* Cluster Manager UI & Query Console UI revamp ([PR#5684](https://github.com/apache/pinot/pull/5684) and [PR#5732](https://github.com/apache/pinot/pull/5732)) — updated cluster manage UI and added table details page and segment details page
* Add Controller API to explore Zookeeper ([PR#5687](https://github.com/apache/pinot/pull/5687))
* Support BYTES type for dictinctCount and group-by ([PR#5701](https://github.com/apache/pinot/pull/5701) and [PR#5708](https://github.com/apache/pinot/pull/5708)) —Add BYTES type support to `DistinctCountAggregationFunction` —Correctly handle BYTES type in `DictionaryBasedAggregationOperator` for DistinctCount
* Support for ingestion job spec in JSON format ([#PR5729](https://github.com/apache/pinot/pull/5729))
* Improvements to RealtimeProvisioningHelper command ([#PR5737](https://github.com/apache/pinot/pull/5737)) — Improved docs related to ingestion and plugins
* Added GROOVY transform function UDF ([#PR5748](https://github.com/apache/pinot/pull/5748)) — Ability to run a groovy script in the query as a UDF. e.g. string concatenation: SELECT GROOVY('{"returnType": "INT", "isSingleValue": true}', 'arg0 + " " + arg1', columnA, columnB) FROM myTable

## Special notes

* Changed the stream and metadata interface ([PR#5542](https://github.com/apache/pinot/pull/5542)) — This PR concludes the work for the issue [#5359](https://github.com/apache/pinot/issues/5359) to extend offset support for other streams
* TransformConfig: ingestion level column transformations. This was previously introduced in Schema (FieldSpec#transformFunction), and has now been moved to TableConfig. It continues to remain under schema, but we recommend users to set it in the TableConfig starting this release ([PR#5681](https://github.com/apache/pinot/pull/5681)).
* Config key enable.case.insensitive.pql in Helix cluster config is deprecated, and replaced with enable.case.insensitive. ([#PR5546](https://github.com/apache/pinot/pull/5546))
* Change default segment load mode to MMAP. ([PR#5539](https://github.com/apache/pinot/pull/5539)) —The load mode for segments currently defaults to `heap`.

## Major Bug fixes

* Fix bug in distinctCountRawHLL on SQL path ([#5494](https://github.com/apache/pinot/pull/5494))
* Fix backward incompatibility for existing stream implementations ([#5549](https://github.com/apache/pinot/pull/5549))
* Fix backward incompatibility in StreamFactoryConsumerProvider ([#5557](https://github.com/apache/pinot/pull/5557))
* Fix logic in isLiteralOnlyExpression. ([#5611](https://github.com/apache/pinot/pull/5611))
* Fix double memory allocation during operator setup ([#5619](https://github.com/apache/pinot/pull/5619))
* Allow segment download url in Zookeeper to be deep store uri instead of hardcoded controller uri ([#5639](https://github.com/apache/pinot/pull/5639))
* Fix a backward compatible issue of converting BrokerRequest to QueryContext when querying from Presto segment splits ([#5676](https://github.com/apache/pinot/pull/5676))
* Fix the issue that PinotSegmentToAvroConverter does not handle BYTES data type. ([#5789](https://github.com/apache/pinot/pull/5789))

## Backward Incompatible Changes

* PQL queries with HAVING clause will no longer be accepted for the following reasons: ([#PR5570](https://github.com/apache/pinot/pull/5570)) — HAVING clause does not apply to PQL GROUP-BY semantic where each aggregation column is ordered individually — The current behavior can produce inaccurate results without any notice — HAVING support will be added for SQL queries in the next release
* Because of the standardization of the DistinctCountThetaSketch predicate strings, please upgrade Broker before Server. The new Broker can handle both standard and non-standard predicate strings for backward-compatibility. ([#PR5613](https://github.com/apache/pinot/pull/5613))