Releases

The following summarizes Pinot's releases, from the latest one to the earliest one.

Note

Before upgrading from one version to another, please read the release notes, as there may be incompatibilities between versions.

0.3.0 (March 2020)

0.2.0 (November 2019)

0.1.0 (March 2019, First release)

0.1.0

0.1.0 is the first release of Pinot as an Apache project.

New Features

  • First release

  • Off-line data ingestion from Apache Hadoop

  • Real-time data ingestion from Apache Kafka

0.2.0

The 0.2.0 release is the first release after the initial one and includes the improvements and fixes listed below.

New Features and Bug Fixes

  • Added support for Kafka 2.0

  • Table rebalancer now supports a minimum number of serving replicas during rebalance

  • Added support for UDF in filter predicates and selection

  • Added support to use hex string as the representation of byte array for queries (see PR #4041)

  • Added support for Parquet reader (see PR #3852)

  • Introduced interface stability and audience annotations (see PR #4063)

  • Refactor HelixBrokerStarter to separate constructor and start() - backwards incompatible (see PR #4100)

  • Admin tool for listing segments with invalid intervals for offline tables

  • Migrated to log4j2 (see PR #4139)

  • Added a simple Avro message decoder

  • Added support for passing headers in Pinot client

  • Support transform functions with AVG aggregation function (see PR #4557)

  • Configurations additions/changes

    • Allow customized metrics prefix (see PR #4392)

    • controller.enable.batch.message.mode set to false by default (see PR #3928)

    • RetentionManager and OfflineSegmentIntervalChecker initial delays configurable (see PR #3946)

    • Config to control Kafka fetcher size and increase the default (see PR #3869)

    • Added a percent threshold to consider startup of services (see PR #4011)

    • Make SingleConnectionBrokerRequestHandler the default (see PR #4048)

    • Always enable the default column feature, remove the configuration (see PR #4074)

    • Remove redundant default broker configurations (see PR #4106)

    • Removed some config keys in server (see PR #4222)

    • Add config to disable HLC realtime segments (see PR #4235)

    • The following config variables are deprecated and will be removed in the next release:

      • pinot.broker.requestHandlerType will be removed, in favor of using the "singleConnection" broker request handler. If you have set this configuration, please remove it and use the default type ("singleConnection").

Work in Progress

  • We are in the process of separating Helix and Pinot controllers, so that administrators can have the option of running independent Helix controllers and Pinot controllers.

  • We are in the process of moving towards supporting SQL query format and results.

  • We are in the process of separating instance and segment assignment using instance pools to optimize the number of Helix state transitions in Pinot clusters with thousands of tables.

Other Notes

  • Task management does not work correctly in this release, due to bugs in Helix. We will upgrade to Helix 0.9.2 (or later) to get this fixed.

  • You must upgrade to this release before moving on to newer Pinot releases. The protocol between Pinot broker and Pinot server has changed, and this release contains the code required to retain compatibility moving forward. Skipping this release may (depending on your environment) cause query errors if brokers are upgraded while servers are still being upgraded.

  • As always, we recommend that you upgrade controllers first, then brokers, and lastly the servers in order to have zero downtime in production clusters.

Backward Incompatible Changes

  • Pull Request #4100 introduces a backwards incompatible change to the Pinot broker. If you use the Java constructor of the HelixBrokerStarter class, you will now get a compilation error. You need to construct the object and then call its start() method in order to start the broker.

  • Pull Request #4139 introduces a backwards incompatible change to the log4j configuration. If you used a custom log4j configuration (log4j.xml), you need to write a new log4j2 configuration (log4j2.xml). In addition, you may need to change the arguments on the command line used to start Pinot components.

    If you used the pinot-admin command to start Pinot components, no change is needed. If you used your own commands to start Pinot components, you need to pass the new log4j2 config as a JVM parameter, i.e. substitute the -Dlog4j.configuration or -Dlog4j.configurationFile argument with -Dlog4j2.configurationFile=log4j2.xml.
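    For example, a hand-rolled startup command changes as sketched below; the jar, main class, and paths are placeholders, not artifacts shipped with this release.

        # Before, with a custom log4j 1.x configuration:
        java -Dlog4j.configuration=file:/path/to/log4j.xml -cp <pinot-component-jar> <main-class>

        # After upgrading, with a log4j2 configuration:
        java -Dlog4j2.configurationFile=/path/to/log4j2.xml -cp <pinot-component-jar> <main-class>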


0.3.0

The 0.3.0 release of Apache Pinot introduces the concept of plugins, which makes it easy to extend Pinot and integrate it with other systems.

What's the big change?

The reason behind the architectural change between the previous release (0.2.0) and this release (0.3.0) is the ability to extend Apache Pinot. The 0.2.0 release was not flexible enough to support new storage types or new stream types: adding new functionality required changing too much code. Thus, the Pinot team went through an extensive refactoring and improvement of the source code.

For instance, the picture below shows the module dependencies of the 0.2.X and previous releases. If we wanted to support a new storage type, we would have had to change several modules. Pretty bad, huh?

[Image: 0.2.0 and before Pinot Module Dependency Diagram]

In order to conquer this challenge, the following major changes were made:

  • Refactored common interfaces into the pinot-spi module

  • Defined four types of plugin modules:

    • Pinot input format: how to read records from various data/file formats, e.g. Avro/CSV/JSON/ORC/Parquet/Thrift

    • Pinot filesystem: how to operate on files on various filesystems, e.g. Azure Data Lake/Google Cloud Storage/S3/HDFS

    • Pinot stream ingestion: how to ingest a data stream from various upstream systems, e.g. Kafka/Kinesis/Eventhub

    • Pinot batch ingestion: how to run Pinot batch ingestion jobs in various frameworks, like Standalone, Hadoop, Spark

  • Built shaded jars for each individual plugin

  • Added support to dynamically load Pinot plugins at server startup time

[Image: Dependency graph after introducing pinot-plugin in 0.3.0]

Now the architecture supports a plug-and-play fashion, where new tools can be supported with small, simple extensions, without affecting big chunks of code. Integrations with new streaming services and data formats can be developed in a much simpler and more convenient way.
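To make the plugin layout concrete, here is a purely illustrative directory sketch; the directory names below are assumptions for illustration and are not taken from this release note.

    plugins/
      pinot-input-format/        # record readers, e.g. Avro, CSV, JSON, ORC, Parquet, Thrift
      pinot-file-system/         # filesystem implementations, e.g. ADLS, GCS, S3, HDFS
      pinot-stream-ingestion/    # stream consumers, e.g. Kafka
      pinot-batch-ingestion/     # batch job runners, e.g. Standalone, Hadoop, Spark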

Notable New Features

  • SQL Support

    • Added Calcite SQL compiler

    • Added SQL response format (#4694, #4877)

    • Added support for GROUP BY with ORDER BY (#4602)

    • Query console defaults to use SQL syntax (#4994)

    • Support column alias (#5016, #5033)

    • Added SQL query endpoint: /query/sql (#4964) (see the usage sketch after this list)

    • Support arithmetic operators (#5018)

    • Support non-literal expressions for the right-side operand in predicate comparisons (#5070)

  • Added support for DISTINCT (#4535)

  • Added support for default values for BYTES columns (#4583)

  • JDK 11 Support

  • Added support to tune size vs accuracy for the approximation aggregation functions DistinctCountHLL, PercentileEst, PercentileTDigest (#4666)

  • Added Data Anonymizer Tool (#4747)

  • Deprecated the pinot-hadoop and pinot-spark modules, replaced by pinot-batch-ingestion-hadoop and pinot-batch-ingestion-spark

  • Support STRING and BYTES for no-dictionary columns in realtime consuming segments (#4791)

  • Make pinot-distribution build a pinot-all jar and assemble it (#4977)

  • Added support for case-insensitive PQL (#4983)

  • Enhanced TableRebalancer logic

    • Moved to a new rebalance strategy (#4695)

    • Supported rebalancing tables under any condition (#4990)

    • Supported reassigning completed segments along with consuming segments for LLC realtime tables (#5015)

  • Added experimental support for Text Search (#4993)

  • Upgraded Helix to version 0.9.4; task management now works as expected (#5020)

  • Added the date_trunc transformation function (#4740)

  • Support schema evolution for consuming segments (#4954)

  • APIs Additions/Changes

    • Pinot Admin Command

      • Added -queryType option in the PinotAdmin PostQuery subcommand (#4726)

      • Added -schemaFile as an option in the AddTable command (#4959)

      • Added OperateClusterConfig sub command in PinotAdmin (#5073)

    • Pinot Controller REST APIs

      • Get Table leader controller resource (#4545)

      • Support HTTP POST/PUT to upload a JSON encoded schema (#4639)

      • Table rebalance API now requires both table name and type as parameters (#4824)

      • Refactored Segments APIs (#4806)

      • Added segment batch deletion REST API (#4828)

      • Update schema API to reload the table on schema change when applicable (#4838)

      • Enhanced the task related REST APIs (#5054)

      • Added PinotClusterConfig REST APIs (#5073) (see the usage sketch after this list)

        • GET /cluster/configs

        • POST /cluster/configs

        • DELETE /cluster/configs/{configName}

  • Configurations Additions/Changes

    • Config: controller.host is now optional in Pinot Controller

    • Added instance config: queriesDisabled to disable query sending to a running server (#4767)

    • Added broker config: pinot.broker.enable.query.limit.override, a configurable max query response size (#5040)

    • Removed deprecated server configs (#4903)

      • pinot.server.starter.enableSegmentsLoadingCheck

      • pinot.server.starter.timeoutInSeconds

      • pinot.server.instance.enable.shutdown.delay

      • pinot.server.instance.starter.maxShutdownWaitTime

      • pinot.server.instance.starter.checkIntervalTime

    • Decoupled the server instance id from the hostname/port config (#4995)

    • Added FieldConfig to encapsulate encoding and indexing info for a field (#5006)
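The sketch below illustrates the new SQL query endpoint, the PostQuery -queryType option, and the PinotClusterConfig REST APIs referenced above. Only the endpoint paths (/query/sql, /cluster/configs) and the -queryType option come from this release note; the hosts, ports, JSON payload key, table, columns, config name, script path, and the -query option are illustrative assumptions.

    # Query through the new SQL endpoint on the broker (payload key assumed to be "sql"):
    curl -X POST -H "Content-Type: application/json" \
      -d '{"sql": "SELECT playerName, COUNT(*) AS cnt FROM baseballStats GROUP BY playerName ORDER BY cnt DESC LIMIT 10"}' \
      http://localhost:8099/query/sql

    # The same style of query via the admin tool, using the new -queryType option:
    bin/pinot-admin.sh PostQuery -queryType sql \
      -query "SELECT DISTINCT teamID FROM baseballStats LIMIT 10"

    # Cluster config REST APIs on the controller:
    curl -X GET http://localhost:9000/cluster/configs
    curl -X POST -H "Content-Type: application/json" -d '{"someClusterConfig": "value"}' \
      http://localhost:9000/cluster/configs
    curl -X DELETE http://localhost:9000/cluster/configs/someClusterConfig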

Major Bug Fixes

  • Fixed the bug of releasing a segment while threads are still working on it (#4764)

  • Fixed the bug of uneven task distribution for threads (#4793)

  • Fixed encryption for .tar.gz segment file upload (#4855)

  • Fixed the controller REST API to download segments from a non-local FS (#4808)

  • Fixed the bug of not releasing the segment lock if segment recovery throws an exception (#4882)

  • Fixed the issue of the server not registering the state model factory before connecting the Helix manager (#4929)

  • Fixed the exception in the server instance when Helix starts a new ZK session (#4976)

  • Fixed the ThreadLocal DocIdSet issue in ExpressionFilterOperator (#5114)

  • Fixed the bug in default value provider classes (#5137)

  • Fixed the bug when no segment exists in RealtimeSegmentSelector (#5138)

Work in Progress

  • We are in the process of supporting text search query functionalities.

  • We are in the process of supporting null values (#4230); currently only a limited query feature is supported.

    • Added Presence Vector to represent null values (#4585)

    • Added null predicate support for leaf predicates (#4943)

Backward Incompatible Changes

  • It is a disruptive upgrade from version 0.1.0 to this release because of the protocol changes between Pinot broker and Pinot server. Please ensure that you upgrade to release 0.2.0 first, then upgrade to this version.

  • As always, we recommend that you upgrade controllers first, then brokers, and lastly the servers in order to have zero downtime in production clusters.

  • If you build your own startable or war without using the scripts generated by the Pinot-distribution module: for Java 8, an environment variable "plugins.dir" is required for Pinot to find out where to load all the Pinot plugin jars; for Java 11, the plugins directory must be explicitly added to the classpath. Please see pinot-admin.sh as an example (a sketch is included at the end of this section).

  • Kafka 0.9 is no longer included in the release distribution.

  • Pull request #4806 introduces a backward incompatible API change for segments management.

    • Removed segment toggle APIs

    • Removed list all segments in cluster APIs

    • Deprecated below APIs:

      • GET /tables/{tableName}/segments

      • GET /tables/{tableName}/segments/metadata

      • GET /tables/{tableName}/segments/crc

      • GET /tables/{tableName}/segments/{segmentName}

      • GET /tables/{tableName}/segments/{segmentName}/metadata

      • GET /tables/{tableName}/segments/{segmentName}/reload

      • POST /tables/{tableName}/segments/{segmentName}/reload

      • GET /tables/{tableName}/segments/reload

      • POST /tables/{tableName}/segments/reload

  • Pull request #5054 deprecated the task related APIs below:

    • GET:

      • /tasks/taskqueues: List all task queues

      • /tasks/taskqueuestate/{taskType} -> /tasks/{taskType}/state

      • /tasks/tasks/{taskType} -> /tasks/{taskType}/tasks

      • /tasks/taskstates/{taskType} -> /tasks/{taskType}/taskstates

      • /tasks/taskstate/{taskName} -> /tasks/task/{taskName}/taskstate

      • /tasks/taskconfig/{taskName} -> /tasks/task/{taskName}/taskconfig

    • PUT:

      • /tasks/scheduletasks -> POST /tasks/schedule

      • /tasks/cleanuptasks/{taskType} -> /tasks/{taskType}/cleanup

      • /tasks/taskqueue/{taskType}: Toggle a task queue

    • DELETE:

      • /tasks/taskqueue/{taskType} -> /tasks/{taskType}

  • Deprecated modules pinot-hadoop and pinot-spark and replaced them with pinot-batch-ingestion-hadoop and pinot-batch-ingestion-spark.

  • Introduced new Pinot batch ingestion jobs and YAML-based job specs to define segment generation jobs and segment push jobs.

  • You may see exceptions like the one below in pinot-brokers during a cluster upgrade; it is safe to ignore them.

    2020/03/09 23:37:19.879 ERROR [HelixTaskExecutor] [CallbackProcessor@b808af5-pinot] [pinot-broker] [] Message cannot be processed: 78816abe-5288-4f08-88c0-f8aa596114fe, {CREATE_TIMESTAMP=1583797034542, MSG_ID=78816abe-5288-4f08-88c0-f8aa596114fe, MSG_STATE=unprocessable, MSG_SUBTYPE=REFRESH_SEGMENT, MSG_TYPE=USER_DEFINE_MSG, PARTITION_NAME=fooBar_OFFLINE, RESOURCE_NAME=brokerResource, RETRY_COUNT=0, SRC_CLUSTER=pinot, SRC_INSTANCE_TYPE=PARTICIPANT, SRC_NAME=Controller_hostname.domain,com_9000, TGT_NAME=Broker_hostname,domain.com_6998, TGT_SESSION_ID=f6e19a457b80db5, TIMEOUT=-1, segmentName=fooBar_559, tableName=fooBar_OFFLINE}{}{}
    java.lang.UnsupportedOperationException: Unsupported user defined message sub type: REFRESH_SEGMENT
          at org.apache.pinot.broker.broker.helix.TimeboundaryRefreshMessageHandlerFactory.createHandler(TimeboundaryRefreshMessageHandlerFactory.java:68) ~[pinot-broker-0.2.1172.jar:0.3.0-SNAPSHOT-c9d88e47e02d799dc334d7dd1446a38d9ce161a3]
          at org.apache.helix.messaging.handling.HelixTaskExecutor.createMessageHandler(HelixTaskExecutor.java:1096) ~[helix-core-0.9.1.509.jar:0.9.1.509]
          at org.apache.helix.messaging.handling.HelixTaskExecutor.onMessage(HelixTaskExecutor.java:866) [helix-core-0.9.1.509.jar:0.9.1.509]
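For operators who start Pinot components with their own scripts instead of pinot-admin.sh, a minimal sketch of the plugin setup described in the backward incompatible changes might look like the following. The paths, jar layout, and main class are placeholders; the release note calls plugins.dir an environment variable, while it is shown here as a JVM system property, so please check pinot-admin.sh for the exact form the scripts use.

    # Java 8: tell Pinot where the plugin jars live (placeholder paths):
    java -Dplugins.dir=/opt/pinot/plugins -cp "/opt/pinot/lib/*" <pinot-main-class-and-args>

    # Java 11: the plugins directory must additionally be put on the classpath:
    java -Dplugins.dir=/opt/pinot/plugins -cp "/opt/pinot/lib/*:/opt/pinot/plugins/*" <pinot-main-class-and-args>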