0.10.0

Summary

This release introduces new features, performance enhancements, UI improvements, and bug fixes, which are described in detail in the following sections. The release was cut from commit fd9c58a.

Dependency Graph

The dependency graph for the plug-and-play architecture that was introduced in release 0.3.0 has been extended and now contains new nodes for the Pinot Segment SPI.

Figure: Dependency graph after introducing pinot-segment-api.

SQL Improvements

  • Implement NOT Operator (#8148)
  • Add DistinctCountSmartHLLAggregationFunction, which automatically stores distinct values in a Set or HyperLogLog based on cardinality (#8189)
  • Add LEAST and GREATEST functions (#8100)
  • Handle SELECT * with extra columns (#7959)
  • Add FILTER clauses for aggregates (#7916)
  • Add ST_Within function (#7990)
  • Handle semicolon in query (#7861)
  • Add EXPLAIN PLAN (#7568)

UI Enhancements

  • Show Reported Size and Estimated Size in human readable format in UI (#8199)
  • Make query console state URL based (#8194)
  • Improve query console to not show query result when multiple columns have the same name (#8131)
  • Improve Pinot dashboard tenant view to show correct number of servers and brokers (#8115)
  • Fix issue with opening new tabs from Pinot Dashboard (#8021)
  • Fix issue with Query console going blank on syntax error (#8006)
  • Make query stats always show even when there's an error (#7981)
  • Implement OIDC auth workflow in UI (#7121)
  • Add tooltip and modal for table status (#7899)
  • Add option to wrap lines in custom code mirror (#7857)
  • Add ability to comment out queries with cmd + / (#7841)
  • Return exception when unavailable segments on empty broker response (#7823)
  • Properly handle the case where segments are missing in externalview (#7803)
  • Add TIMESTAMP to datetime column Type (#7746)

Performance Improvements

  • Reuse regex matcher in dictionary based LIKE queries (#8261)
  • Early terminate orderby when columns already sorted (#8228)
  • Do not do another pass of Query Automaton Minimization (#8237)
  • Improve RangeBitmap by upgrading RoaringBitmap (#8206)
  • Optimize geometry serializer usage when literal is available (#8167)
  • Improve performance of no-dictionary group by (#8195)
  • Allocation free DataBlockCache lookups (#8140)
  • Prune unselected THEN statements in CaseTransformFunction (#8138)
  • Aggregation delay conversion to double (#8139)
  • Reduce object allocation rate in ExpressionContext or FunctionContext (#8124)
  • Lock free DimensionDataTableManager (#8102)
  • Improve json path performance during ingestion by upgrading JsonPath (#7819)
  • Reduce allocations and speed up StringUtil.sanitizeString (#8013)
  • Faster metric scans - ForwardIndexReader (#7920)
  • Unpeel group by 3 ways to enable vectorization (#7949)
  • Power of 2 fixed size chunks (#7934)
  • Don't use mmap for compression except for huge chunks (#7931)
  • Exit group-by marking loop early (#7935)
  • Improve performance of base chunk forward index write (#7930)
  • Cache JsonPaths to prevent compilation per segment (#7826)
  • Use LZ4 as default compression mode (#7797)
  • Peel off special case for 1 dimensional groupby (#7777)
  • Bump roaringbitmap version to improve range queries performance (#7734)

Other Notable Features

  • Adding NoopPinotMetricFactory and corresponding changes (#8270)
  • Allow specifying a fixed segment name for SegmentProcessorFramework (#8269)
  • Move all prestodb dependencies into a separate module (#8266)
  • Include docIds in Projection and Transform block (#8262)
  • Automatically update broker resource on broker changes (#8249)
  • Update ScalarFunction annotation from name to names to support function alias (#8252)
  • Implemented BoundedColumnValue partition function (#8224)
  • Add copy recursive API to pinotFS (#8200)
  • Add Support for Getting Live Brokers for a Table (without type suffix) (#8188)
  • Pinot docker image - cache prometheus rules (#8241)
  • In BrokerRequestToQueryContextConverter, remove unused filterExpressionContext (#8238)
  • Adding retention period to segment delete REST API (#8122)
  • Pinot docker image - upgrade prometheus and scope rulesets to components (#8227)
  • Allow segment name postfix for SegmentProcessorFramework (#8230)
  • Superset docker image - update pinotdb version in superset image (#8231)
  • Add retention period to deleted segment files and allow table level overrides (#8176)
  • Remove incubator from pinot and superset (#8223)
  • Adding table config overrides for disabling groovy (#8196)
  • Optimise sorted docId iteration order in mutable segments (#8213)
  • Adding secure grpc query server support (#8207)
  • Move Tls configs and utils from pinot-core to pinot-common (#8210)
  • Reduce allocation rate in LookupTransformFunction (#8204)
  • Allow subclass to customize what happens pre/post segment uploading (#8203)
  • Enable controller service auto-discovery in Jersey framework (#8193)
  • Add support for pushFileNamePattern in pushJobSpec (#8191)
  • Add additionalMatchLabels to helm chart (#7177)
  • Simulate rsvps after meetup.com retired the feed (#8180)
  • Adding more checkstyle rules (#8197)
  • Add persistence.extraVolumeMounts and persistence.extraVolumes to Kubernetes statefulsets (#7486)
  • Adding scala profile for kafka 2.x build and remove root pom scala dependencies (#8174)
  • Allow real-time data providers to accept non-kafka producers (#8190)
  • Enhance revertReplaceSegments api (#8166)
  • Adding broker level config for disabling Pinot queries with Groovy (#8159)
  • Make presto driver query pinot server with SQL (#8186)
  • Adding controller config for disabling Groovy in ingestionConfig (#8169)
  • Adding main method for LaunchDataIngestionJobCommand for spark-submit command (#8168)
  • Add auth token for segment replace rest APIs (#8146)
  • Add allowRefresh option to UploadSegment (#8125)
  • Add Ingress to Broker and Controller helm charts (#7997)
  • Improve progress reporter in SegmentCreationMapper (#8129)
  • St_* function error messages + support literal transform functions (#8001)
  • Add schema and segment crc to SegmentDirectoryContext (#8127)
  • Extend enableParallelPushProtection support in UploadSegment API (#8110)
  • Support BOOLEAN type in Config Recommendation Engine (#8055)
  • Add a broker metric to distinguish exceptions that happen when acquiring the channel lock from those that happen when sending a request to the server (#8105)
  • Add pinot.minion prefix on minion configs for consistency (#8109)
  • Enable broker service auto-discovery in Jersey framework (#8107)
  • Time out if waiting for the server channel lock takes a long time (#8083)
  • Wire EmptySegmentPruner to routing config (#8067)
  • Support for TIMESTAMP data type in Config Recommendation Engine (#8087)
  • Listener TLS customization (#8082)
  • Add consumption rate limiter for LLConsumer (#6291)
  • Implement Real Time Mutable FST (#8016)
  • Allow quickstart to get table files from filesystem (#8093)
  • Add support for instant segment deletion (#8077)
  • Add a config file to override quickstart configs (#8059)
  • Add pinot server grpc metadata acl (#8030)
  • Move compatibility verifier to a separate module (#8049)
  • Move hadoop and spark ingestion libs from plugins directory to external-plugins (#8048)
  • Add global strategy for partial upsert (#7906)
  • Upgrade kafka to 2.8.1 (#7883)
  • Created EmptyQuickstart command (#8024)
  • Allow SegmentPushUtil to push real-time segment (#8032)
  • Add ignoreMerger for partial upsert (#7907)
  • Make task timeout and concurrency configurable (#8028)
  • Return 503 response from health check on shut down (#7892)
  • Pinot-druid-benchmark: set the multiValueDelimiterEnabled to false when importing TPC-H data (#8012)
  • Cleanup: Remove remaining occurrences of incubator (#8023)
  • Refactor segment loading logic in BaseTableDataManager to decouple it with local segment directory (#7969)
  • Improving segment replacement/revert protocol (#7995)
  • PinotConfigProvider interface (#7984)
  • Enhance listSegments API to exclude the provided segments from the output (#7878)
  • Remove outdated broker metric definitions (#7962)
  • Add skip key for realtimeToOffline job validation (#7921)
  • Upgrade async-http-client (#7968)
  • Allow Reloading Segments with Multiple Threads (#7893)
  • Ignore query options in commented out queries (#7894)
  • Remove TableConfigCache which does not listen on ZK changes (#7943)
  • Switch to zookeeper of helm 3.0x (#7955)
  • Use a single react hook for table status modal (#7952)
  • Add debug logging for real-time ingestion (#7946)
  • Separate the exception for transform and indexing for consuming records (#7926)
  • Disable JsonStatementOptimizer (#7919)
  • Make index readers/loaders pluggable (#7897)
  • Make index creator provision pluggable (#7885)
  • Support loading plugins from multiple directories (#7871)
  • Update helm charts to honour readinessEnabled probes flags on the Controller, Broker, Server and Minion StatefulSets (#7891)
  • Support non-selection-only GRPC server request handler (#7839)
  • GRPC broker request handler (#7838)
  • Add validator for SDF (#7804)
  • Support large payload in zk put API (#7364)
  • Push JSON Path evaluation down to storage layer (#7820)
  • When upserting new record, index the record before updating the upsert metadata (#7860)
  • Add Post-Aggregation Gapfilling functionality (#7781)
  • Clean up deprecated fields from segment metadata (#7853)
  • Remove deprecated method from StreamMetadataProvider (#7852)
  • Obtain replication factor from tenant configuration in case of dimension table (#7848)
  • Use valid bucket end time instead of segment end time for merge/rollup delay metrics (#7827)
  • Make pinot start components command extensible (#7847)
  • Make upsert inner segment update atomic (#7844)
  • Clean up deprecated ZK metadata keys and methods (#7846)
  • Add extraEnv, envFrom to statefulset helm template (#7833)
  • Make openjdk image name configurable (#7832)
  • Add getPredicate() to PredicateEvaluator interface (#7840)
  • Make split commit the default commit protocol (#7780)
  • Pass Pinot connection properties from JDBC driver (#7822)
  • Add Pinot client connection config to allow skip fail on broker response exception (#7816)
  • Change default range index version to v2 (#7815)
  • Put thread timer measuring inside of wall clock timer measuring (#7809)
  • Add getRevertReplaceSegmentRequest method in FileUploadDownloadClient (#7796)
  • Add JAVA_OPTS env var in docker image (#7799)
  • Split thread cpu time into three metrics (#7724)
  • Add config for enabling real-time offset based consumption status checker (#7753)
  • Add timeColumn, timeUnit and totalDocs to the json segment metadata (#7765)
  • Set default Dockerfile CMD to -help (#7767)
  • Add getName() to PartitionFunction interface (#7760)
  • Support Native FST As An Index Subtype for FST Indices (#7729)
  • Add forceCleanup option for 'startReplaceSegments' API (#7744)
  • Add config for keystore types, switch tls to native implementation, and add authorization for server-broker tls channel (#7653)
  • Extend FileUploadDownloadClient to send post request with json body (#7751)

Major Bug Fixes

  • Fix string comparisons (#8253)
  • Bugfix for order-by all sorted optimization (#8263)
  • Fix dockerfile (#8239)
  • Ensure partition function never returns a negative partition (#8221)
  • Handle indexing failures without corrupting inverted indexes (#8211)
  • Fixed broken HashCode partitioning (#8216)
  • Fix segment replace test (#8209)
  • Fix filtered aggregation when it is mixed with regular aggregation (#8172)
  • Fix FST Like query benchmark to remove SQL parsing from the measurement (#8097)
  • Do not identify function types by throwing exceptions (#8137)
  • Fix regression bug caused by sharing TSerializer across multiple threads (#8160)
  • Fix validation before creating a table (#8103)
  • Check cron schedules from table configs after subscribing child changes (#8113)
  • Disallow duplicate segment name in tar file (#8119)
  • Fix storage quota checker NPE for Dimension Tables (#8132)
  • Fix TraceContext NPE issue (#8126)
  • Update gcloud libraries to fix underlying issue with APIs with CMEK (#8121)
  • Fix error handling in jsonPathArray (#8120)
  • Fix error handling in json functions with default values (#8111)
  • Fix controller config validation failure for customized TLS listeners (#8106)
  • Validate the numbers of input and output files in HadoopSegmentCreationJob (#8098)
  • Broker Side validation for the query with aggregation and col but without group by (#7972)
  • Improve the proactive segment clean-up for REVERTED (#8071)
  • Allow JSON forward indexes (#8073)
  • Fix the PinotLLCRealtimeSegmentManager on segment name check (#8058)
  • Always use smallest offset for new partitionGroups (#8053)
  • Fix RealtimeToOfflineSegmentsTaskExecutor to handle time gap (#8054)
  • Refine segment consistency checks during segment load (#8035)
  • Fixes for various JDBC issues (#7784)
  • Delete tmp- segment directories on server startup (#7961)
  • Fix ByteArray datatype column metadata getMaxValue NPE bug and expose maxNumMultiValues (#7918)
  • Fix the issue where an upsert table's uploaded segments get deleted when a server restarts (#7979)
  • Fixed segment upload error return (#7957)
  • Fix QuerySchedulerFactory to plug in custom scheduler (#7945)
  • Fix the issue with grpc broker request handler not started correctly (#7950)
  • Fix real-time ingestion when an entire batch of messages is filtered out (#7927)
  • Move decode method before calling acquireSegment to avoid reference count leak (#7938)
  • Fix semaphore issue in consuming segments (#7886)
  • Add bootstrap mode for PinotServiceManager to avoid glitch for health check (#7880)
  • Fix the broker routing when segment is deleted (#7817)
  • Fix obfuscator not capturing secretkey and keytab (#7794)
  • Fix segment merge delay metric when there is empty bucket (#7761)
  • Fix QuickStart by adding types for invalid/missing type (#7768)
  • Use oldest offset on newly detected partitions (#7756)
  • Fix javadoc to be compatible with jdk8 source (#7754)
  • Handle null segment lineage ZNRecord for getSelectedSegments API (#7752)
  • Handle fields missing in the source in ParquetNativeRecordReader (#7742)

Backward Incompatible Changes

  • Fix the issue with HashCode partitioning function (#8216)
  • Fix the issue with validation on table creation (#8103)
  • Change PinotFS APIs (#8603)
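
Several entries above concern partition functions, which map a key to a partition id between 0 and the number of partitions minus 1 (for example, ensuring that a partition function never produces a negative partition). The sketch below is a general illustration in Python, not Pinot's implementation and not a description of what those PRs changed. It emulates signed 32-bit arithmetic to show how a negative hash can yield a negative partition id under a truncating remainder, and one common way to avoid that. All function names are hypothetical.

```python
INT32_MIN = -2**31


def truncating_rem(a: int, n: int) -> int:
    """Remainder that takes the sign of the dividend, as in Java or C."""
    r = abs(a) % n
    return -r if a < 0 else r


def wrapping_abs(a: int) -> int:
    """Absolute value on signed 32-bit ints: abs(INT32_MIN) wraps back to INT32_MIN."""
    return INT32_MIN if a == INT32_MIN else abs(a)


def naive_partition(hash32: int, num_partitions: int) -> int:
    """Hypothetical naive mapping: negative for any negative hash that is not a multiple."""
    return truncating_rem(hash32, num_partitions)


def abs_partition(hash32: int, num_partitions: int) -> int:
    """Hypothetical abs-based mapping: still negative when the hash is INT32_MIN."""
    return truncating_rem(wrapping_abs(hash32), num_partitions)


def safe_partition(hash32: int, num_partitions: int) -> int:
    """Clear the sign bit first, so the result is always in [0, num_partitions)."""
    return (hash32 & 0x7FFFFFFF) % num_partitions


if __name__ == "__main__":
    # Print the three mappings side by side for a few sample hashes.
    for h in (42, -7, INT32_MIN):
        print(h, naive_partition(h, 6), abs_partition(h, 6), safe_partition(h, 6))
```

Masking the sign bit is only one option; a floor-based modulo would also keep the result non-negative, but it assigns different partition ids, which matters if data has already been partitioned with another scheme.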