Tiered Storage
Tiered storage allows you to split your server storage into multiple tiers. All the tiers can use different filesystem to hold the data. Tiered storage can be used to optimise the cost to latency tradeoff in production Pinot systems.
Some example scenarios in which tiered storage can be used -
Tables with very long retention (more than 2 years) but most frequently queries are performed on the recent data.
Reduce storage cost for older data while tolerating slightly higher latencies In order to optimize for low latency, we often recommend using high performance SSDs. But if such a use case has 2 years of data, and need the high performance only when querying 1 month of data, it might become desirable to keep only the recent time ranges on SSDs, and keep the less frequently queried ones on cheaper nodes such as HDDs or a DFS such as S3.
The data age based tiers is just one of the examples. The logic to split data into tiers may change depending on the use case.
Tier Config
You can configured tiered storage by setting the tieredConfigs
key in your table config json.
Example
In this example, the table uses servers tagged with base_OFFLINE
. We have created two tiers of Pinot servers, tagged with tier_a_OFFLINE
and tier_b_OFFLINE
. Segments older than 7 days will move from base_OFFLINE
to tier_a_OFFLINE
, and segments older than 15 days will move to tier_b_OFFLINE
.
Following properties are supported under tierConfigs
-
name | Name of the tier. Every tier in the tierConfigs list must have a unique name |
segmentSelectorType | The strategy used for selecting segments for tiers. The only supported strategy as of now is |
segmentAge | This property is required when |
storageType | The type of storage. The only supported type is |
serverTag | This property is required when |
How does data move from one tenant to another?
On adding tier config, a periodic task on the pinot-controller called "SegmentRelocator" will move segments from one tenant to another, as and when the segment crosses the segment age.
This periodic task runs every hour by default. You can configure this frequency by setting the config with any period string (60s, 2h, 5d)
This job can also be triggered manually
Under the hood, this job runs a rebalance. So you can achieve the same effect as a manual trigger by running a rebalance
Last updated