Pinot provides a rich CLI for performing almost every operation on the cluster. All commands are run through the pinot-admin.sh script, which is located in the bin/ directory of the Pinot binary distribution, or in /opt/pinot/bin in the Docker container.
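As a minimal sketch of the two locations mentioned above, the following helper prints the path of pinot-admin.sh if it finds one. The PINOT_HOME variable and the find_pinot_admin function name are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Print the path of pinot-admin.sh, checking the two locations named above.
# PINOT_HOME is a hypothetical variable pointing at an unpacked distribution.
find_pinot_admin() {
  for dir in "${PINOT_HOME:-.}/bin" /opt/pinot/bin; do
    if [ -f "$dir/pinot-admin.sh" ]; then
      printf '%s\n' "$dir/pinot-admin.sh"
      return 0
    fi
  done
  echo "pinot-admin.sh not found" >&2
  return 1
}

# Print the resolved path, or a hint when the script is missing.
find_pinot_admin || echo "set PINOT_HOME to the Pinot distribution directory"
```

The distribution directory is checked first, so a local install takes precedence over the Docker path when both exist.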
The following commands are supported by the admin script.
Add Schema
Upload the schema configuration to the controller. If there is already a schema with the same name, it will be updated.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
Add Table
Upload the table configuration to the controller.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
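The sketch below prints an AddTable command line rather than running it, so it can be inspected first. It assumes the -tableConfigFile, -schemaFile, -controllerHost, -controllerPort and -exec options apply to this command. The PINOT_ADMIN variable, the add_table_cmd function, and the file names, host and port are placeholders.

```shell
#!/bin/sh
# Hypothetical location of the admin script; adjust to your installation.
PINOT_ADMIN="${PINOT_ADMIN:-bin/pinot-admin.sh}"

# Print an AddTable command line for the given table config and schema files.
# A 5th argument of "exec" appends -exec; without it the call is a dry run.
add_table_cmd() {
  cmd="$PINOT_ADMIN AddTable -tableConfigFile $1 -schemaFile $2"
  cmd="$cmd -controllerHost $3 -controllerPort $4"
  if [ "${5:-}" = "exec" ]; then
    cmd="$cmd -exec"
  fi
  printf '%s\n' "$cmd"
}

# Dry run first, then the variant that actually uploads the config.
add_table_cmd table.json schema.json localhost 9000
add_table_cmd table.json schema.json localhost 9000 exec
```

Running the dry-run form first is a cheap way to catch a wrong path or host before anything is uploaded.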
Add Tenant
Add a new tenant to the server
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
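As a hedged illustration, the helpers below print AddTenant command lines for a SERVER tenant and a BROKER tenant. The -instanceCount, -offlineInstanceCount and -realTimeInstanceCount options and the BROKER/SERVER role values come from this page; the -name and -role option names are assumptions, as are the PINOT_ADMIN variable, the function names and the tenant names.

```shell
#!/bin/sh
# Hypothetical location of the admin script; adjust to your installation.
PINOT_ADMIN="${PINOT_ADMIN:-bin/pinot-admin.sh}"

# Print an AddTenant command for a SERVER tenant.
# Arguments: tenant name, total instances, offline instances, real-time instances.
# -name and -role are assumed option names; verify them against your version.
server_tenant_cmd() {
  printf '%s AddTenant -name %s -role SERVER -instanceCount %s -offlineInstanceCount %s -realTimeInstanceCount %s\n' \
    "$PINOT_ADMIN" "$1" "$2" "$3" "$4"
}

# Print an AddTenant command for a BROKER tenant (no offline/real-time split).
broker_tenant_cmd() {
  printf '%s AddTenant -name %s -role BROKER -instanceCount %s\n' \
    "$PINOT_ADMIN" "$1" "$2"
}

server_tenant_cmd myServerTenant 4 2 2
broker_tenant_cmd myBrokerTenant 2
```

Neither command includes -exec, so as printed they would only perform a dry run; append -exec to apply the change.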
Check Offline Segment Intervals
Lists all segments that have an invalid time interval. Only OFFLINE segments are supported.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
Change Num Replicas
This command changes the number of replicas of the table. The number of replicas is taken from the latest available table config.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
Change Table State
Enable, disable, or drop a table available in the database.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
Create Segment
Create segment files from an input file in the local filesystem.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
Convert Pinot Segment
Convert a segment file from the Pinot-specific format to other data formats. Currently CSV, AVRO, JSON, and PARQUET are supported.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
Parquet Format Notes
When converting to PARQUET format, the following features are supported:
Multi-value columns (Object[]) are automatically converted to lists
Binary data (byte[]) is wrapped in ByteBuffer for Parquet compatibility
GZIP compression is applied by default for reduced file size
Delete Cluster
Delete the cluster namespace from ZooKeeper.
All the options should be prefixed with - (hyphen)
Option
Description
Launch Data Ingestion Job
Run a job to consume batch or streaming data and push it to Pinot.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
Merge/Rollup Segments
Perform operations similar to the Minion Merge Rollup Task, where multiple segments can be merged based on the provided spec.
This command is mostly intended for debugging. Use the Minion Merge Rollup Task in production.
Fields within the spec file
Field
Description
Move Replica Group
Migrate a subset of a replica group from the current servers to the provided destination servers. This command is intended to be run multiple times if all the replicas of a table are to be migrated to the destination servers.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
Operate Cluster Config
Modify cluster configs for Pinot. These are the configs that apply to all nodes in the cluster.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
Post Query
Execute a SQL query on the cluster.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
Rebalance Table
Rebalance a table, i.e. reassign instances and segments for a table in the cluster.
For segment reassignment, the following modes are offered:
With-downtime rebalance: the IdealState is replaced with the target segment assignment in one go and there are no guarantees around replica availability. This mode returns immediately without waiting for ExternalView to reach the target segment assignment. Disabled tables will always be rebalanced with downtime.
No-downtime rebalance: care is taken to ensure that the configured number of replicas of any segment is available (ONLINE or CONSUMING) at all times. This mode returns after ExternalView reaches the target segment assignment.
In the edge-case scenarios mentioned later, if best-efforts is disabled, the rebalancer fails the rebalance because the no-downtime contract cannot be achieved, and the table might end up in an intermediate state. The user needs to check the rebalance result, resolve the issue, and run the rebalance again if necessary.
If the controller that handles the rebalance goes down or is restarted, the rebalance is not automatically resumed by other controllers.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
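The rebalance modes described above could be selected roughly as in the sketch below, which only prints the command line. The -downtime, -minAvailableReplicas and -bestEfforts options are taken from this page; the RebalanceTable command name and the -tableName option are assumptions to verify against your Pinot version, and PINOT_ADMIN, ZK_ADDRESS, CLUSTER_NAME and the rebalance_cmd function are hypothetical names.

```shell
#!/bin/sh
# Hypothetical defaults; adjust to your installation.
PINOT_ADMIN="${PINOT_ADMIN:-bin/pinot-admin.sh}"
ZK_ADDRESS="${ZK_ADDRESS:-localhost:2181}"
CLUSTER_NAME="${CLUSTER_NAME:-PinotCluster}"

# Print a rebalance command for a table in one of three modes:
#   downtime      replace the IdealState in one go (no availability guarantee)
#   no-downtime   keep at least one replica of every segment available
#   best-efforts  like no-downtime, but continue when the contract cannot be met
rebalance_cmd() {
  table=$1
  mode=$2
  # RebalanceTable and -tableName are assumed names, not confirmed by this page.
  cmd="$PINOT_ADMIN RebalanceTable -zkAddress $ZK_ADDRESS -clusterName $CLUSTER_NAME -tableName $table"
  case "$mode" in
    downtime)     cmd="$cmd -downtime" ;;
    no-downtime)  cmd="$cmd -minAvailableReplicas 1" ;;
    best-efforts) cmd="$cmd -minAvailableReplicas 1 -bestEfforts" ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  printf '%s\n' "$cmd"
}

rebalance_cmd myTable_OFFLINE no-downtime
rebalance_cmd myTable_OFFLINE downtime
```

Because a rebalance is not resumed automatically if the handling controller restarts, it is worth checking the rebalance result after each run regardless of the mode chosen.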
Start Broker
Start a broker instance on the host.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
Start Controller
Start a controller instance on the host.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
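A minimal sketch of a controller start command follows, assuming the -zkAddress, -clusterName and -configFileName options apply to this command. It prints the command instead of running it. The PINOT_ADMIN variable, the start_controller_cmd function, and the host, cluster and file names are placeholders.

```shell
#!/bin/sh
# Hypothetical location of the admin script; adjust to your installation.
PINOT_ADMIN="${PINOT_ADMIN:-bin/pinot-admin.sh}"

# Print a StartController command line.
# Arguments: ZooKeeper address (comma-separated host:port list), cluster name,
# and an optional path to a properties file with controller configs.
start_controller_cmd() {
  cmd="$PINOT_ADMIN StartController -zkAddress $1 -clusterName $2"
  if [ -n "${3:-}" ]; then
    cmd="$cmd -configFileName $3"
  fi
  printf '%s\n' "$cmd"
}

# Two ZooKeeper hosts, no config file.
start_controller_cmd zk1:2181,zk2:2181 PinotCluster
# Single ZooKeeper host plus a properties file.
start_controller_cmd localhost:2181 PinotCluster conf/controller.conf
```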
Start Server
Start a server instance on the host.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
Start Service Manager
Start multiple Pinot processes with all the default configurations using a single command.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
Show Cluster Info
Show all the available cluster namespaces along with their metadata.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
Stop Process
Stop all the processes of the specified types running on the host.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
Upload Segments
Compress and upload segment files to server.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
Validate Config
Validate the table configs and schema present in Zookeeper.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
Validate Segment
Compares the Helix IdealState and ExternalView for the specified table prefixes.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
Verify Cluster State
Verify whether all the tables in the cluster have the same Ideal State and External View.
Supported Options
All the options should be prefixed with - (hyphen)
Option
Description
exec
If not specified, a dry run will be done but configs won't actually be uploaded.
instanceCount
total number of instances to assign to this tenant
offlineInstanceCount
(only applicable for SERVER) total number of instances which can host offline tables belonging to this tenant
realTimeInstanceCount
(only applicable for SERVER) total number of instances which can host real-time tables belonging to this tenant
exec
If not specified, a dry run will be done but configs won't actually be uploaded.
tableConfigFile
Path to
schemaFile
Path to
readerConfigFile
properties file containing the config related to the reader. See
retry
Number of retry attempts in case of failure
postCreationVerification
Set true to verify the segment files post creation.
numThreads
Number of threads to use to execute the segment creation job
csvDelimiter
delimiter to use for CSV files. Only applicable to CSV.
csvListDelimiter
delimiter to use for lists/arrays in CSV files. Only applicable to CSV.
csvWithHeader
set to print CSV header in output file. Default is false.
Example usage: pinot-admin.sh ConvertPinotSegment -dataDir /path/to/segments -outputDir /path/to/output -outputFormat PARQUET
timeHandlerConfig
configs related to time handling, including type, startTimeMs, endTimeMs, roundBucketMs, partitionBucketMs
partitionerConfigs
list of partition related configs, including partitionerType, numPartitions, columnName, transformFunction, columnPartitionConfig
mergeType
CONCAT, ROLLUP, DEDUP
aggregationTypes
map from metric column to aggregation function type for the ROLLUP merge type
segmentConfig
configs related to the generated segments, including maxNumRecordsPerSegment, segmentNamePrefix
zkHost
zookeeper host:port string
cluster
name of the cluster inside zookeeper.
exec
set to execute the command. If unset, only a dry run will be done
If best-efforts is enabled, the rebalancer logs a warning and continues the rebalance, but the no-downtime contract is not guaranteed.
Downtime can occur in the following edge-case scenarios:
A segment falls into the ERROR state in ExternalView: with best-efforts, the ERROR state is counted as a good state.
ExternalView has not converged within the maximum wait time: with best-efforts, the rebalance continues to the next stage.
includeConsuming
set to reassign CONSUMING segments for real-time table (false by default)
bootstrap
set to rebalance table in bootstrap mode (regardless of minimum segment movement, reassign all segments in a round-robin fashion as if adding new segments to an empty table, false by default)
downtime
Set to allow downtime for the rebalance (false by default)
minAvailableReplicas
minimum number of replicas to keep alive during rebalance, or, if the value is negative, the maximum number of replicas allowed to be unavailable (default is 1). Only applicable if downtime is set to false.
bestEfforts
set to use best-efforts to rebalance i.e. not fail the rebalance when the no-downtime contract cannot be achieved, false by default
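The two readings of minAvailableReplicas can be worked through with a little arithmetic. The sketch below is an illustration of the description above, not Pinot's implementation: the effective_min_available function is hypothetical, and clamping the result at zero is an assumption.

```shell
#!/bin/sh
# effective_min_available NUM_REPLICAS MIN_AVAILABLE_REPLICAS
# A non-negative setting is the minimum number of replicas kept alive.
# A negative setting is the maximum number allowed to be unavailable,
# so the minimum kept alive is NUM_REPLICAS plus the (negative) setting.
effective_min_available() {
  replicas=$1
  setting=$2
  if [ "$setting" -lt 0 ]; then
    result=$((replicas + setting))
    # Assumption for this sketch: never report fewer than zero replicas.
    if [ "$result" -lt 0 ]; then
      result=0
    fi
  else
    result=$setting
  fi
  printf '%s\n' "$result"
}

effective_min_available 3 1     # prints 1: keep at least one replica alive
effective_min_available 3 -1    # prints 2: at most one replica may be down
```

With three replicas, the default of 1 lets two replicas move at once, while -1 moves them one at a time.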
configFileName
path to properties file containing controller configs. See for complete configuration
zkAddress
comma-separated host:port string of Zookeeper to connect
clusterName
name of the cluster to connect to. It can be thought of as a namespace inside zookeeper.
configFileName
path to properties file containing controller configs. See for complete configuration
segmentDir
directory in which to download the .tar segment files temporarily
zkAddress
comma-separated host:port string of zookeeper to connect
clusterName
name of the cluster to connect to. It can be thought of as a namespace inside zookeeper.
configFileName
path to properties file containing controller configs. See for complete configuration
bootstrapServices
list of service roles to start with default configurations. For these roles, the default configuration will be taken even if bootstrapConfig is provided.
kafka
Stop all the Kafka processes. A process must have been started by the pinot-admin script; otherwise it won't be killed.
schema
if set, schemas are validated
schemaNames
space-separated list of schema names. By default, all schemas are validated
schemaFile
path to schema JSON file mentioned in table configuration.
controllerHost
controllerHost on which to send the upload requests
controllerPort
controllerPort on which to send the upload requests
If not specified, a dry run will be done but configs won't actually be uploaded.
controllerPort on which to send the upload requests
where the tenant should reside. can be BROKER or SERVER
If not specified, a dry run will be done but configs won't actually be uploaded.
can be one of enable, disable or drop
Set to true to overwrite segments already present in the directory
set to overwrite files that are already present in the output directory
path to schema of the table for which segment should be merged
maximum number of segments to move. default is Integer.MAX_VALUE
The port on which to send the requests
set to reassign instances before reassigning segments (false by default)
name of the cluster to connect to. It can be thought of as a namespace inside zookeeper.
path to directory to store data. Default is java.io.tmpDir + PinotController
directory in which to store the data
list of Pinot config file paths. Each config file requires an extra config: pinot.service.role to indicate which service to start. The service role can be one of CONTROLLER, BROKER or SERVER
Stop all the ZooKeeper processes. A process must have been started by the pinot-admin script; otherwise it won't be killed.
name of the table to push the segments in
space-separated list of table names. By default, all tables are validated
timeout in seconds for the request to check the cluster state.