Pinot provides a rich CLI that can perform almost every operation on the cluster. All commands are executed using the pinot-admin.sh script, which is located in the bin/ directory of the Pinot binary distribution, or at /opt/pinot/bin in the Docker container.

The following commands are supported by the admin script.
### AddSchema

Upload the schema configuration to the controller. If a schema with the same name already exists, it will be updated.

All options should be prefixed with - (hyphen).
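As an illustration, an invocation might look like the following (the host, port, and file path are placeholders to adapt to your setup):

```shell
bin/pinot-admin.sh AddSchema \
  -schemaFile /path/to/my-schema.json \
  -controllerHost localhost \
  -controllerPort 9000 \
  -exec
```

Omitting -exec performs a dry run without actually uploading the schema.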
### AddTable

Upload the table configuration to the controller.

All options should be prefixed with - (hyphen).
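A sketch of a typical invocation (file paths, host, and port are placeholders):

```shell
bin/pinot-admin.sh AddTable \
  -tableConfigFile /path/to/my-table-config.json \
  -schemaFile /path/to/my-schema.json \
  -controllerHost localhost \
  -controllerPort 9000 \
  -exec
```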
### AddTenant

Add a new tenant to the cluster.

All options should be prefixed with - (hyphen).
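For example, creating a server tenant with four instances split between offline and real-time tables might look like this (names and counts are illustrative):

```shell
bin/pinot-admin.sh AddTenant \
  -name sampleTenant \
  -role SERVER \
  -instanceCount 4 \
  -offlineInstanceCount 2 \
  -realTimeInstanceCount 2 \
  -controllerHost localhost \
  -controllerPort 9000 \
  -exec
```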
### CheckOfflineSegmentIntervals

Lists all the segments which have an invalid time interval. Only OFFLINE segments are supported.

All options should be prefixed with - (hyphen).
### ChangeNumReplicas

This command changes the number of replicas of a table. The replica count is taken from the latest available table config.

All options should be prefixed with - (hyphen).
### ChangeTableState

Enable, disable, or drop a table available in the database.

All options should be prefixed with - (hyphen).
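For example, disabling a table might look like this (table name, host, and port are placeholders):

```shell
bin/pinot-admin.sh ChangeTableState \
  -tableName myTable \
  -state disable \
  -controllerHost localhost \
  -controllerPort 9000
```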
### CreateSegment

Create segment files from input files in the local filesystem.

All options should be prefixed with - (hyphen).
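A minimal sketch of generating segments from local CSV input (all paths are placeholders):

```shell
bin/pinot-admin.sh CreateSegment \
  -dataDir /path/to/input \
  -format CSV \
  -outDir /path/to/segments \
  -tableConfigFile /path/to/my-table-config.json \
  -schemaFile /path/to/my-schema.json
```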
### ConvertPinotSegment

Convert segment files from the Pinot-specific format to other data formats. Currently only CSV, AVRO, and JSON are supported.

All options should be prefixed with - (hyphen).
### DeleteCluster

Delete the cluster namespace from Zookeeper.

All options should be prefixed with - (hyphen).
### LaunchDataIngestionJob

Run a job to consume batch or streaming data and push it to Pinot.

All options should be prefixed with - (hyphen).
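For example, launching a batch ingestion job from a spec file might look like this (the spec file path is a placeholder):

```shell
bin/pinot-admin.sh LaunchDataIngestionJob \
  -jobSpecFile /path/to/ingestion-job-spec.yaml
```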
### SegmentProcessorFramework

Perform operations similar to the Minion merge rollup task, where multiple segments can be merged based on the provided spec. This command is mostly for debugging purposes; use the Minion merge rollup task in production.
### MoveReplicaGroup

Migrate a subset of a replica group from the current servers to the provided destination servers. This command is intended to be run multiple times to migrate all the replicas of a table to the destination servers (if intended).

All options should be prefixed with - (hyphen).
### OperateClusterConfig

Modify cluster-level configs for Pinot. These configs apply to all nodes in the cluster.

All options should be prefixed with - (hyphen).
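For example, updating a cluster-level config might look like this (the config name shown is illustrative; note that for ADD and UPDATE the value is provided after an equals sign):

```shell
bin/pinot-admin.sh OperateClusterConfig \
  -operation UPDATE \
  -config allowParticipantAutoJoin=true \
  -controllerHost localhost \
  -controllerPort 9000
```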
### PostQuery

Execute a SQL query on the cluster.

All options should be prefixed with - (hyphen).
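A sketch of running a query against a broker (the table name is a placeholder; 8099 is the default broker port):

```shell
bin/pinot-admin.sh PostQuery \
  -brokerHost localhost \
  -brokerPort 8099 \
  -queryType sql \
  -query "SELECT COUNT(*) FROM myTable"
```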
### RebalanceTable

Rebalance a table, i.e. reassign instances and segments for a table in the cluster.

For segment reassignment, the following modes are offered:

- With-downtime rebalance: the IdealState is replaced with the target segment assignment in one go, and there are no guarantees around replica availability. This mode returns immediately without waiting for the ExternalView to reach the target segment assignment. Disabled tables are always rebalanced with downtime.
- No-downtime rebalance: care is taken to ensure that the configured number of replicas of any segment are available (ONLINE or CONSUMING) at all times. This mode returns after the ExternalView reaches the target segment assignment.

In the edge-case scenarios listed below, if bestEfforts is disabled, the rebalancer fails the rebalance because the no-downtime contract cannot be achieved, and the table might end up in an intermediate state. The user then needs to check the rebalance result, resolve the issue, and run the rebalance again if necessary. If bestEfforts is enabled, the rebalancer logs a warning and continues the rebalance, but the no-downtime contract is not guaranteed.

Downtime can occur in the following edge-case scenarios:

- A segment falls into ERROR state in the ExternalView: with best efforts, the ERROR state is counted as a good state.
- The ExternalView has not converged within the maximum wait time: with best efforts, the rebalance continues to the next stage.
- If the controller handling the rebalance goes down or is restarted, the rebalance is not automatically resumed by other controllers.

All options should be prefixed with - (hyphen).
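A minimal sketch of a rebalance invocation (Zookeeper address, cluster name, and table name are placeholders; boolean options such as downtime or bestEfforts can be added as needed):

```shell
bin/pinot-admin.sh RebalanceTable \
  -zkAddress localhost:2181 \
  -clusterName PinotCluster \
  -tableName myTable_OFFLINE
```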
### StartBroker

Start a broker instance on the host.

All options should be prefixed with - (hyphen).
### StartController

Start a controller instance on the host.

All options should be prefixed with - (hyphen).
### StartServer

Start a server instance on the host.

All options should be prefixed with - (hyphen).
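The three core components can be started individually; a minimal sketch, assuming a local Zookeeper and using the default ports (addresses and cluster name are placeholders):

```shell
bin/pinot-admin.sh StartController -zkAddress localhost:2181 -clusterName PinotCluster -controllerPort 9000
bin/pinot-admin.sh StartBroker -zkAddress localhost:2181 -clusterName PinotCluster -brokerPort 8099
bin/pinot-admin.sh StartServer -zkAddress localhost:2181 -clusterName PinotCluster -serverPort 8098
```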
### StartServiceManager

Start multiple Pinot processes with all the default configurations using a single command.

All options should be prefixed with - (hyphen).
### ShowClusterInfo

Show all the available cluster namespaces along with their metadata.

All options should be prefixed with - (hyphen).
### StopProcess

Stop all processes of the specified types running on the host.

All options should be prefixed with - (hyphen).
### UploadSegment

Compress and upload segment files to the server.

All options should be prefixed with - (hyphen).
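For example, uploading all segments in a local directory might look like this (the directory and table name are placeholders):

```shell
bin/pinot-admin.sh UploadSegment \
  -controllerHost localhost \
  -controllerPort 9000 \
  -segmentDir /path/to/segments \
  -tableName myTable
```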
### ValidateConfig

Validate the table configs and schemas present in Zookeeper.

All options should be prefixed with - (hyphen).
### VerifySegmentState

Compares the Helix IdealState and ExternalView for the specified table prefixes.

All options should be prefixed with - (hyphen).
### VerifyClusterState

Verify that all tables in the cluster have the same IdealState and ExternalView.

All options should be prefixed with - (hyphen).
**AddSchema**

Option | Description |
---|---|
schemaFile | Path to the schema JSON file mentioned in the table configuration |
controllerHost | Controller host to which to send the upload requests |
controllerPort | Controller port to which to send the upload requests |
exec | If not specified, a dry run is done but configs are not actually uploaded |
**AddTable**

Option | Description |
---|---|
tableConfigFile | Path to the JSON file containing the table configuration |
schemaFile | Path to the schema JSON file mentioned in the table configuration |
controllerHost | Controller host to which to send the upload requests |
controllerPort | Controller port to which to send the upload requests |
exec | If not specified, a dry run is done but configs are not actually uploaded |
**AddTenant**

Option | Description |
---|---|
controllerHost | Controller host to which to send the request |
controllerPort | Controller port to which to send the request |
name | Name of the tenant |
role | Where the tenant should reside; can be BROKER or SERVER |
instanceCount | Total number of instances to assign to this tenant |
offlineInstanceCount | (Only applicable for SERVER) Total number of instances which can host offline tables belonging to this tenant |
realTimeInstanceCount | (Only applicable for SERVER) Total number of instances which can host real-time tables belonging to this tenant |
exec | If not specified, a dry run is done but configs are not actually uploaded |
**CheckOfflineSegmentIntervals**

Option | Description |
---|---|
zkAddress | Comma-separated host:port string of the Zookeeper to connect to |
clusterName | Name of the cluster to connect to; can be thought of as a namespace inside Zookeeper |
tableName | Comma-separated list of tables to check for invalid segment intervals; by default, all tables are checked |
**ChangeNumReplicas**

Option | Description |
---|---|
zkAddress | Comma-separated host:port string of the Zookeeper to connect to |
clusterName | Name of the cluster to connect to; can be thought of as a namespace inside Zookeeper |
tableName | Name of the table on which to perform the operation |
exec | If not specified, a dry run is done but configs are not actually uploaded |
**ChangeTableState**

Option | Description |
---|---|
controllerHost | Controller host to which to send the request |
controllerPort | Controller port to which to send the request |
tableName | Name of the table to modify |
state | Can be one of enable, disable, or drop |
**CreateSegment**

Option | Description |
---|---|
dataDir | Directory containing the input files |
format | Input data format; see Input formats for all the supported formats |
outDir | Local output directory in which to publish the segments |
overwrite | Set to true to overwrite segments already present in the directory |
tableConfigFile | Path to the table config |
schemaFile | Path to the schema config |
readerConfigFile | Properties file containing the config related to the reader; see Input formats |
retry | Number of retry attempts in case of failure |
postCreationVerification | Set to true to verify the segment files post creation |
numThreads | Number of threads to use to execute the segment creation job |
**ConvertPinotSegment**

Option | Description |
---|---|
dataDir | Directory containing the segment files; only local file paths are supported |
outputDir | Directory to put the converted segment files in |
outputFormat | Format to output the files in; can be one of CSV, AVRO, or JSON |
overwrite | Set to overwrite files already present in the output directory |
csvDelimiter | Delimiter to use for CSV files; only applicable to CSV |
csvListDelimiter | Delimiter to use for lists/arrays in CSV files; only applicable to CSV |
csvWithHeader | Set to print the CSV header in the output file; default is false |
**DeleteCluster**

Option | Description |
---|---|
clusterName | Name of the cluster to delete |
zkAddress | Comma-separated host:port list of the Zookeeper from which to delete the cluster namespace |
**LaunchDataIngestionJob**

Option | Description |
---|---|
jobSpecFile | Path to the job spec file; only local file paths are supported |
propertyFile | Path to a properties file; this file can contain properties related to the ingestion job or template parameters |
values | List of strings containing the values to replace template parameters with |
**SegmentProcessorFramework**

Field | Description |
---|---|
inputSegmentsDir | Directory that contains all the input segment files or directories to be merged |
outputSegmentsDir | Directory in which the merged segment files should be put |
tableConfigFile | Path to the table config for which segments are to be merged |
schemaFile | Path to the schema of the table for which segments should be merged |
timeHandlerConfig | Configs related to time handling, including type, startTimeMs, endTimeMs, roundBucketMs, and partitionBucketMs |
partitionerConfigs | List of partition-related configs, including partitionerType, numPartitions, columnName, transformFunction, and columnPartitionConfig |
mergeType | CONCAT, ROLLUP, or DEDUP |
aggregationTypes | Map from metric column to aggregation function type for the ROLLUP merge type |
segmentConfig | Configs related to the generated segments, including maxNumRecordsPerSegment and segmentNamePrefix |
**MoveReplicaGroup**

Option | Description |
---|---|
srcHosts | Path of the file with all the source hosts, or a comma-separated list of hostnames |
destHostsFile | Path of the file with all the destination hosts |
tableName | Name of the table for which the replica group is to be moved; currently supports only OFFLINE tables |
maxSegmentsToMove | Maximum number of segments to move; default is Integer.MAX_VALUE |
zkHost | Zookeeper host:port string |
cluster | Name of the cluster inside Zookeeper |
exec | Set to execute the command; if unset, only a dry run is done |
**OperateClusterConfig**

Option | Description |
---|---|
operation | Type of operation to perform; can be one of GET, ADD, UPDATE, or DELETE |
config | The config on which the operation should be performed; in case of ADD or UPDATE, the config value is provided after = |
controllerHost | The host to which to send the request |
controllerPort | The port to which to send the requests |
**PostQuery**

Option | Description |
---|---|
brokerHost | Broker host on which to execute the query |
brokerPort | Broker port on which to execute the query |
queryType | Can be one of sql or pql (deprecated) |
query | SQL query to execute |
**RebalanceTable**

Option | Description |
---|---|
zkAddress | Comma-separated host:port string of the Zookeeper to connect to |
clusterName | Name of the cluster to connect to; can be thought of as a namespace inside Zookeeper |
tableName | Name of the table on which to perform the operation |
reassignInstances | Set to reassign instances before reassigning segments (false by default) |
includeConsuming | Set to reassign CONSUMING segments for real-time tables (false by default) |
bootstrap | Set to rebalance the table in bootstrap mode: regardless of minimum segment movement, reassign all segments in a round-robin fashion as if adding new segments to an empty table (false by default) |
downtime | Set to allow downtime for the rebalance (false by default) |
minAvailableReplicas | Minimum number of replicas to keep alive during the rebalance, or the maximum number of replicas allowed to be unavailable if the value is negative (default is 1); only applicable if downtime is set to false |
bestEfforts | Set to use best efforts to rebalance, i.e. not fail the rebalance when the no-downtime contract cannot be achieved (false by default) |
**StartBroker**

Option | Description |
---|---|
brokerHost | Hostname of the instance on which to run the broker |
brokerPort | Port on which the broker should listen; default is 8099 |
zkAddress | Comma-separated host:port string of the Zookeeper to connect to |
clusterName | Name of the cluster to connect to; can be thought of as a namespace inside Zookeeper |
configFileName | Path to a properties file containing broker configs; see Broker for the complete configuration |
**StartController**

Option | Description |
---|---|
controllerMode | Should be one of dual, pinot_only, or helix_only; default is dual |
controllerHost | Hostname of the instance on which to run the controller |
controllerPort | Port on which the controller should listen; default is 9000 |
dataDir | Path to the directory in which to store data; default is java.io.tmpDir + PinotController |
zkAddress | Comma-separated host:port string of the Zookeeper to connect to |
clusterName | Name of the cluster to connect to; can be thought of as a namespace inside Zookeeper |
configFileName | Path to a properties file containing controller configs; see Controller for the complete configuration |
**StartServer**

Option | Description |
---|---|
serverHost | Hostname of the instance on which to run the server |
serverPort | Port on which the server should listen; default is 8098 |
serverAdminPort | Port on which the admin API should be available; default is 8097 |
dataDir | Directory in which to store the data |
segmentDir | Directory into which to temporarily download the .tar segment files |
zkAddress | Comma-separated host:port string of the Zookeeper to connect to |
clusterName | Name of the cluster to connect to; can be thought of as a namespace inside Zookeeper |
configFileName | Path to a properties file containing server configs; see Server for the complete configuration |
**StartServiceManager**

Option | Description |
---|---|
zkAddress | Comma-separated host:port string of the Zookeeper to connect to |
clusterName | Name of the cluster to connect to; can be thought of as a namespace inside Zookeeper |
port | Set to -1 to disable, or 0 to run the service manager on any available port |
bootstrapConfigPaths | List of Pinot config file paths; each config file requires an extra config, pinot.service.role, to indicate which service to start. The service role can be one of CONTROLLER, BROKER, or SERVER |
bootstrapServices | List of service roles to start with default configurations; for these roles, the default configuration is used even if bootstrapConfig is provided |
**ShowClusterInfo**

Option | Description |
---|---|
zkAddress | Comma-separated host:port string of the Zookeeper to connect to |
clusterName | Name of the cluster to connect to; can be thought of as a namespace inside Zookeeper |
tables | |
tags | |
**StopProcess**

Option | Description |
---|---|
controller | Stop all controller processes |
broker | Stop all broker processes |
server | Stop all server processes |
zookeeper | Stop all Zookeeper processes; the process must have been started by the Pinot admin script, otherwise it won't be killed |
kafka | Stop all Kafka processes; the process must have been started by the Pinot admin script, otherwise it won't be killed |
**UploadSegment**

Option | Description |
---|---|
controllerHost | Hostname or IP of the controller |
controllerPort | Port of the controller |
segmentDir | Local directory containing the segment files |
tableName | Name of the table to push the segments into |
**ValidateConfig**

Option | Description |
---|---|
zkAddress | Comma-separated host:port string of the Zookeeper to connect to |
clusterName | Name of the cluster to connect to; can be thought of as a namespace inside Zookeeper |
tableConfig | If set, table configs are validated |
tableNames | Space-separated list of table names; by default, all tables are validated |
schema | If set, schemas are validated |
schemaNames | Space-separated list of schema names; by default, all schemas are validated |
**VerifySegmentState**

Option | Description |
---|---|
zkAddress | Comma-separated host:port string of the Zookeeper to connect to |
clusterName | Name of the cluster to connect to; can be thought of as a namespace inside Zookeeper |
tablePrefix | Prefix of the table names for which the validation should be done |
**VerifyClusterState**

Option | Description |
---|---|
zkAddress | Comma-separated host:port string of the Zookeeper to connect to |
clusterName | Name of the cluster to connect to; can be thought of as a namespace inside Zookeeper |
tableName | Name of the table for which the validation should be done; by default, all tables are verified |
timeoutSec | Timeout in seconds for the request to check the cluster state |