HDFS as Deep Storage
This guide shows how to set up HDFS as deep storage for Pinot segments.
Server Setup
Configuration
pinot.server.instance.enable.split.commit=true
pinot.server.storage.factory.class.hdfs=org.apache.pinot.plugin.filesystem.HadoopPinotFS
pinot.server.storage.factory.hdfs.hadoop.conf.path=/path/to/hadoop/conf/directory/
# Server side: instructs the HadoopPinotFS plugin to use the specified keytab and principal when accessing HDFS paths
pinot.server.storage.factory.hdfs.hadoop.kerberos.principle=<hdfs-principle>
pinot.server.storage.factory.hdfs.hadoop.kerberos.keytab=<hdfs-keytab>
pinot.server.segment.fetcher.protocols=file,http,hdfs
pinot.server.segment.fetcher.hdfs.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
pinot.server.segment.fetcher.hdfs.hadoop.kerberos.principle=<your kerberos principal>
pinot.server.segment.fetcher.hdfs.hadoop.kerberos.keytab=<your kerberos keytab>
pinot.set.instance.id.to.hostname=true
pinot.server.instance.dataDir=/path/in/local/filesystem/for/pinot/data/server/index
pinot.server.instance.segmentTarDir=/path/in/local/filesystem/for/pinot/data/server/segment
pinot.server.grpc.enable=true
pinot.server.grpc.port=8090
Executable
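Before starting the server, it may help to sanity-check the configuration listed above. The following is a minimal sketch, not part of Pinot: the function names `parse_properties` and `check_server_config` are hypothetical, and the rule that a principal and keytab should be set together is an assumption drawn from how the two Kerberos property sets are paired in the listing. The key spelling `principle` is copied as it appears in the configuration.

```python
# Hypothetical pre-flight check for a Pinot server config; not a Pinot API.

# Keys taken from the server configuration listing above.
REQUIRED_KEYS = [
    "pinot.server.storage.factory.class.hdfs",
    "pinot.server.storage.factory.hdfs.hadoop.conf.path",
    "pinot.server.segment.fetcher.protocols",
    "pinot.server.segment.fetcher.hdfs.class",
]

# Kerberos settings appear twice in the listing: once for the storage
# factory and once for the segment fetcher.
KERBEROS_PREFIXES = [
    "pinot.server.storage.factory.hdfs",
    "pinot.server.segment.fetcher.hdfs",
]


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_server_config(props):
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    for key in REQUIRED_KEYS:
        if not props.get(key):
            problems.append(f"missing: {key}")

    protocols = props.get("pinot.server.segment.fetcher.protocols", "")
    if "hdfs" not in [p.strip() for p in protocols.split(",")]:
        problems.append("segment fetcher protocols do not include hdfs")

    # Assumption: a principal without a keytab (or the reverse) is a mistake.
    for prefix in KERBEROS_PREFIXES:
        principal = props.get(prefix + ".hadoop.kerberos.principle")
        keytab = props.get(prefix + ".hadoop.kerberos.keytab")
        if bool(principal) != bool(keytab):
            problems.append(f"{prefix}: principal and keytab must be set together")
    return problems
```

To use it, read the server configuration file into a string, pass it to `parse_properties`, and print whatever `check_server_config` returns. The check covers only the keys shown in the listing and does not contact HDFS or a KDC.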
Controller Setup
Configuration
Executable
Broker Setup
Configuration
Executable
Kerberos Authentication
1. Automatic Authentication (Recommended)
Why These Properties Are Required
Understanding the Two Sets of Kerberos Properties
Benefits of Automatic Authentication
2. Manual Authentication (Legacy)
Troubleshooting
HDFS FileSystem Issues
Kerberos Authentication Issues
Error: "Failed to authenticate with Kerberos"
Error: "GSSException: No valid credentials provided"
Error: "Unable to obtain Kerberos password" or "Clock skew too great"
Error: "HDFS operation fails after running for several hours"
Verifying Kerberos Configuration
Best Practices