HDFS
This guide shows you how to configure HDFS for use with Pinot, including data import and deep storage.
-Dplugins.dir=/opt/pinot/plugins -Dplugins.include=pinot-hdfs

export HADOOP_HOME=/local/hadoop/
export HADOOP_VERSION=2.7.1
export HADOOP_GUAVA_VERSION=11.0.2
export HADOOP_GSON_VERSION=2.2.4
export CLASSPATH_PREFIX="${HADOOP_HOME}/share/hadoop/hdfs/hadoop-hdfs-${HADOOP_VERSION}.jar:${HADOOP_HOME}/share/hadoop/common/lib/hadoop-annotations-${HADOOP_VERSION}.jar:${HADOOP_HOME}/share/hadoop/common/lib/hadoop-auth-${HADOOP_VERSION}.jar:${HADOOP_HOME}/share/hadoop/common/hadoop-common-${HADOOP_VERSION}.jar:${HADOOP_HOME}/share/hadoop/common/lib/guava-${HADOOP_GUAVA_VERSION}.jar:${HADOOP_HOME}/share/hadoop/common/lib/gson-${HADOOP_GSON_VERSION}.jar"

Push HDFS segment to Pinot Controller
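With the plugin loaded and the Hadoop jars on the classpath, a segment build-and-push can be launched with the admin tool's ingestion job command. The job spec path below is a placeholder to replace with your own file.

```shell
bin/pinot-admin.sh LaunchDataIngestionJob \
  -jobSpecFile /path/to/hdfs-job-spec.yaml
```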
Examples
Job spec
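A minimal sketch of an ingestion job spec that reads input from and writes segments to HDFS. The directory URIs and the Hadoop conf path are placeholders; the `pinotFSSpecs` entry is what registers the `hdfs://` scheme with the `HadoopPinotFS` plugin class.

```yaml
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
jobType: SegmentCreationAndTarPush
inputDirURI: 'hdfs://namenode/path/to/input'
outputDirURI: 'hdfs://namenode/path/to/output'
pinotFSSpecs:
  - scheme: hdfs
    className: org.apache.pinot.plugin.filesystem.HadoopPinotFS
    configs:
      'hadoop.conf.path': '/etc/hadoop/conf'
```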
Controller config
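A sketch of the controller properties for HDFS, assuming paths and the namenode URI are replaced with your own. The storage factory maps the `hdfs` scheme to the plugin class, and the fetcher entries let the controller download segments over that scheme.

```properties
controller.data.dir=hdfs://namenode/path/to/controller/data
controller.local.temp.dir=/tmp/pinot/controller
pinot.controller.storage.factory.class.hdfs=org.apache.pinot.plugin.filesystem.HadoopPinotFS
pinot.controller.storage.factory.hdfs.hadoop.conf.path=/etc/hadoop/conf
pinot.controller.segment.fetcher.protocols=file,http,hdfs
pinot.controller.segment.fetcher.hdfs.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcherFactory
```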
Server config
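The server-side equivalent is the same set of keys under the `pinot.server` prefix; the Hadoop conf path below is a placeholder.

```properties
pinot.server.storage.factory.class.hdfs=org.apache.pinot.plugin.filesystem.HadoopPinotFS
pinot.server.storage.factory.hdfs.hadoop.conf.path=/etc/hadoop/conf
pinot.server.segment.fetcher.protocols=file,http,hdfs
pinot.server.segment.fetcher.hdfs.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcherFactory
```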
Minion config
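Minions that run tasks against HDFS-backed segments need the same registration under the `pinot.minion` prefix; again, the conf path is a placeholder.

```properties
pinot.minion.storage.factory.class.hdfs=org.apache.pinot.plugin.filesystem.HadoopPinotFS
pinot.minion.storage.factory.hdfs.hadoop.conf.path=/etc/hadoop/conf
pinot.minion.segment.fetcher.protocols=file,http,hdfs
pinot.minion.segment.fetcher.hdfs.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcherFactory
```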
HDFS as deep storage
Server setup
Configuration
Executable
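As a sketch, the server can then be started with the Hadoop environment exported as shown earlier (so `CLASSPATH_PREFIX` picks up the Hadoop jars); the ZooKeeper address and config file path are placeholders.

```shell
bin/pinot-admin.sh StartServer \
  -zkAddress localhost:2181 \
  -configFileName /path/to/pinot-server.conf
```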
Controller setup
Configuration
Executable
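The controller is started the same way, pointing at the config file that contains the deep-storage properties; the addresses and path below are placeholders.

```shell
bin/pinot-admin.sh StartController \
  -zkAddress localhost:2181 \
  -configFileName /path/to/pinot-controller.conf
```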
Broker setup
Configuration
Executable
Kerberos authentication
1. Automatic authentication (recommended)
Why these properties are required
Understanding the two sets of Kerberos properties
Benefits of automatic authentication
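As a sketch of automatic authentication on the controller, the plugin takes a principal and keytab pair under the storage-factory prefix. Note that the configuration key uses the spelling "principle"; the exact key names should be checked against the Pinot version in use, and the principal and keytab path below are placeholders.

```properties
pinot.controller.storage.factory.hdfs.hadoop.kerberos.principle=pinot/controller-host@EXAMPLE.COM
pinot.controller.storage.factory.hdfs.hadoop.kerberos.keytab=/etc/security/keytabs/pinot.keytab
```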
2. Manual authentication (legacy)
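With manual authentication, the operator obtains a ticket out of band before starting Pinot, for example with `kinit` against a keytab. The principal and keytab path are placeholders; because tickets expire, this approach needs external renewal, which is why it is considered legacy.

```shell
# Obtain a Kerberos ticket from the keytab before starting Pinot
kinit -kt /etc/security/keytabs/pinot.keytab pinot@EXAMPLE.COM

# Confirm the ticket was granted and check its expiry
klist
```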
Troubleshooting
HDFS FileSystem issues
Kerberos authentication issues
Error: "Failed to authenticate with Kerberos"
Error: "GSSException: No valid credentials provided"
Error: "Unable to obtain Kerberos password" or "Clock skew too great"
Error: "HDFS operation fails after running for several hours"
Verifying Kerberos configuration
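A few standard commands help confirm the Kerberos setup independently of Pinot; the keytab path, principal, and namenode URI below are placeholders.

```shell
# List the principals stored in the keytab
klist -kt /etc/security/keytabs/pinot.keytab

# Obtain a ticket from the keytab, then confirm HDFS access with it
kinit -kt /etc/security/keytabs/pinot.keytab pinot@EXAMPLE.COM
hadoop fs -ls hdfs://namenode/
```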
Best practices