Create a table called `transcript` and map the schema created in the previous step to the table. For batch data, we keep the `tableType` as `OFFLINE`.
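As a sketch, the table can be created with the Pinot admin tool, assuming the table config and schema JSON files already exist (the file paths below are hypothetical placeholders):

```shell
# Register the offline table and its schema with the controller.
# -exec actually applies the change instead of doing a dry run.
bin/pinot-admin.sh AddTable \
  -tableConfigFile /path/to/transcript-table-offline.json \
  -schemaFile /path/to/transcript-schema.json \
  -exec
```

This requires a running Pinot cluster; the command talks to the controller configured in your deployment.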
To ingest `data.json` into a table called `foo_OFFLINE`, use the command below.
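A minimal sketch of such a request, assuming the Pinot controller is running locally on port 9000 (the host, port, and URL-encoded `batchConfigMapStr` value are illustrative, not from the original):

```shell
# Upload the file and ingest it into foo_OFFLINE in one call.
# The batchConfigMapStr value is the URL-encoded JSON {"inputFormat":"json"}.
curl -X POST -F file=@data.json \
  "http://localhost:9000/ingestFromFile?tableNameWithType=foo_OFFLINE&batchConfigMapStr=%7B%22inputFormat%22%3A%22json%22%7D"
```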
`batchConfigMapStr` can be used to pass in additional properties needed for decoding the file. For example, in the case of CSV, you may need to provide the delimiter.
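As an illustration, the map is passed as URL-encoded JSON; for a pipe-delimited CSV file it might look like the fragment below before encoding (the delimiter value is an example, not from the original):

```json
{
  "inputFormat": "csv",
  "recordReader.prop.delimiter": "|"
}
```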
Once the ingestion job spec is ready for the table `transcript`, we can trigger the job using the following command.
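A sketch of launching the job with the Pinot admin tool, assuming the job spec has been written to a YAML file (the path below is a hypothetical placeholder):

```shell
# Run the standalone batch ingestion job described by the spec file.
bin/pinot-admin.sh LaunchDataIngestionJob \
  -jobSpecFile /path/to/batch-job-spec.yml
```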
You can pass `-D mapreduce.map.memory.mb=8192` to set the mapper memory size when submitting the Hadoop job.
You can use `spark.executor.memory` to tune the memory usage for segment creation when submitting the Spark job.
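As a rough sketch of where such a setting fits, a Spark submission of the ingestion job might set the executor memory via `--conf` (the jar path, master, class name, and memory value here are assumptions for illustration; adapt them to your deployment):

```shell
# Submit the Pinot ingestion job on Spark with a larger executor heap
# for segment creation. Paths and values are placeholders.
spark-submit \
  --class org.apache.pinot.tools.admin.PinotAdministrator \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.executor.memory=8g \
  /path/to/pinot-all-jar-with-dependencies.jar \
  LaunchDataIngestionJob \
  -jobSpecFile /path/to/spark-job-spec.yml
```

The Hadoop-side `-D mapreduce.map.memory.mb=8192` flag plays the analogous role for MapReduce mappers; where exactly it is passed depends on how your Hadoop job is submitted.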