Create a table called `transcript` and map the schema created in the previous step to the table. For batch data, we keep the `tableType` as `OFFLINE`.
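As a sketch, the table can be created with the Pinot admin tool, assuming the table config and schema JSON files already exist (the file paths below are hypothetical placeholders):

```shell
# Register the offline table and its schema with the controller.
# -exec actually applies the change instead of doing a dry run.
bin/pinot-admin.sh AddTable \
  -tableConfigFile /path/to/transcript-table-offline.json \
  -schemaFile /path/to/transcript-schema.json \
  -exec
```

This requires a running Pinot cluster; the command talks to the controller configured in your deployment.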
To ingest `data.json` into a table called `foo_OFFLINE`, use the command below.
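A minimal sketch of such a request, assuming the Pinot controller is running locally on port 9000 (the host, port, and URL-encoded `batchConfigMapStr` value are illustrative, not from the original):

```shell
# Upload the file and ingest it into foo_OFFLINE in one call.
# The batchConfigMapStr value is the URL-encoded JSON {"inputFormat":"json"}.
curl -X POST -F file=@data.json \
  "http://localhost:9000/ingestFromFile?tableNameWithType=foo_OFFLINE&batchConfigMapStr=%7B%22inputFormat%22%3A%22json%22%7D"
```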
`batchConfigMapStr` can be used to pass in additional properties needed for decoding the file. For example, in the case of CSV, you may need to provide the delimiter.
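As an illustration, the map is passed as URL-encoded JSON; for a pipe-delimited CSV file it might look like the fragment below before encoding (the delimiter value is an example, not from the original):

```json
{
  "inputFormat": "csv",
  "recordReader.prop.delimiter": "|"
}
```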
Once the ingestion job spec is ready for the table `transcript`, we can trigger the job using the following command.
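A sketch of launching the job with the Pinot admin tool, assuming the job spec has been written to a YAML file (the path below is a hypothetical placeholder):

```shell
# Run the standalone batch ingestion job described by the spec file.
bin/pinot-admin.sh LaunchDataIngestionJob \
  -jobSpecFile /path/to/batch-job-spec.yml
```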
You can pass `-D mapreduce.map.memory.mb=8192` to set the mapper memory size when submitting the Hadoop job.
You can use `spark.executor.memory` to tune the memory usage for segment creation when submitting the Spark job.
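As a rough sketch of where such a setting fits, a Spark submission of the ingestion job might set the executor memory via `--conf` (the jar path, master, class name, and memory value here are assumptions for illustration; adapt them to your deployment):

```shell
# Submit the Pinot ingestion job on Spark with a larger executor heap
# for segment creation. Paths and values are placeholders.
spark-submit \
  --class org.apache.pinot.tools.admin.PinotAdministrator \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.executor.memory=8g \
  /path/to/pinot-all-jar-with-dependencies.jar \
  LaunchDataIngestionJob \
  -jobSpecFile /path/to/spark-job-spec.yml
```

The Hadoop-side `-D mapreduce.map.memory.mb=8192` flag plays the analogous role for MapReduce mappers; where exactly it is passed depends on how your Hadoop job is submitted.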