Ingestion Job Spec

The ingestion job spec is used while generating, running, and pushing segments from the input files.

The job spec can be in either YAML or JSON format (0.5.0 onwards). Property names remain the same in both formats.

To use the JSON format, add the property job-spec-format=json to the properties file that is passed when launching the ingestion job.

Pinot supports the following configurations:

Top Level Spec

Example

Execution Framework Spec

These configs specify the execution framework used to ingest data. Each supported framework has its own set of related configs; see the documentation for that framework.

Example

Pinot FS Spec

Table Spec

The table spec specifies the table into which data should be populated, along with its schema.

Example
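
A minimal sketch of a table spec, reusing the airlineStats values from the complete job spec on this page. The controller address http://localhost:9000 comes from that example and assumes a local setup; substitute your own controller host.

```yaml
tableSpec:
  # Name of the table the segments are pushed to
  tableName: 'airlineStats'
  # Where the job reads the schema from (here, the controller at localhost:9000)
  schemaURI: 'http://localhost:9000/tables/airlineStats/schema'
  # Where the job reads the table config from
  tableConfigURI: 'http://localhost:9000/tables/airlineStats'
```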

Record Reader Spec

Segment Name Generator Spec

Example
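
A minimal sketch of a segment name generator spec, reusing the values from the complete job spec on this page. The prefix 'airlineStats_batch' is that example's choice, not a default.

```yaml
segmentNameGeneratorSpec:
  # Name generator type used in the complete example
  type: normalizedDate
  configs:
    # Prefix placed at the start of each generated segment name
    segment.name.prefix: 'airlineStats_batch'
    # When true, the sequence id is left out of the segment name
    exclude.sequence.id: true
```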

Pinot Cluster Spec

Example
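
A minimal sketch of a cluster spec, taken from the complete job spec on this page. The controller address assumes a controller running locally on port 9000.

```yaml
pinotClusterSpecs:
  # A list of clusters; each entry points at that cluster's controller
  - controllerURI: 'http://localhost:9000'
```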

Push Job Spec

Example
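
A minimal sketch of a push job spec, taken from the complete job spec on this page. The values are that example's choices and should be tuned for your cluster; the comments describe each property as its name suggests.

```yaml
pushJobSpec:
  # Parallelism of the push job
  pushParallelism: 2
  # Number of attempts made for a push
  pushAttempts: 2
  # Interval between attempts, in milliseconds
  pushRetryIntervalMillis: 1000
```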

# Top Level Spec example: a complete job spec using the Spark execution framework
executionFrameworkSpec:
  name: 'spark'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentTarPushJobRunner'
  segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentUriPushJobRunner'
  segmentMetadataPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentMetadataPushJobRunner'
  extraConfigs:
    stagingDir: hdfs://examples/batch/airlineStats/staging

# Recommended to set jobType to SegmentCreationAndMetadataPush for production environments where Pinot Deep Store is configured
jobType: SegmentCreationAndTarPush

inputDirURI: 'examples/batch/airlineStats/rawdata'
includeFileNamePattern: 'glob:**/*.avro'
outputDirURI: 'hdfs:///examples/batch/airlineStats/segments'
overwriteOutput: true
pinotFSSpecs:
  - scheme: hdfs
    className: org.apache.pinot.plugin.filesystem.HadoopPinotFS
  - scheme: file
    className: org.apache.pinot.spi.filesystem.LocalPinotFS 
recordReaderSpec:
  className: 'org.apache.pinot.plugin.inputformat.avro.AvroRecordReader'
tableSpec:
  tableName: 'airlineStats'
  schemaURI: 'http://localhost:9000/tables/airlineStats/schema'
  tableConfigURI: 'http://localhost:9000/tables/airlineStats'
segmentNameGeneratorSpec:
  type: normalizedDate
  configs:
    segment.name.prefix: 'airlineStats_batch'
    exclude.sequence.id: true
pinotClusterSpecs:
  - controllerURI: 'http://localhost:9000'
pushJobSpec:
  pushParallelism: 2
  pushAttempts: 2
  pushRetryIntervalMillis: 1000

# Execution Framework Spec example: the standalone execution framework
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
  segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
  segmentMetadataPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentMetadataPushJobRunner'
