githubEdit

Minion Task Plugin

Write a custom Minion task plugin with a task generator and task executor.

Minion tasks execute background maintenance and data processing jobs in Pinot. Each task type has two components: a Task Generator (runs on the Controller) that creates task configurations, and a Task Executor (runs on the Minion) that performs the work.

Architecture

Controller                          Minion
┌────────────────────┐             ┌────────────────────┐
│  PinotTaskGenerator│  ─ Helix ─> │ PinotTaskExecutor  │
│  - generateTasks() │  schedules  │ - executeTask()    │
│  - getTaskType()   │             │ - cancel()         │
└────────────────────┘             └────────────────────┘
  1. The Controller runs the Task Generator on a schedule or via ad-hoc API call

  2. The Generator creates PinotTaskConfig objects describing the work

  3. The Controller schedules tasks via Helix

  4. Minion instances pick up tasks based on tag/routing

  5. The Minion executes the task using the Task Executor

  6. Results are returned to the Controller

Step 1: Implement the Task Executor

The executor runs on the Minion and performs the actual work.

Task Executor Factory

Task Executor

circle-info

The executor factory class must be annotated with @TaskExecutorFactory(enabled = true) for auto-registration. The class must be in a package matching org.apache.pinot.*.plugin.minion.tasks.*.

Step 2: Implement the Task Generator

The generator runs on the Controller and decides when and how to create tasks.

Optional Generator Overrides

Method
Default
Description

getTaskTimeoutMs(String minionTag)

3,600,000 (1 hour)

Task timeout in milliseconds

getNumConcurrentTasksPerInstance(String minionTag)

1

Max concurrent tasks per Minion

getMaxAttemptsPerTask(String minionTag)

1 (no retry)

Max retry attempts

getMinionInstanceTag(TableConfig tableConfig)

UNTAGGED_MINION_INSTANCE

Tag for routing tasks to specific Minion instances

validateTaskConfigs(TableConfig, Schema, Map)

no-op

Validate task-specific configs at table creation time

Step 3: Enable the Task in Table Config

Add the task configuration to the table config:

The schedule property uses a CRON expression to define when the task generator runs. Without it, the task can only be triggered via the Controller API.

Step 4: Trigger Tasks via API

To manually trigger task generation:

Built-in Task Types

Pinot includes several built-in Minion tasks:

Task Type
Description

RealtimeToOfflineSegmentsTask

Converts realtime segments to offline segments

UpsertCompactionTask

Compacts segments in upsert-enabled tables

UpsertCompactMergeTask

Merges and compacts upsert segments

PurgeTask

Purges records matching a purge function

MergeRollupTask

Merges and rolls up small segments

SegmentGenerationAndPushTask

Generates and pushes segments from external data

RefreshSegmentTask

Refreshes segments from deep store

For more details on built-in tasks, see Minion Merge Rollup Task and Upsert Compaction Task.

Last updated

Was this helpful?