CDC / Upsert Pipeline
End-to-end guide for keeping Pinot in sync with a transactional database using CDC and upserts.
When to use this pattern
Architecture sketch
OLTP DB ──▶ Debezium ──▶ Kafka topic ──▶ Pinot REALTIME table (upsert)
 (WAL)       (CDC)       (per-table)                   │
                                              ┌────────┴────────┐
                                              │  Servers with   │
                                              │ primary key map │
                                              └─────────────────┘
Schema
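The schema for this page is not reproduced here, so the following is only a minimal sketch of what an upsert-ready Pinot schema might look like. It assumes a hypothetical `orders` table; the column names (`order_id`, `status`, `amount`, `updated_at`) are illustrative and not taken from this guide. The part that matters for upserts is `primaryKeyColumns`, which tells the servers which columns form the key in the primary key map shown in the architecture sketch.

```python
import json

# Hypothetical schema for illustration only: the table and column names
# are assumptions, not values from this guide.
schema = {
    "schemaName": "orders",
    "dimensionFieldSpecs": [
        {"name": "order_id", "dataType": "STRING"},
        {"name": "status", "dataType": "STRING"},
    ],
    "metricFieldSpecs": [
        {"name": "amount", "dataType": "DOUBLE"},
    ],
    "dateTimeFieldSpecs": [
        {
            # Time column; by default upserts compare on the table's time column.
            "name": "updated_at",
            "dataType": "LONG",
            "format": "1:MILLISECONDS:EPOCH",
            "granularity": "1:MILLISECONDS",
        }
    ],
    # Required for upsert tables: the key used to find the latest row.
    "primaryKeyColumns": ["order_id"],
}

# Serialize to the JSON document that would be uploaded as the schema.
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```

In a CDC setup the primary key would normally mirror the source table's primary key, so that every change event for a row lands on the same key.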
Table configuration
Configuration highlights
Setting | Why
Handling deletes
Partial upsert
Debezium configuration tips
Query patterns
Current state aggregation
Point lookup
Time-range analysis over mutable data
Segment compaction
Operational checklist
Before go-live
Monitoring
Common pitfalls
Pitfall | Fix
Further reading