Addressing the What, Why and How behind CDC and SCDs
Every Data Engineer’s support system
Imagine you’re managing a large-scale retail chain’s data warehouse. Products are updated, customer information evolves, prices change, and promotions fluctuate. Capturing these changes efficiently and ensuring your analytics reflect reality is crucial.
That’s where Change Data Capture (CDC) and Slowly Changing Dimensions (SCD) step in — not just as buzzwords but as pillars of a reliable data architecture.
Let’s address the elephant in the room!
What is CDC?
CDC (Change Data Capture) is a technique that identifies and captures changes (INSERT, UPDATE, DELETE) in the source system (where the data originates) and delivers only the changed data to downstream systems (the target or destination).
- It avoids repeatedly scanning entire tables.
- It’s important for real-time or near-real-time replication.
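To make the idea concrete, here is a minimal sketch of the downstream side of CDC: a target table is kept in sync by applying only a stream of change events, never by re-reading the whole source. The event shape (`op`, `key`, `row`), the `apply_changes` function and the sample product data are all hypothetical and chosen for illustration. Real CDC tools typically derive such events from the database's transaction log, and their formats differ.

```python
from typing import Dict, List, Optional, TypedDict


class ChangeEvent(TypedDict):
    """Hypothetical change event; real CDC tools use their own formats."""
    op: str               # "INSERT", "UPDATE" or "DELETE"
    key: int              # primary key of the affected row
    row: Optional[dict]   # new row image; None for DELETE


def apply_changes(target: Dict[int, dict], events: List[ChangeEvent]) -> Dict[int, dict]:
    """Apply change events, in order, to a target table keyed by primary key."""
    for event in events:
        if event["op"] in ("INSERT", "UPDATE"):
            # Upsert: store a copy of the new row image under its key.
            target[event["key"]] = dict(event["row"])
        elif event["op"] == "DELETE":
            # Remove the row if present; a missing key is ignored.
            target.pop(event["key"], None)
        else:
            raise ValueError(f"Unknown operation: {event['op']}")
    return target


# Target table as it looked after the previous sync (made-up sample data).
target = {
    1: {"name": "Laptop", "price": 900},
    2: {"name": "Mouse", "price": 20},
}

# Only the rows that changed in the source since the last sync.
events: List[ChangeEvent] = [
    {"op": "UPDATE", "key": 1, "row": {"name": "Laptop", "price": 850}},
    {"op": "INSERT", "key": 3, "row": {"name": "Keyboard", "price": 45}},
    {"op": "DELETE", "key": 2, "row": None},
]

apply_changes(target, events)
print(target)
```

The target touches three rows here regardless of how large the source table is, which is the point of shipping changes rather than full copies.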
What is SCD?
SCD (Slowly Changing Dimensions) refers to handling changes in dimension tables (e.g. customer or product info) over time. There are different types of SCDs depending on how we want to track changes:
- SCD Type 0: Retains original data, with no…