BoltPipeline logo

SCD Types Explained

Slowly Changing Dimensions define how a table handles updates. BoltPipeline enforces your SCD policy automatically — you just write SQL.

Background

What is a Slowly Changing Dimension?

In data warehousing, a Slowly Changing Dimension (SCD) describes how a dimension table handles changes to its source data over time. The three questions every dimension designer must answer are:

Do you overwrite?

Replace the old value with the new one — no history preserved.

Do you keep history?

Add a new row for each change, keeping all previous versions.

Do you lock?

Reject all changes — the original value is the correct one forever.

BoltPipeline supports three SCD types — 0, 1, and 2 — and enforces whichever policy you set for each step. You do not need to write merge logic, hash comparisons, or end-dating SQL. BoltPipeline handles all of it automatically during pipeline execution.

0

SCD Type 0 — Fixed

Rows are written once and never updated

SCD Type 0 is for reference data that must never change after it is first loaded. BoltPipeline will reject any attempt to update or delete rows in a Type 0 table. The original insert is the authoritative record.

Use it for

  • Country and currency code tables
  • Product category hierarchies that never restructure
  • Static mapping and lookup tables
  • Any reference data where the source of truth is the original load

Avoid when

  • Source corrections need to propagate to the warehouse
  • Rows need to be soft-deleted or marked inactive
  • You need to reflect real-world changes over time
1

SCD Type 1 — Overwrite

Always reflects the latest value; no history preserved

SCD Type 1 is the most common type. When a source record changes, BoltPipeline updates the existing row in place. The previous value is replaced — there is no historical trail. Use this when only the current state of a dimension matters.

What happens when a customer changes their address

Before update

customer_id: C001
address: 123 Old St
region: East

After update (SCD 1)

customer_id: C001
address: 456 New Ave
region: West

The old address is gone. If you need to know the customer's address last quarter, you cannot — use SCD 2 for that.

Use it for

  • Customer address, email, phone number (current only)
  • Employee department or title (current only)
  • Product price or status (latest value)
  • Any dimension where only “right now” matters

Avoid when

  • Reports need to reflect “state as of” a past date
  • Regulators require an audit trail of changes
  • Historical analysis depends on the old values
2

SCD Type 2 — Full History

Every version of a row is preserved with effective date ranges

SCD Type 2 is used when you need a full audit trail of every change to a dimension over time. When a source record changes, BoltPipeline closes the current row (recording when it became inactive) and inserts a new active row. Each row has an effective date range, letting you query the state of any dimension at any point in history.

Same customer, three versions tracked

customer_idregioneffective_fromeffective_toactive
C001East2022-01-012023-06-14No
C001West2023-06-152024-11-02No
C001Central2024-11-03Yes

BoltPipeline manages effective_from, effective_to, and active flag automatically. You just write the SELECT.

Use it for

  • Customer lifetime value and cohort analysis
  • Regulatory and compliance reporting
  • Historical “as of” snapshots for any date
  • Employee tenure and career progression

Consider trade-offs when

  • The table is very high-volume and changes extremely frequently
  • BI queries always want current state only — SCD 1 is simpler
  • Storage cost is a hard constraint

Decision guide

Choosing the right SCD type

QuestionType 0Type 1Type 2
Should the original value be preserved forever?YesNoNo
Does the table change frequently?NoYesYes
Do you need to query historical states?NoNoYes
Is audit trail or compliance required?NoNoYes
Are you optimizing for storage / simplicity?YesYesNo

Turn SQL into Production-Ready Data Pipelines — Faster and Safer

SQL-first pipelines, validated and governed — executed directly inside your database.

No new DSLs. No fragile orchestration. Just SQL with built-in validation, lineage, and governance.