Dualo
Data Governance

Data Quality

The dimensions (accuracy, completeness, consistency, timeliness, validity, uniqueness) that make data trustworthy — and how to measure and fix them.

Data quality is how much you can trust what the data says. A dashboard with beautiful visuals but wrong numbers is worse than no dashboard: it produces confident bad decisions.

Six dimensions define quality: accuracy (does it reflect reality?), completeness (no missing values where there shouldn't be), consistency (the same customer in CRM and billing shows the same address), timeliness (fresh enough for the use case), validity (formats match rules: email looks like an email), uniqueness (no duplicates).
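To make the dimensions concrete, here is a minimal sketch that scores a few of them as ratios between 0 and 1. The sample records, field names, the 30-day freshness window and the simple email pattern are all assumptions for illustration, not a standard. Accuracy is left out because it needs a trusted reference to compare against.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical sample data; field names and values are illustrative only.
NOW = datetime(2024, 1, 10, tzinfo=timezone.utc)
customers = [
    {"id": 1, "email": "ana@example.com", "updated_at": NOW - timedelta(days=1)},
    {"id": 2, "email": None, "updated_at": NOW - timedelta(days=2)},
    {"id": 3, "email": "not-an-email", "updated_at": NOW - timedelta(days=40)},
    {"id": 3, "email": "cy@example.com", "updated_at": NOW - timedelta(days=3)},
]
crm_address = {1: "1 Main St", 2: "5 Oak Ave"}
billing_address = {1: "1 Main St", 2: "9 Elm Rd", 4: "7 Pine Ct"}

# Deliberately loose format rule (assumption): something@something.tld
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(rows, field):
    """Share of rows where the field is present (not None or empty)."""
    return sum(1 for r in rows if r.get(field) not in (None, "")) / len(rows)


def validity(rows, field, pattern):
    """Share of non-missing values that match the format rule."""
    values = [r[field] for r in rows if r.get(field) not in (None, "")]
    if not values:
        return 1.0
    return sum(1 for v in values if pattern.match(v)) / len(values)


def uniqueness(rows, key):
    """Ratio of distinct key values to total rows (1.0 means no duplicates)."""
    return len({r[key] for r in rows}) / len(rows)


def timeliness(rows, field, max_age, now):
    """Share of rows updated within max_age of now."""
    return sum(1 for r in rows if now - r[field] <= max_age) / len(rows)


def consistency(source_a, source_b):
    """Share of keys present in both sources whose values agree."""
    shared = source_a.keys() & source_b.keys()
    if not shared:
        return 1.0
    return sum(1 for k in shared if source_a[k] == source_b[k]) / len(shared)


report = {
    "completeness(email)": completeness(customers, "email"),
    "validity(email)": validity(customers, "email", EMAIL_RE),
    "uniqueness(id)": uniqueness(customers, "id"),
    "timeliness(30d)": timeliness(customers, "updated_at", timedelta(days=30), NOW),
    "consistency(address)": consistency(crm_address, billing_address),
}
for name, score in report.items():
    print(f"{name}: {score:.2f}")
```

Scoring each dimension as a ratio keeps the results comparable across tables and makes it straightforward to attach a threshold to each one later.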

Bad data has a cost. Gartner estimates that poor data quality costs the average organization $12.9M per year. The cost shows up as duplicate marketing sends, reconciliation hell at month-end, and support agents looking at stale customer info. You probably can't measure it precisely, but it's real.

The fix is a loop: profile (measure what you have), define expectations (what's acceptable), monitor (automated checks that flag breaches), remediate (fix the root cause, not just the bad row). Tools: Great Expectations, dbt tests, Soda, Monte Carlo, Elementary.
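The expectation and monitoring steps of that loop can be sketched in plain Python. This is an illustration of the idea only: it does not use the API of any of the tools listed, and the `Expectation` class, the `run_checks` function, the sample rows and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Expectation:
    """A named rule: the metric must score at least `minimum` (0 to 1)."""
    name: str
    metric: Callable[[list], float]
    minimum: float


def share_not_null(field):
    """Build a metric: share of rows where `field` is not None."""
    return lambda rows: sum(1 for r in rows if r.get(field) is not None) / len(rows)


def share_unique(field):
    """Build a metric: distinct values of `field` divided by row count."""
    return lambda rows: len({r[field] for r in rows}) / len(rows)


def run_checks(rows, expectations):
    """Monitor step: return (name, observed, minimum) for each breach."""
    breaches = []
    for exp in expectations:
        observed = exp.metric(rows)
        if observed < exp.minimum:
            breaches.append((exp.name, observed, exp.minimum))
    return breaches


# Hypothetical batch and thresholds, for illustration only.
orders = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": None},
    {"order_id": 3, "amount": 7.5},
    {"order_id": 4, "amount": 3.0},
]
expectations = [
    Expectation("amount_not_null", share_not_null("amount"), minimum=0.95),
    Expectation("order_id_unique", share_unique("order_id"), minimum=1.0),
]

breaches = run_checks(orders, expectations)
for name, observed, minimum in breaches:
    print(f"BREACH {name}: observed {observed:.2f}, expected >= {minimum:.2f}")
```

A breach here is only a signal. The remediate step still means tracing the missing amount back to whatever produced it, rather than patching the row.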

Quality is a responsibility, not a project. Upstream fixes (validate at ingest) beat downstream cleaning every time: cleaning after the fact is expensive, never catches 100% of the problems, and creates divergence between the source and the 'clean' version.
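As a rough sketch of what validating at ingest can look like, the snippet below rejects bad records before they are stored and keeps the reasons alongside them, so the producer can fix the cause. The record shape, the rules and the function names are assumptions made for this example.

```python
import re

# Deliberately loose format rule (assumption): something@something.tld
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate(record):
    """Return a list of rule violations; an empty list means the record passes."""
    errors = []
    if not record.get("customer_id"):
        errors.append("customer_id is required")
    email = record.get("email")
    if email is None or not EMAIL_RE.match(email):
        errors.append("email is missing or malformed")
    return errors


def ingest(records):
    """Split incoming records into accepted rows and rejected rows with reasons."""
    accepted, rejected = [], []
    for record in records:
        errors = validate(record)
        if errors:
            rejected.append((record, errors))
        else:
            accepted.append(record)
    return accepted, rejected


# Hypothetical incoming batch.
incoming = [
    {"customer_id": "c1", "email": "ana@example.com"},
    {"customer_id": "", "email": "bad"},
]
accepted, rejected = ingest(incoming)
print(f"accepted={len(accepted)} rejected={len(rejected)}")
for record, errors in rejected:
    print(record, "->", "; ".join(errors))
```

Keeping rejected records with their reasons, instead of silently dropping or patching them, is what lets the fix happen at the source so that no separate 'clean' copy has to be maintained.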

Grounded on https://www.dama.org/
