Understanding the 9 Dimensions of Data Quality

The quality of a dataset is defined by the extent to which its characteristics meet a set of predefined requirements. Measuring data quality therefore fundamentally consists of making comparisons: comparing what exists with what should exist.

In its Data Quality Management chapter, the latest edition of the DMBOK presents nine dimensions used to assess data quality. While these dimensions are not normative, they are widely recognized by data management practitioners and offer a more practical approach than those proposed in standards such as ISO 8000 and ISO 25012 or in the Wang and Strong model. The following walkthrough illustrates what is being compared in each case.

1. Validity

Validity assesses whether the data itself, or specific characteristics derived from it, comply with predefined standards — for example data type (numeric, alphabetic) or data format (number of characters, character positions). The standards act as an authorized domain of values.

  • Dates may be accepted only in the DD/MM/YYYY or MM/DD/YYYY format: the characteristic assessed is the format, compared against the authorized domain of formats.
  • A French SIRET number is valid if it exists in the official SIRENE registry: the raw data is compared against the registry of all officially registered numbers.
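The two checks above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names, the strict two-digit layout rule and the registry extract are assumptions made for the example, and a real SIRET check would query the SIRENE registry rather than an in-memory set.

```python
import re
from datetime import datetime

# Authorized domain of date formats, taken from the example above.
ALLOWED_DATE_FORMATS = ("%d/%m/%Y", "%m/%d/%Y")
# Assumed strict layout: two digits, slash, two digits, slash, four digits.
DATE_LAYOUT = re.compile(r"\d{2}/\d{2}/\d{4}")


def has_valid_date_format(value: str) -> bool:
    """Return True if value has the layout and parses under an allowed format."""
    if not DATE_LAYOUT.fullmatch(value):
        return False
    for fmt in ALLOWED_DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def is_registered(siret: str, registry: set) -> bool:
    """Compare the raw value against a set of officially registered numbers."""
    return siret in registry


# Hypothetical registry extract with placeholder numbers (not real SIRETs).
sample_registry = {"00000000000001", "00000000000002"}

print(has_valid_date_format("31/12/2024"))   # format check
print(is_registered("00000000000001", sample_registry))  # registry check
```

In both cases the function only compares the value, or one of its characteristics, against an authorized domain; it says nothing about whether the date or the company is the right one.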

2. Completeness

Completeness measures whether all required data is present. According to the DMBOK, it can be evaluated at three levels:

  • Column / field: is a given mandatory field populated in every record?
  • Row / record: for a given record, are all required fields completed?
  • Table / dataset: are all expected records present?

Completeness therefore compares actual completion against expected completion.
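As a rough sketch, the three levels can be measured as follows. The record layout, the list of required fields and the set of expected identifiers are all hypothetical; in practice they would come from the dataset's requirements.

```python
# Hypothetical requirements for the example.
REQUIRED_FIELDS = ("id", "name", "email")


def is_populated(value) -> bool:
    """Treat None and blank strings as missing."""
    return value is not None and str(value).strip() != ""


def field_completeness(records, field) -> float:
    """Column level: share of records in which the field is populated."""
    if not records:
        raise ValueError("no records to assess")
    return sum(is_populated(r.get(field)) for r in records) / len(records)


def record_is_complete(record) -> bool:
    """Row level: are all required fields populated for this record?"""
    return all(is_populated(record.get(f)) for f in REQUIRED_FIELDS)


def dataset_completeness(records, expected_ids, key="id") -> float:
    """Table level: share of expected records that are actually present."""
    present = {r.get(key) for r in records}
    return len(set(expected_ids) & present) / len(expected_ids)


records = [
    {"id": 1, "name": "Ada", "email": "ada@example.com"},
    {"id": 2, "name": "", "email": "bob@example.com"},
    {"id": 3, "name": "Cy", "email": None},
]
expected_ids = {1, 2, 3, 4}

print(field_completeness(records, "name"))
print([record_is_complete(r) for r in records])
print(dataset_completeness(records, expected_ids))
```

Each function compares actual completion with expected completion; only the unit of comparison changes (a field, a record, or the whole set of expected records).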

3. Consistency

Unlike validity and completeness, which compare data against external references, consistency is an inter-data concept. It assesses whether similar data is represented and coded in a uniform way. A Name column containing “John Smith”, “Smith John” and “Smith” would be considered inconsistent. The same applies to physical measurements recorded in different units, or with no unit indicated at all. Note that this reflects the DMBOK 2024 perspective and is not universally shared; the DAMA France / ISACA-AFAI working group (2022) provides a definition closer to what the DMBOK calls integrity.
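For the measurement case, one possible sketch is to profile how each value is written and check that a single representation is used. The parsing pattern and the function names below are assumptions for illustration; they only recognise a number optionally followed by a unit made of letters.

```python
import re
from collections import Counter

# Assumed representation: a number, then an optional unit made of letters.
MEASURE = re.compile(r"\s*(-?\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*")


def unit_profile(values) -> Counter:
    """Count how many values use each unit, including values with no unit."""
    counts = Counter()
    for value in values:
        match = MEASURE.fullmatch(value)
        if match is None:
            counts["<unparsed>"] += 1
        else:
            counts[match.group(2) or "<no unit>"] += 1
    return counts


def is_uniform(values) -> bool:
    """The column is consistent if every value shares one representation."""
    return len(unit_profile(values)) == 1


weights = ["12 kg", "3500 g", "7"]
print(unit_profile(weights))
print(is_uniform(weights))
```

No external reference is involved here: the values are compared only with each other, which is what makes consistency an inter-data concept.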

4. Integrity

Like consistency, integrity is an inter-data concept. It evaluates whether relationships between data elements comply with predefined rules. The most common example is referential integrity: if an invoice references a customer, that customer must exist in the customer master data repository. Invoices linked to non-existent customers are referred to as orphan records. In its Data Security chapter, the DMBOK also defines integrity as data being whole and protected from unauthorized alteration, addition or deletion.
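A referential integrity check of this kind can be sketched as a simple lookup. The field name customer_id and the sample data are hypothetical.

```python
def find_orphans(invoices, customer_ids):
    """Return invoices whose customer is absent from the customer master data."""
    known = set(customer_ids)
    return [inv for inv in invoices if inv["customer_id"] not in known]


# Hypothetical sample data.
customers = {"C001", "C002"}
invoices = [
    {"invoice_id": "INV-1", "customer_id": "C001"},
    {"invoice_id": "INV-2", "customer_id": "C999"},  # no such customer
]

orphans = find_orphans(invoices, customers)
print([inv["invoice_id"] for inv in orphans])
```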

5. Timeliness

Timeliness can be understood as a delay, typically expressed as “D+n”, where D is the day the data is collected or updated and n is the number of days until it becomes available. The DMBOK defines it as the expected delay between the moment data is collected or updated and the moment it should become available to users or consuming systems. It is useful to distinguish expected timeliness from actual timeliness: a pipeline designed to provide data at D+1 may, because of bottlenecks or excessive volumes, actually deliver at D+2.
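The distinction between expected and actual timeliness can be sketched as a comparison of two delays. The function names and dates are illustrative assumptions, and the delay is counted in whole calendar days.

```python
from datetime import date


def actual_delay_days(collected_on: date, available_on: date) -> int:
    """Actual timeliness: days between collection and availability."""
    return (available_on - collected_on).days


def meets_target(collected_on: date, available_on: date, expected_days: int) -> bool:
    """Compare the actual delay with the expected delay."""
    return actual_delay_days(collected_on, available_on) <= expected_days


# Hypothetical pipeline run: designed for one day of delay, delivered after two.
collected = date(2024, 3, 1)
available = date(2024, 3, 3)

print(actual_delay_days(collected, available))
print(meets_target(collected, available, expected_days=1))
```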

6. Currency

Unlike timeliness, which is event-driven and dynamic, currency (or recency) is a static concept. It measures the gap between the current moment and the date of the latest data update. High timeliness does not necessarily imply high currency, and vice versa. An inventory updated once a year on December 31st may be delivered immediately (excellent timeliness) yet be eleven months old by the end of the following November (poor currency).
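The inventory example can be worked through in a short sketch. The dates are hypothetical, and the reference moment is passed in explicitly so the result does not depend on when the code is run.

```python
from datetime import date


def age_in_days(last_updated: date, as_of: date) -> int:
    """Currency: gap between the reference moment and the latest update."""
    return (as_of - last_updated).days


# Hypothetical inventory: counted and delivered on the same day.
last_updated = date(2023, 12, 31)
delivered_on = date(2023, 12, 31)
today = date(2024, 11, 30)

delivery_delay = (delivered_on - last_updated).days  # timeliness view
data_age = age_in_days(last_updated, today)          # currency view

print(delivery_delay, data_age)
```

The delivery delay is zero days, while the age of the data at the reference date is 335 days, which is the eleven-month gap described above.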

7. Reasonableness

The final three dimensions involve comparisons between data and the real world. Reasonableness measures how well data aligns with expectations based on real-world knowledge or statistical patterns. A 19-year-old enrolled in a fourth-grade class would likely be considered unreasonable. Reasonableness is close to validity, but expectations are more flexible than standards: invalid data is generally rejected, whereas unreasonable data raises questions and may lead to a revision of the expectations themselves.
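One way to express an expectation based on statistical patterns is a robust outlier rule. The sketch below uses the median absolute deviation with a cutoff factor k; both the rule and the default k=5 are assumptions chosen for illustration, not something prescribed by the DMBOK. Values are flagged for review rather than rejected.

```python
from statistics import median


def flag_unreasonable(values, k=5.0):
    """Return values lying more than k median absolute deviations from the median.

    Note: if the deviation is 0 (most values identical), any differing value
    is flagged.
    """
    values = list(values)
    center = median(values)
    spread = median([abs(v - center) for v in values])
    return [v for v in values if abs(v - center) > k * spread]


# Hypothetical ages of pupils enrolled in one fourth-grade class.
ages = [9, 10, 9, 10, 19, 9]
print(flag_unreasonable(ages))
```

A flagged value is a question rather than a verdict: the record may be wrong, or the expectation encoded in k may need to be revised.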

8. Uniqueness

Uniqueness requires that each real-world object be represented only once within a dataset; the comparison is between real-world entities rather than merely between data values. Exact-duplicate detection is only one aspect: uniqueness also involves near-duplicate detection (records sharing most but not all attributes) and the investigation of duplicate primary keys, which are also an integrity issue.
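Both checks can be sketched with the standard library. The record layout, the compared fields and the min_shared threshold are hypothetical; the pairwise comparison is quadratic in the number of records, so it only suits small datasets.

```python
from collections import Counter
from itertools import combinations


def duplicate_keys(records, key="id"):
    """Return primary-key values that appear more than once."""
    counts = Counter(r[key] for r in records)
    return sorted(k for k, n in counts.items() if n > 1)


def near_duplicates(records, fields, min_shared):
    """Return index pairs sharing at least min_shared, but not all, fields."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        shared = sum(a[f] == b[f] for f in fields)
        if min_shared <= shared < len(fields):
            pairs.append((i, j))
    return pairs


# Hypothetical customer records.
records = [
    {"id": 1, "name": "John Smith", "email": "john@example.com", "city": "Paris"},
    {"id": 2, "name": "John Smith", "email": "john@example.com", "city": "Lyon"},
    {"id": 2, "name": "Jane Doe", "email": "jane@example.com", "city": "Nice"},
]

print(duplicate_keys(records))
print(near_duplicates(records, ("name", "email", "city"), min_shared=2))
```

A pair returned by near_duplicates is only a candidate: deciding whether two records describe the same real-world object still requires knowledge that the data alone does not hold.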

9. Accuracy

Accuracy is perhaps the most intuitive dimension, but also one of the most costly to measure. It assesses whether data correctly represents a real-world object — comparing two fundamentally different things: data and reality. Confirming accuracy often requires reproducing the original collection process (inventory audits are a typical example). When direct verification is too costly, accuracy is often inferred from other dimensions, particularly consistency with trusted reference sources.
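When an audit sample is available, accuracy can be estimated as the share of audited items whose recorded value matches what was observed. The sketch below assumes a hypothetical inventory in which each item maps to a quantity; the item names and figures are invented for the example.

```python
def accuracy_rate(recorded: dict, audited: dict) -> float:
    """Share of audited items whose recorded value matches the audited value."""
    if not audited:
        raise ValueError("audit sample is empty")
    matches = sum(
        1 for item, observed in audited.items() if recorded.get(item) == observed
    )
    return matches / len(audited)


# Hypothetical stock levels: system records versus a physical recount.
recorded = {"bolt": 120, "nut": 80, "washer": 45}
audited = {"bolt": 120, "washer": 40}

print(accuracy_rate(recorded, audited))
```

The rate only describes the audited sample; extending it to the whole dataset assumes the sample is representative.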

An overview

These nine dimensions highlight the diversity of elements involved in measuring data quality: the data itself, data standards (types, formats), consistency rules between data elements, completeness requirements, collection and update dates, expected data values, and real-world objects.

The T&S Data Quality AI solution enables organizations to assess data quality across all these dimensions while minimizing prerequisite information. Starting solely from existing datasets, it can automatically infer expected formats and data types, expected relationships between fields, field completion requirements and expected value distributions. Some elements cannot be inferred from the data alone — such as collection dates, update dates or real-world objects — but when available through external sources (for example IoT sensors or connected devices), they can be incorporated to provide a comprehensive view of data quality.
