Improving Enterprise Data Quality
The term “data quality” may sound rather inexact to some laypersons. After all, in many areas of life and work, “quality” is a kind of personal, subjective judgment. But in terms of organizational data, “quality” has very specific meanings — and although certain dimensions of quality are subject to personal interpretation, most of them are objectively derived.
For example, data validity, conformance, completeness, integrity, and accuracy are all quantitative metrics that can be measured at a very granular level, right down to each physical instance of the data. Others, such as timeliness and availability to consumers of the data, are broader and measure an entire business data element (BDE).
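To make the idea of instance-level measurement concrete, here is a minimal sketch in Python of how completeness and validity might be scored for a single data element. The function names, the sample postal codes, and the five-digit rule are all illustrative assumptions, not definitions taken from any particular tool or standard.

```python
import re
from typing import Callable, List, Optional


def _present(values: List[Optional[str]]) -> List[str]:
    """Keep only values that are neither None nor blank."""
    return [v for v in values if v is not None and v.strip() != ""]


def completeness(values: List[Optional[str]]) -> float:
    """Fraction of physical instances that are populated."""
    if not values:
        return 0.0
    return len(_present(values)) / len(values)


def validity(values: List[Optional[str]], is_valid: Callable[[str], bool]) -> float:
    """Fraction of populated instances that pass a validation rule."""
    present = _present(values)
    if not present:
        return 0.0
    return sum(1 for v in present if is_valid(v)) / len(present)


def is_five_digit_code(value: str) -> bool:
    """Hypothetical rule: a valid postal code is exactly five digits."""
    return re.fullmatch(r"\d{5}", value.strip()) is not None


# Made-up sample: five physical instances of one data element.
postal_codes = ["02139", "1234", None, "", "90210"]

zip_completeness = completeness(postal_codes)             # 3 of 5 are populated
zip_validity = validity(postal_codes, is_five_digit_code)  # 2 of 3 populated pass
```

Because each score is simply a count over individual instances, the same numbers can be rolled up by table, system, or business data element as needed.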
No two organizations are exactly the same, nor are the specific ways they create, share, and transform data. With that being said, there are several fundamental steps for improving data quality.
- Establish what your data quality process will be — this should always start with profiling existing data. Unfortunately, this is where most data quality teams stop. Instead, the profiling phase should be followed by a robust and comprehensive data assessment process that uses the outcomes of the profiling process to determine what outliers and anomalies exist.
- If you don’t already have data improvement tools in place, it is important to conduct experiments with the various tools available to select the one that will best fit your organization’s needs. At Data Clairvoyance, we’re essentially agnostic about these tools, because we believe they all have their pros and cons. This is why we recommend using a Design-of-Experiments (DoE) framework to ensure that your experiments result in more than just understanding how a certain technology works. A well-designed DoE framework will measure the effectiveness of various combinations of people, roles, and technologies — all in service of a defined data improvement process.
- Develop analytical mechanisms to trend aggregated quality scores into visualizations that provide control-chart-like capabilities. This will allow your organization to act proactively on data quality scores that are trending toward a critical deviation limit. More importantly, it gives your executives a compelling reason to become interested in, and supportive of, your efforts.
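The control-chart idea in the last step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it borrows the common Shewhart convention of placing the critical limit three standard deviations below a baseline mean, adds a hypothetical two-sigma warning band to catch scores drifting toward that limit, and treats higher scores as better, so only the lower side is checked. The baseline values and function names are invented for the example.

```python
from statistics import mean, stdev
from typing import List, Tuple


def lower_limits(baseline: List[float],
                 warn_sigmas: float = 2.0,
                 crit_sigmas: float = 3.0) -> Tuple[float, float, float]:
    """Return (center line, warning limit, critical limit) for a baseline.

    Assumes at least two baseline scores, since stdev() needs two points.
    """
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    return center, center - warn_sigmas * spread, center - crit_sigmas * spread


def classify(score: float, baseline: List[float]) -> str:
    """Label a new aggregated quality score against the baseline limits."""
    _, warning_limit, critical_limit = lower_limits(baseline)
    if score < critical_limit:
        return "critical"
    if score < warning_limit:
        return "warning"
    return "ok"


# Made-up weekly completeness scores used as the baseline period.
baseline_scores = [0.96, 0.97, 0.95, 0.96, 0.97, 0.95, 0.96]

# New scores arriving after the baseline period.
labels = [classify(s, baseline_scores) for s in (0.96, 0.94, 0.92)]
# labels == ["ok", "warning", "critical"]
```

A "warning" label is the signal to act before the score crosses the critical limit, and a chart of the scores with both limits drawn on it is the kind of visualization that tends to get an executive's attention.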
Important tip: always be looking for funding, because data quality work can become very expensive over time. We recommend pursuing a strategy similar to that used by companies seeking venture capital. Here, of course, you’re seeking internal sources of funding, so the key is to gather recent (and relevant) data quality scores and then share them with select executives and data element consumers. The better you are at this process, the greater the likelihood that one of those individuals will take action when they see a bad score (signifying data that is, for whatever reason, incomplete, misleading, or wrong) for which they are accountable. They won’t want that score to remain bad, and suddenly you have some capital to begin fixing the problems you’ve uncovered.
One overarching point to emphasize is that data quality tools alone will not solve all of your problems. To assume otherwise is a critical mistake, because it is all too easy to spend all your capital on a promised quick fix and receive little value in return. An even greater danger is that the disappointing outcomes usually won’t be felt until years after the euphoria of “buying the best tool” has faded.
A key to improving your data quality is to establish an Office of Data (led by a Chief Data Officer, or CDO), representing a new type of function within your organization. Because data is such a valuable commodity, your organization needs this function to serve an essential, strategic role as the “broker” of data between its creators and its users.
The Office should be tasked with data governance, metadata management, data quality, and data architecture. Today, these responsibilities might reside across multiple functions within your organization. In addition, the Office of Data can play a vital role in steering a more systematic data improvement process.
Where to start?
Most organizations initially lack the internal skills and processes to plan and execute an enterprise-wide data quality improvement effort. To build the internal capabilities, you’ll generally need to hire or re-allocate resources, and then wait for them to mature. There are ways to accelerate that maturation, but there will certainly be a delay before they are up to speed and ready to begin producing a significant impact.
For many organizations, a better solution is to hire a firm skilled in data quality profiling and improvement. Even then, an acceptable ROI is still years down the road, although the path is generally more predictable and carries less risk.