Where do I fix my dirty data?

Anyone responsible for data in our industry has a lot of decisions to make. If you look at the responsibilities of the Chief Data Officer (CDO), one of the predominant ones is data quality. One of the questions we often hear when talking with companies is “Where do we clean the data?” It is a sound question, because the answer is neither simple nor universal. In short, a company has three choices: clean the data in the source systems, clean it in a universal, combined hub, or not clean it at all. Those that choose the third option will not be around for much longer.

"The CDO must determine the company’s current data quality and maturity levels – of which there are five. (1) Uncertainty, which typically involves the organization stumbling over data defects as programs crash and employees complain. There’s no proactive improvement process in place. (2) Awakening, during which a few individuals acknowledge the dirty data and try to incorporate quality in their projects before formal enterprise-wide support arrives. (3) Enlightenment is when the organization starts to address the root causes of dirty data through program edits and data quality training. A data quality group usually emerges here. (4) Wisdom arrives as the organization proactively works on preventing future data defects, and data quality incentives arrive. (5) Certainty emerges as the organization shifts to an optimization cycle – continuously monitoring and improving its data defect-prevention process.”

This quote from a prominent analyst firm gives a good view of how the market sees the CDO’s responsibilities for data quality.


Please click below to read the full whitepaper.