AmbySoft.com

Current State of Data Quality: 2018 Open Research

This open research into the current state of data quality was performed during April 2018, and there were 99 respondents. The survey was advertised on Twitter (@scottwambler), on the LinkedIn Disciplined Agile Discussion Forum, and on the Ambysoft IT Surveys page.

The Survey Results

Here are some interesting findings:

  1. Only 88% of respondents considered data to be a corporate asset. This is down from 96% in 2006.
  2. 93% of respondents indicated that their organization had data quality problems.
  3. Of the organizations that had data quality problems, 27% had no strategy to fix them and 29% said they hope it won’t get worse (it will, due to entropy).
  4. Many data professionals have still not become adept at modern data quality techniques such as database refactoring and database testing.
  5. It is very common for application developers to use existing legacy data.

 

Downloads

Survey questions: The Survey

Survey Data File: Raw Data

Survey Presentation: Summary Presentation

 

What You May Do With This Information

You may use this data as you see fit, but may not sell it in whole or in part. You may publish summaries of the findings, but if you do so you must reference the survey accordingly (include the name and the URL to this page). Feel free to contact me with questions. Better yet, if you publish, please let me know so I can link to your work.

 

Discussion of the Results

    1. It is very difficult to survey data professionals compared with others in the IT industry. This has been the case since I first started running these studies in 2006.
    2. There are significant data quality problems within many organizations, yet many organizations do not have a viable strategy for addressing them.
    3. The earlier and more often you test your database during the development lifecycle, the greater the data quality.
    4. A collaborative approach to data standards/guidelines is more effective than a command-and-control approach, which, in turn, is better than no approach at all.
    5. A large percentage of organizations struggle to evolve their database schema in a timely manner, thereby reducing their competitiveness.
    6. This survey suffers from the fundamental challenges faced by all surveys.
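
To make the point about database testing more concrete, here is a minimal sketch of what an automated data quality test could look like. It is an illustration only, not something taken from the survey: the `customer` table, its columns, the three rules, and the helper names `build_sample_db` and `find_quality_violations` are all hypothetical, and it uses an in-memory SQLite database so that it can run anywhere.

```python
import sqlite3


def build_sample_db():
    """Create a throwaway in-memory database with a hypothetical customer table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT, balance REAL)"
    )
    conn.executemany(
        "INSERT INTO customer (email, balance) VALUES (?, ?)",
        [
            ("a@example.com", 10.0),
            ("a@example.com", 5.0),   # same email as the row above
            (None, 0.0),              # email is missing
            ("b@example.com", -3.0),  # balance is negative
        ],
    )
    return conn


def find_quality_violations(conn):
    """Return a dict mapping each (assumed) rule name to its violation count."""
    checks = {
        # Rows whose email is NULL or blank.
        "missing_email": (
            "SELECT COUNT(*) FROM customer "
            "WHERE email IS NULL OR TRIM(email) = ''"
        ),
        # Distinct email values that appear on more than one row.
        "duplicate_email": (
            "SELECT COUNT(*) FROM ("
            "SELECT email FROM customer WHERE email IS NOT NULL "
            "GROUP BY email HAVING COUNT(*) > 1)"
        ),
        # Rows whose balance is below zero.
        "negative_balance": "SELECT COUNT(*) FROM customer WHERE balance < 0",
    }
    return {name: conn.execute(sql).fetchone()[0] for name, sql in checks.items()}


if __name__ == "__main__":
    violations = find_quality_violations(build_sample_db())
    for rule, count in violations.items():
        print(f"{rule}: {count}")
```

Checks of this kind could be run on every build, which is one way to test the database early and often in the lifecycle. The specific rules would, of course, depend on your own schema and data standards.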

 

Why Share This Much Information?

I’m sharing the results, and in particular the source data, of my surveys for several reasons:

  1. Other people can do a much better job of analysis than I can. If they publish online, I am more than happy to include links to their articles/papers.
  2. Once I’ve published my column summarizing the data in DDJ, I really don’t have any reason not to share the information.
  3. I think that it’s a good thing to do and I invite others to do the same.