
Current State of Data Quality: 2016 Open Research

This open research into the current state of data quality was performed during June 2016 and there were 54 respondents. The survey was advertised on Twitter (@scottwambler), on the LinkedIn Disciplined Agile Discussion Forum, and on the Ambysoft IT Surveys page.

The Survey Results

The following are some of the results of the survey:

  1. Only 87% of respondents work in organizations where data is considered a corporate asset.
  2. 92% of respondents indicated that their organization had data quality problems.
  3. Of the respondents with data quality problems, 38% worked in organizations without a strategy to address the problems, 24% hoped not to make the problems worse, and 6% hoped to rewrite and release all applications at once.
  4. Only 24% of respondents indicated that developers are provided training in data skills.
  5. Only 34% of respondents indicated that data professionals receive training in development skills.
  6. 52% of respondents indicated that application development teams choose to avoid their organization’s data group.
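
With only 54 respondents, it helps to see how much uncertainty sits behind each percentage. The following is a minimal sketch, not part of the original analysis, that estimates a 95% margin of error for the whole-sample figures above using the normal approximation. It assumes a simple random sample and that all 54 respondents answered every question, neither of which is confirmed by the survey, so treat the output as a rough illustration only. The function and variable names are hypothetical.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Half-width of a normal-approximation confidence interval.

    p is the reported share (0..1), n the sample size, and z the critical
    value (1.96 corresponds to roughly 95% confidence).
    """
    return z * math.sqrt(p * (1.0 - p) / n)


# Sample size taken from the survey description. ASSUMPTION: every
# respondent answered every question, which the page does not state.
N_RESPONDENTS = 54

# Whole-sample percentages copied from the results list above.
# Item 3 is left out because it applies to a subset of unknown size.
REPORTED_SHARES = {
    "data is considered a corporate asset": 0.87,
    "organization has data quality problems": 0.92,
    "developers receive training in data skills": 0.24,
    "data professionals receive training in development skills": 0.34,
    "development teams avoid the data group": 0.52,
}


def summarize(shares, n):
    """Return {label: margin of error in percentage points, 1 decimal}."""
    return {
        label: round(100 * margin_of_error(p, n), 1)
        for label, p in shares.items()
    }


if __name__ == "__main__":
    for label, moe in summarize(REPORTED_SHARES, N_RESPONDENTS).items():
        share = round(100 * REPORTED_SHARES[label])
        print(f"{share}% +/- {moe} points: {label}")
```

Under these assumptions the margins come out at roughly 7 to 13 percentage points. Because respondents were self-selected through Twitter, LinkedIn, and the Ambysoft site, the real uncertainty is likely larger than this calculation suggests.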



The Survey Questions

Raw Data

Summary Presentation


What You May Do With This Information

You may use this data as you see fit, but may not sell it in whole or in part. You may publish summaries of the findings, but if you do so you must reference the survey accordingly (include the name and the URL to this page). Feel free to contact me with questions. Better yet, if you publish, please let me know so I can link to your work.


Discussion of the Results

  1. It is very difficult to survey data professionals compared with others in the IT industry. This has been the case since I first started running these studies in 2006.
  2. There are significant data quality problems within many organizations, yet many organizations do not have a viable strategy for addressing them.
  3. The earlier, and more often, that you test your database in the development lifecycle, the greater the data quality.
  4. A collaborative approach to data standards/guidelines is more effective than a command-and-control approach, which, in turn, is better than no approach at all.
  5. A large percentage of organizations struggle to evolve their database schema in a timely manner, thereby reducing their competitiveness.
  6. Evolutionary/agile approaches to data modeling are just as effective as traditional approaches, and both approaches are correlated with improved data quality.
  7. Database service-level agreements (SLAs) are correlated with improved data quality.
  8. This survey suffers from the fundamental challenges faced by all surveys.


Why Share This Much Information?

I’m sharing the results, and in particular the source data, of my surveys for several reasons:

  1. Other people can do a much better job of analysis than I can. If they publish online, I am more than happy to include links to their articles/papers.
  2. Once I’ve published my column summarizing the data in DDJ, I really don’t have any reason not to share the information.
  3. I think that it’s a good thing to do and I invite others to do the same.