Data quality describes the correctness, relevance and reliability of data, as well as its consistency and availability across different computer systems. In information technology, data quality is one of the most important criteria for assessing an IT system.
Data quality is evaluated against many different criteria. The most important ones relate to the system, the presentation of the data, its use and its content:
- System: accessibility, editability
- Presentation: comprehensibility, clarity, consistency, interpretability
- Usage: relevance, timeliness, value added, completeness, appropriateness
- Content: accuracy (freedom from error), objectivity, credibility, reputation
In 2007, the German Society for Information and Data Quality (DGIQ) published a user-oriented definition that builds on the 1996 research findings of Richard Wang and Diane Strong. It rests on the 15 dimensions of information quality identified by Wang and Strong, which are shown in the diagram. According to this study, almost half of the companies see good data quality as a success factor and a competitive advantage and attach great importance to the topic.
In practice, data quality is measured by how consistent and available the same data is across different systems, how cleanly the existing data has been stored, and how well its presentation matches the expected results. Data quality must be monitored in the acquisition system itself, but also during the transfer to target systems (e.g. data warehouse systems) and in alternative representations (e.g. evaluations). For this purpose, suitable quality assurance measures must be implemented at each stage of the data flow process.
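As a rough illustration of such a quality assurance measure, the following Python sketch compares the same records in a source system and a target system and reports simple consistency and completeness figures. The function name `check_transfer`, the `QualityReport` structure, the field names and the sample records are all hypothetical and chosen only for this example. Real systems would typically read from databases rather than in-memory dictionaries.

```python
from dataclasses import dataclass


@dataclass
class QualityReport:
    missing_in_target: list  # IDs present in the source but absent in the target
    mismatched: list         # IDs whose records differ between the two systems
    incomplete: list         # source IDs with an empty required field
    consistency: float       # share of source records found unchanged in the target


def check_transfer(source: dict, target: dict, required: tuple) -> QualityReport:
    """Compare records keyed by ID between a source and a target system."""
    missing = sorted(k for k in source if k not in target)
    mismatched = sorted(
        k for k in source if k in target and source[k] != target[k]
    )
    incomplete = sorted(
        k for k, rec in source.items()
        if any(rec.get(field) in (None, "") for field in required)
    )
    total = len(source)
    unchanged = total - len(missing) - len(mismatched)
    consistency = unchanged / total if total else 1.0
    return QualityReport(missing, mismatched, incomplete, consistency)


# Hypothetical sample data: an acquisition system and a data warehouse copy.
source = {
    1: {"name": "Ada", "city": "London"},
    2: {"name": "Grace", "city": ""},
    3: {"name": "Alan", "city": "Wilmslow"},
    4: {"name": "Edsger", "city": "Nuenen"},
}
target = {
    1: {"name": "Ada", "city": "London"},
    2: {"name": "Grace", "city": ""},
    3: {"name": "Alan", "city": "Manchester"},
}

report = check_transfer(source, target, required=("name", "city"))
print(report)
```

With this sample data, record 4 is missing in the target, record 3 differs between the systems, and record 2 is incomplete in the source, so half of the source records arrive unchanged. A check of this kind could be run after each transfer step, so that problems are caught at the stage where they arise.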