Using semantic web resources for data quality management

  • Authors:
  • Christian Fürber; Martin Hepp

  • Affiliations:
  • E-Business & Web Science Research Group, Neubiberg, Germany; E-Business & Web Science Research Group, Neubiberg, Germany

  • Venue:
  • EKAW'10: Proceedings of the 17th International Conference on Knowledge Engineering and Management by the Masses
  • Year:
  • 2010

Abstract

The quality of data is a critical factor for all kinds of decision-making and transaction processing. While there has been a lot of research on data quality in the past two decades, the topic has not yet received sufficient attention from the Semantic Web community. In this paper, we discuss (1) the data quality issues related to the growing amount of data available on the Semantic Web, (2) how data quality problems can be handled within the Semantic Web technology framework, namely using SPARQL on RDF representations, and (3) how Semantic Web reference data, e.g. from DBpedia, can be used to spot incorrect literal values and functional dependency violations. We show how this approach can be used for data quality management of public Semantic Web data and of data stored in relational databases in closed settings alike. As part of our work, we developed generic SPARQL queries to identify (1) missing datatype properties or literal values, (2) illegal values, and (3) functional dependency violations. We argue that using Semantic Web datasets substantially reduces the effort for data quality management. As a use case, we employ GeoNames, a publicly available Semantic Web resource for geographical data, as a trusted reference for managing the quality of other data sources.
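
The abstract does not reproduce the paper's generic queries, but a minimal SPARQL 1.1 sketch of two of the checks it mentions could look as follows. The ex: vocabulary, the graph URIs, and the choice of postal code and country code as checked values are illustrative assumptions, not the authors' actual queries; the GeoNames ontology properties are used only to indicate the role of a trusted reference source.

  # Sketch 1 (assumed ex: vocabulary): find instances of ex:City that lack a
  # postal code literal, i.e. a missing datatype property value.
  PREFIX ex: <http://example.org/schema#>

  SELECT ?city
  WHERE {
    ?city a ex:City .
    FILTER NOT EXISTS { ?city ex:postalCode ?postalCode }
  }

  # Sketch 2 (assumed graph URIs and vocabulary): flag records whose
  # city/country combination contradicts a trusted reference graph,
  # i.e. a violation of the functional dependency "city name -> country code".
  PREFIX ex: <http://example.org/schema#>
  PREFIX gn: <http://www.geonames.org/ontology#>

  SELECT ?record ?cityName ?localCountry ?refCountry
  WHERE {
    GRAPH <http://example.org/local-data> {
      ?record ex:cityName    ?cityName ;
              ex:countryCode ?localCountry .
    }
    GRAPH <http://example.org/reference/geonames> {
      ?place gn:name        ?cityName ;
             gn:countryCode ?refCountry .
    }
    FILTER (?localCountry != ?refCountry)
  }

These are two separate queries to be run against an RDF store that holds the local data and the reference data in named graphs; the paper's actual generic queries and the full set of covered problem types are described in the full text.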