The range of information now available in queryable repositories opens up a host of possibilities for new and valuable forms of data analysis. Database query languages such as SQL and XQuery offer a concise, high-level means of implementing such analyses, and make it easy to extract relevant data subsets into either generic or bespoke data analysis environments. Unfortunately, the quality of data in these repositories is often highly variable. The data is still useful, but only if the consumer is aware of the data quality problems and can work around them. Standard query languages offer little support for this aspect of data management. In principle, however, it should be possible to embed constraints describing the consumer’s data quality requirements directly into the query, so that the query evaluator takes over responsibility for enforcing them during query processing. Most previous attempts to incorporate information quality constraints into database queries have been based on a small number of highly generic quality measures, which are defined and computed by the information provider. This is a useful approach in some application areas but, in practice, quality criteria are more commonly determined by the user of the information, not by the provider. In this article, we explore an approach to incorporating quality constraints into database queries in which the definition of quality is set by the user of the information rather than by its provider. Our approach is based on the concept of a quality view, a configurable quality assessment component into which domain-specific notions of quality can be embedded. We examine how quality views can be incorporated into XQuery, and derive from this the language features that are required in general to embed quality views into any query language. We also propose some syntactic sugar on top of XQuery to simplify the process of querying with quality constraints.
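To make the idea of a user-defined quality constraint concrete, the following is a minimal sketch in Python rather than XQuery. It is not the language extension proposed in the article. The `QualityView` class, the `query` function, the `completeness` measure, and the sample records are all hypothetical names and data introduced purely for illustration. The sketch assumes a quality view can be modelled as a user-supplied scoring function plus an acceptance threshold, which the query evaluator applies while it processes records.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional

Record = Dict[str, object]


@dataclass
class QualityView:
    """Hypothetical stand-in for a quality view: a user-configured
    quality measure together with the minimum score the user accepts."""
    name: str
    score: Callable[[Record], float]  # domain-specific measure chosen by the user
    threshold: float                  # records scoring below this are rejected

    def accepts(self, record: Record) -> bool:
        return self.score(record) >= self.threshold


def query(records: Iterable[Record],
          predicate: Callable[[Record], bool],
          view: Optional[QualityView] = None) -> List[Record]:
    """Evaluate a simple selection query. If a quality view is given,
    the evaluator enforces it during processing, so the caller does not
    have to post-filter the results."""
    results = []
    for record in records:
        if not predicate(record):
            continue
        if view is not None and not view.accepts(record):
            continue
        results.append(record)
    return results


def completeness(record: Record) -> float:
    """Illustrative quality measure: the fraction of fields that are not None."""
    values = list(record.values())
    if not values:
        return 0.0
    return sum(v is not None for v in values) / len(values)


# Made-up sample data, not taken from any real repository.
RECORDS: List[Record] = [
    {"id": 1, "gene": "BRCA1", "organism": "human", "evidence": "curated"},
    {"id": 2, "gene": "TP53", "organism": None, "evidence": None},
    {"id": 3, "gene": "BRCA1", "organism": "human", "evidence": None},
]

# Two users with different quality requirements over the same data.
strict = QualityView("complete-records", completeness, threshold=1.0)
lenient = QualityView("mostly-complete", completeness, threshold=0.75)

if __name__ == "__main__":
    is_brca1 = lambda r: r["gene"] == "BRCA1"
    print([r["id"] for r in query(RECORDS, is_brca1, strict)])
    print([r["id"] for r in query(RECORDS, is_brca1, lenient)])
```

The same query yields different answers under the two views, which is the point of letting the consumer, rather than the provider, define what counts as acceptable quality. A real embedding in a query language would also need ways to bind a view to a data source and to expose quality scores to the rest of the query, which this sketch leaves out.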