The quality of data in relational databases is often uncertain, and the relationship between the quality of the underlying base tables and the quality of the query results that could be produced from them, a type of information product (IP), has not been fully investigated. This paper provides a basis for the systematic analysis of the quality of such IPs. We use the relational algebra framework to estimate the quality of query results from quality estimates of samples taken from the base tables. Our procedure requires initial samples from the base tables; these samples are then used for all possible IPs. Each specific query governs the quality assessment of the relevant samples. Because the same samples are reused across queries, our approach is relatively cost-effective. We introduce the Reference-Table Procedure, which can be used for quality estimation in general. In addition, for each of the basic algebraic operators, we discuss simpler procedures that may be applicable. Special attention is devoted to the Join operation. We examine various relevant statistical issues, including how to deal with the impact of missing rows in base tables on quality. Finally, we address several implementation issues related to sampling.
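To make the idea of reusing one sample across queries concrete, the following is a minimal sketch for the simplest case, a selection. It is not the Reference-Table Procedure itself, whose details the abstract does not give. It assumes that each sampled row has already been audited and carries a hypothetical boolean field `acceptable`; the function names, the field name, and the plain binomial standard error (with no finite-population correction) are all illustrative assumptions.

```python
import math
import random


def draw_sample(table, n, seed=0):
    """Draw a simple random sample without replacement from a base table.

    `table` is a list of dict rows; the seed only makes the sketch repeatable.
    """
    rng = random.Random(seed)
    return rng.sample(table, n)


def estimate_selection_quality(sample, predicate):
    """Estimate the share of acceptable rows in the result of a selection.

    Only the sampled rows that satisfy the query predicate are used, so the
    query decides which part of the audited sample is relevant.
    Returns (estimate, standard_error, rows_used); the first two are None
    when no sampled row satisfies the predicate.
    """
    relevant = [row for row in sample if predicate(row)]
    m = len(relevant)
    if m == 0:
        return None, None, 0
    # "acceptable" is a hypothetical audit flag attached to each sampled row.
    p_hat = sum(1 for row in relevant if row["acceptable"]) / m
    # Plain binomial standard error; an assumption made for this sketch.
    std_err = math.sqrt(p_hat * (1 - p_hat) / m)
    return p_hat, std_err, m


if __name__ == "__main__":
    # Synthetic base table, for illustration only.
    rng = random.Random(42)
    table = [
        {"region": rng.choice(["east", "west"]), "acceptable": rng.random() < 0.9}
        for _ in range(1000)
    ]
    sample = draw_sample(table, 200)
    # The same audited sample serves two different queries.
    print(estimate_selection_quality(sample, lambda r: r["region"] == "east"))
    print(estimate_selection_quality(sample, lambda r: r["region"] == "west"))
```

The audit cost is paid once, when the sample is labeled; each later query only filters the labeled rows. Joins need more care than this sketch suggests, since a join of two samples is not in general a simple random sample of the join result.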