Managerial decision support with knowledge of accuracy and completeness of the relational aggregate functions

Authors:
Amir Parssian
Affiliations:
University of Illinois at Springfield, One University Plaza, Springfield
Venue:
Decision Support Systems
Year:
2006



Abstract

Managers use the aggregate data produced by decision support systems to run and improve their firms' operations. Data residing in corporate databases and data warehouses are often far from perfect, and these imperfections affect decision quality and outcomes. Knowing how data errors affect aggregate data can therefore lead to better-informed decisions, reduced risk, and competitive advantage. In this paper, we present a methodology for estimating the effects of two important data quality dimensions, accuracy and completeness, on the relational aggregate functions Count, Sum, Average, Max, and Min. Our methodology defines a set of attribute value types and uses sampling strategies to determine the maximum likelihood estimate of each value type. We show the effect of data error rates on the scalar values returned by the aggregate functions and demonstrate the efficiency of our estimates through Monte Carlo simulations.
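
The abstract does not spell out the value types or the estimators, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes three hypothetical value types ("accurate", "inaccurate", "mismember"), treats the sample relative frequencies as the maximum likelihood estimates of their proportions, and uses them to correct a stored Count. Missing tuples (completeness) are not modeled here. All names, rates, and sizes are illustrative assumptions.

```python
import random
from collections import Counter

# Hypothetical value types; the paper defines its own set, which may differ.
VALUE_TYPES = ("accurate", "inaccurate", "mismember")


def mle_proportions(sample_labels):
    """Return the multinomial MLE of each value type's proportion.

    For a simple random sample, the MLE is the sample relative frequency.
    """
    n = len(sample_labels)
    counts = Counter(sample_labels)
    return {t: counts[t] / n for t in VALUE_TYPES}


def estimate_true_count(stored_count, proportions):
    """Estimate how many stored values really belong in the relation.

    Assumption: "mismember" values should not be counted at all, while
    "inaccurate" values still belong but carry a wrong attribute value.
    """
    return stored_count * (1.0 - proportions["mismember"])


def simulate(stored_count=10_000, sample_size=400, trials=200, seed=0):
    """Monte Carlo check: average the Count estimate over repeated samples."""
    rng = random.Random(seed)
    n_mismember = stored_count // 20   # assumed 5% mismembers
    n_inaccurate = stored_count // 10  # assumed 10% inaccurate values
    labels = (
        ["mismember"] * n_mismember
        + ["inaccurate"] * n_inaccurate
        + ["accurate"] * (stored_count - n_mismember - n_inaccurate)
    )
    true_count = stored_count - n_mismember
    estimates = []
    for _ in range(trials):
        sample = rng.sample(labels, sample_size)  # sampling without replacement
        estimates.append(estimate_true_count(stored_count, mle_proportions(sample)))
    return true_count, sum(estimates) / trials


if __name__ == "__main__":
    true_count, mean_estimate = simulate()
    print(f"true Count: {true_count}, mean estimated Count: {mean_estimate:.1f}")
```

Under these assumptions, the mean estimate over many trials should land close to the true Count, which is the kind of check a Monte Carlo study of such an estimator performs. Sum and Average would additionally need an estimate of the size of the errors in inaccurate values, which this sketch leaves out.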