Aggregation of Imprecise and Uncertain Information in Databases

Authors:
S. McClean; B. Scotney; M. Shapcott
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2001

Citing 13
Cited 21

Evaluating aggregates in possibilistic relational databases

Data & Knowledge Engineering
Framework for query optimization in distributed statistical databases

Information and Software Technology
A universal-scheme approach to statistical databases containing homogeneous summary tables

ACM Transactions on Database Systems (TODS)
Answering heterogeneous database queries with degrees of uncertainty

Distributed and Parallel Databases
Finding interesting rules from large sets of discovered association rules

CIKM '94 Proceedings of the third international conference on Information and knowledge management
Generalized union and project operations for pooling uncertain and imprecise information

Data & Knowledge Engineering
Fast discovery of association rules

Advances in knowledge discovery and data mining
Optimal and efficient integration of heterogeneous summary tables in a distributed database

Data & Knowledge Engineering
Resolving Database Incompatibility: An Approach to Performing Relational Operations over Mismatched Domains

IEEE Transactions on Knowledge and Data Engineering
The Management of Probabilistic Data

IEEE Transactions on Knowledge and Data Engineering
Evaluating Aggregate Operations Over Imprecise Data

IEEE Transactions on Knowledge and Data Engineering
Current Approaches to Handling Imperfect Information in Data and Knowledge Bases

IEEE Transactions on Knowledge and Data Engineering
Designing a Kernel for Data Mining

IEEE Expert: Intelligent Systems and Their Applications

A Scalable Approach to Integrating Heterogeneous Aggregate Views of Distributed Databases

IEEE Transactions on Knowledge and Data Engineering
Temporal Probabilistic Concepts from Heterogeneous Data Sequences

Soft-Ware 2002 Proceedings of the First International Conference on Computing in an Imperfect World
Learning with Concept Hierarchies in Probabilistic Relational Data Mining

WAIM '02 Proceedings of the Third International Conference on Advances in Web-Age Information Management
Conceptual Clustering of Heterogeneous Sequences via Schema Mapping

ISMIS '02 Proceedings of the 13th International Symposium on Foundations of Intelligent Systems
Conceptual Clustering of Heterogeneous Gene Expression Sequences

Artificial Intelligence Review
Database aggregation of imprecise and uncertain evidence

Information Sciences—Informatics and Computer Science: An International Journal - special issue: Knowledge discovery from distributed information sources
OLAP over uncertain and imprecise data

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Learning accurate and concise naïve Bayes classifiers from attribute value taxonomies and data

Knowledge and Information Systems
OLAP over uncertain and imprecise data

The VLDB Journal — The International Journal on Very Large Data Bases
Efficient aggregation algorithms for probabilistic data

SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms
Data mining research for customer relationship management systems: a framework and analysis

International Journal of Business Information Systems
Integrating semantically heterogeneous aggregate views of distributed databases

Distributed and Parallel Databases
Generating efficient safe query plans for probabilistic databases

Data & Knowledge Engineering
Estimating and bounding aggregations in databases with referential integrity errors

Proceedings of the ACM 11th international workshop on Data warehousing and OLAP
Context reasoning using extended evidence theory in pervasive computing environments

Future Generation Computer Systems
Extended aggregations for databases with referential integrity issues

Data & Knowledge Engineering
Knowledge discovery from semantically heterogeneous aggregate databases using model-based clustering

BNCOD'07 Proceedings of the 24th British national conference on Databases
Efficiently computing and querying multidimensional OLAP data cubes over probabilistic relational data

ADBIS'10 Proceedings of the 14th east European conference on Advances in databases and information systems
Algorithms and software for collaborative discovery from autonomous, semantically heterogeneous, distributed information sources

ALT'05 Proceedings of the 16th international conference on Algorithmic Learning Theory
Learning ontology-aware classifiers

DS'05 Proceedings of the 8th international conference on Discovery Science
The ordered multiplicative modular geometric operator

Knowledge-Based Systems

Abstract

Information stored in a database is often subject to uncertainty and imprecision. Probability theory provides a well-known and well-understood way of representing uncertainty and may thus be used to provide a mechanism for storing uncertain information in a database. We consider the problem of aggregation using an imprecise probability data model that allows us to represent imprecision by partial probabilities and uncertainty by probability distributions. Most work to date has concentrated on extending the relational algebra so that traditional queries can be executed on uncertain or imprecise data. However, for imprecise and uncertain data, we often require aggregation operators that provide information on patterns in the data. Thus, while traditional query processing is tuple-driven, processing of uncertain data is often attribute-driven, where we use aggregation operators to discover attribute properties. The aggregation operator that we define uses the Kullback-Leibler information divergence between the aggregated probability distribution and the individual tuple values to provide a probability distribution for the domain values of an attribute or group of attributes. The provision of such aggregation operators is a central requirement in furnishing a database with the capability to perform the operations necessary for knowledge discovery in databases.
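
As a rough illustration of the idea in the abstract, the following Python sketch aggregates tuple-level distributions into a single distribution over an attribute's domain. It is a minimal sketch under stated assumptions, not the operator defined in the paper: the function names, the dictionary-based data layout, and the EM-style update used for imprecise (subset-valued) tuples are all illustrative choices. For fully specified tuples, the distribution p that minimises the summed divergence D(q_r || p) is the arithmetic mean of the tuple distributions q_r; for tuples whose probability mass sits on subsets of the domain, the sketch uses a standard EM-style reallocation of each subset's mass in proportion to the current estimate.

```python
import math


def kl_divergence(q, p):
    """Kullback-Leibler divergence D(q || p) for dicts over the same domain.

    Terms with q[v] == 0 contribute nothing. Assumes p[v] > 0 wherever q[v] > 0.
    """
    return sum(qv * math.log(qv / p[v]) for v, qv in q.items() if qv > 0)


def aggregate_precise(tuples):
    """Aggregate fully specified distributions (hypothetical helper).

    The minimiser of sum_r D(q_r || p) over distributions p is the
    arithmetic mean of the q_r, which is what this returns.
    """
    n = len(tuples)
    return {v: sum(t[v] for t in tuples) / n for v in tuples[0]}


def aggregate_imprecise(tuples, domain, iterations=200):
    """EM-style aggregation for imprecise tuples (an assumed scheme).

    Each tuple maps a frozenset of domain values to the probability mass
    assigned to that subset; the masses of a tuple sum to 1.
    """
    p = {v: 1.0 / len(domain) for v in domain}  # start from a uniform estimate
    n = len(tuples)
    for _ in range(iterations):
        new_p = {v: 0.0 for v in domain}
        for t in tuples:
            for subset, mass in t.items():
                if mass == 0:
                    continue
                total = sum(p[v] for v in subset)
                # Share the subset's mass among its members in proportion
                # to the current estimate.
                for v in subset:
                    new_p[v] += mass * p[v] / total
        p = {v: new_p[v] / n for v in domain}
    return p


if __name__ == "__main__":
    precise = [{"a": 0.8, "b": 0.2}, {"a": 0.4, "b": 0.6}]
    print(aggregate_precise(precise))

    imprecise = [
        {frozenset({"a"}): 1.0},
        {frozenset({"b"}): 1.0},
        {frozenset({"a", "b"}): 1.0},  # imprecise: value is "a" or "b"
    ]
    print(aggregate_imprecise(imprecise, ["a", "b", "c"]))
```

When every subset is a singleton, the imprecise routine reduces to the same arithmetic mean as the precise one, which is why the two are shown side by side.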