On the foundations of probabilistic information integration

Authors:
Fereidoon Sadri
Affiliations:
University of North Carolina at Greensboro, Greensboro, NC, USA
Venue:
Proceedings of the 21st ACM international conference on Information and knowledge management
Year:
2012

Abstract

Information integration has been a subject of research for several decades and remains a very active research area. Many new applications depend on or benefit from large-scale integration. Examples include large research projects in the life sciences, the need for data sharing among government agencies, the reliance of corporations on business intelligence (which requires integrating data from many heterogeneous sources), and the integration of information on the web. The importance of information integration with uncertainty has been recognized in recent years. Frequently, information from multiple sources is uncertain and possibly inconsistent. Further, the process of integration often depends on approximate schema mappings, another source of uncertainty. An integration system is useful only to the extent that the information it produces can be trusted. Hence, providing a measure of certainty for integrated information is crucial in many applications. In this paper we study the problem of integrating uncertain information. We present a simple and intuitive approach to the representation and integration of uncertain information from multiple sources, and show that our integration approach coincides with a recent formalism for uncertain information integration. We extend the model to probabilistic possible worlds, and show that certain unintuitive constraints are imposed on the probabilities of the possible worlds of sources. In particular, we show that the probabilities of the possible worlds of a source are not independent; rather, they depend on the probabilities of other sources. We study the problem of determining the probabilities for the result of integration. Finally, we present a practical approach to relaxing probabilistic constraints in integration.