Automatic generation of probabilistic relationships for improving schema matching

Authors:
Laura Po;Serena Sorrentino
Affiliations:
II Department, University of Modena and Reggio Emilia, Via Vignolese 905, Modena 41125, Italy;II Department, University of Modena and Reggio Emilia, Via Vignolese 905, Modena 41125, Italy
Venue:
Information Systems
Year:
2011



Abstract

Schema matching is the problem of finding relationships among concepts across data sources that are heterogeneous in format and structure. Starting from the "hidden meaning" associated with schema labels (i.e., class and attribute names), it is possible to discover lexical relationships among the elements of different schemata. In this work, we propose an automatic method for discovering probabilistic lexical relationships in the context of "on the fly" data integration. The method is based on a probabilistic lexical annotation technique, which automatically associates one or more meanings with each schema element with respect to a thesaurus or lexical resource. However, the accuracy of automatic lexical annotation on real-world schemata suffers from the abundance of non-dictionary words such as compound nouns and abbreviations. We address this problem with a schema label normalization step, which increases the number of comparable labels. From the annotated schemata, we derive the probabilistic lexical relationships, which are collected in the Probabilistic Common Thesaurus. The method is applied within the MOMIS data integration system but can easily be generalized to other data integration systems.
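
The pipeline outlined in the abstract (label normalization, probabilistic annotation, derivation of probabilistic relationships) can be illustrated with a small sketch. This is not the authors' implementation or the MOMIS API. The abbreviation table, the toy sense inventory, every function name, and the rule that scores a synonym relationship as the product of the two annotation probabilities (which assumes the annotations are independent) are all assumptions made for illustration only.

```python
import re

# Hypothetical abbreviation table; a real system would rely on a lexical resource.
ABBREVIATIONS = {"qty": "quantity", "amt": "amount", "cust": "customer", "no": "number"}

# Hypothetical toy thesaurus: word -> {sense id: probability of that sense}.
TOY_SENSES = {
    "customer": {"client#1": 0.8, "patron#2": 0.2},
    "client": {"client#1": 0.7, "client_machine#3": 0.3},
    "quantity": {"amount#1": 1.0},
    "amount": {"amount#1": 0.6, "sum_of_money#2": 0.4},
}


def normalize_label(label):
    """Split a schema label into lowercase tokens and expand known abbreviations."""
    # Insert a space at camelCase boundaries, then split on separators.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", label)
    tokens = re.split(r"[\s_\-]+", spaced.strip())
    return [ABBREVIATIONS.get(t.lower(), t.lower()) for t in tokens if t]


def annotate(label):
    """Return {sense: probability} for the tokens of a label found in the toy thesaurus."""
    senses = {}
    for token in normalize_label(label):
        for sense, p in TOY_SENSES.get(token, {}).items():
            # Simplification: keep the highest probability seen for each sense.
            senses[sense] = max(senses.get(sense, 0.0), p)
    return senses


def synonym_probability(label_a, label_b):
    """Score a synonym relationship as the best product over shared senses.

    Assumption: the two annotations are independent, so probabilities multiply.
    """
    a, b = annotate(label_a), annotate(label_b)
    shared = set(a) & set(b)
    return max((a[s] * b[s] for s in shared), default=0.0)


if __name__ == "__main__":
    print(normalize_label("custQty"))
    print(synonym_probability("cust", "client"))
    print(synonym_probability("qty", "amt"))
```

Without the normalization step, labels such as `cust` or `qty` would not be found in the toy thesaurus and would yield no relationship at all, which is the effect the abstract attributes to non-dictionary words.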