The adoption of standards for exchanging information across the Web presents both new opportunities and important challenges for data integration and aggregation. Although Web Services simplify the discovery and access of information sources, the problem of semantic heterogeneity remains: how to find semantic correspondences across the data being integrated. In this paper, we explore these issues in the context of Web Services and propose OATS, a novel schema matching algorithm that is specifically suited to Web Service data aggregation. We show that probing Web Services with a small set of related queries yields semantically correlated data instances, which greatly simplifies the matching process, and we demonstrate that an ensemble of string distance metrics matches data instances better than individual metrics do. We also show that the choice of probe queries has a dramatic effect on matching accuracy. Motivated by this observation, we describe and evaluate a machine learning approach to selecting probes that maximises accuracy while minimising cost.
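The idea of matching probed data instances with an ensemble of string metrics can be illustrated with a minimal sketch. This is not the OATS algorithm itself: the three metrics, the unweighted average, the field names, and the sample values are all assumptions chosen for illustration, since the abstract does not say which metrics are combined or how. The sketch assumes both services were probed with the same queries, so the i-th value of every field comes from the i-th probe.

```python
from difflib import SequenceMatcher


def levenshtein_similarity(a: str, b: str) -> float:
    """Edit distance rescaled to [0, 1]; 1.0 means identical strings."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b))


def token_jaccard(a: str, b: str) -> float:
    """Overlap of lowercased whitespace-separated tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def sequence_ratio(a: str, b: str) -> float:
    """Similarity ratio from the standard library's SequenceMatcher."""
    return SequenceMatcher(None, a, b).ratio()


# Hypothetical ensemble: an unweighted average of three similarity scores.
METRICS = (levenshtein_similarity, token_jaccard, sequence_ratio)


def ensemble_similarity(a: str, b: str) -> float:
    return sum(metric(a, b) for metric in METRICS) / len(METRICS)


def field_score(source_values, target_values) -> float:
    """Mean ensemble similarity over instances aligned by probe index."""
    pairs = list(zip(source_values, target_values))
    if not pairs:
        return 0.0
    return sum(ensemble_similarity(x, y) for x, y in pairs) / len(pairs)


def match_fields(source, target):
    """Map each source field to the target field with the best score."""
    return {
        s_name: max(target, key=lambda t_name: field_score(s_vals, target[t_name]))
        for s_name, s_vals in source.items()
    }


# Made-up responses from two services, each probed with the same two queries.
service_a = {"city": ["London", "Paris"], "conditions": ["light rain", "sunny"]}
service_b = {"location": ["London, UK", "Paris, FR"], "sky": ["rain", "mostly sunny"]}

print(match_fields(service_a, service_b))
```

Because the instances are correlated through the shared probes, each field is compared value by value rather than as an unordered bag of strings. Averaging several metrics means no single metric's blind spot decides the match: here the token-based score is zero for "London" versus "London, UK", yet the character-based scores still favour the correct field.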