Holistic schema matching for web query interfaces

Authors:
Weifeng Su;Jiying Wang;Frederick Lochovsky
Affiliations:
Hong Kong University of Science & Technology, Hong Kong;City University, Hong Kong;Hong Kong University of Science & Technology, Hong Kong
Venue:
EDBT'06 Proceedings of the 10th international conference on Advances in Database Technology
Year:
2006

Abstract

Web databases, which dynamically provide information in response to user queries, make up a significant part of today’s Web. To help users submit queries to different Web databases, the query interface matching problem needs to be addressed. To solve this problem, we propose a new complex schema matching approach, Holistic Schema Matching (HSM). By examining the query interfaces of real Web databases, we observe that attribute matchings can be discovered from attribute-occurrence patterns. For example, in the Books domain, First Name often appears together with Last Name, while it is rarely co-present with Author. We therefore design a count-based greedy algorithm to identify which attributes are more likely to be matched in the query interfaces. In particular, HSM can identify both simple matchings, i.e., 1:1 matchings, and complex matchings, i.e., 1:n or m:n matchings, between attributes. Our experiments show that HSM can discover both simple and complex matchings accurately and efficiently on real data sets.
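The occurrence-pattern idea in the abstract can be illustrated with a minimal Python sketch. It is not the HSM algorithm itself: the abstract does not give the scoring formulas, so `grouping_score`, `matching_score`, the `threshold` parameter, and the toy `BOOKS` interfaces are all assumptions chosen for illustration. The sketch only covers 1:1 matchings; assembling grouped attributes into 1:n or m:n matchings is left out.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each list is the attribute set of one query interface.
BOOKS = [
    ["First Name", "Last Name", "Title"],
    ["First Name", "Last Name", "Title"],
    ["Author", "Title"],
    ["Author", "Title"],
    ["Author", "Title", "ISBN"],
]


def count_occurrences(interfaces):
    """Count how many interfaces contain each attribute and each attribute pair."""
    single, pair = Counter(), Counter()
    for attrs in interfaces:
        unique = sorted(set(attrs))
        single.update(unique)
        pair.update(combinations(unique, 2))  # keys are alphabetically ordered pairs
    return single, pair


def grouping_score(p, q, single, pair):
    """Assumed measure: high when p and q usually appear together."""
    c_pq = pair[tuple(sorted((p, q)))]
    return c_pq / min(single[p], single[q])


def matching_score(p, q, single, pair):
    """Assumed measure: high when p and q are both frequent but rarely co-present."""
    c_p, c_q = single[p], single[q]
    c_pq = pair[tuple(sorted((p, q)))]
    return (c_p - c_pq) * (c_q - c_pq) / (c_p + c_q)


def greedy_matches(interfaces, threshold=1.0):
    """Greedily pick the highest-scoring unmatched pairs (1:1 matchings only)."""
    single, pair = count_occurrences(interfaces)
    candidates = [
        (matching_score(p, q, single, pair), p, q)
        for p, q in combinations(sorted(single), 2)
    ]
    # Highest score first; ties are broken alphabetically for determinism.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    matched, result = set(), []
    for score, p, q in candidates:
        if score < threshold:
            break
        if p in matched or q in matched:
            continue
        result.append((p, q))
        matched.update((p, q))
    return result


if __name__ == "__main__":
    single, pair = count_occurrences(BOOKS)
    print(grouping_score("First Name", "Last Name", single, pair))
    print(greedy_matches(BOOKS))
```

On this toy data, First Name and Last Name always co-occur and so receive a high grouping score, while Author never appears with either of them and so receives a high matching score against both. A fuller treatment would first merge First Name and Last Name into a group and then match that group to Author as a 1:n matching.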