We address the problem of unsupervised matching of schema information from a large number of data sources into the schema of a data warehouse. The matching process is the first step of a framework to integrate data feeds from third-party data providers into a structured-search engine's data warehouse. Our experiments show that traditional schema-based and instance-based schema matching methods fall short. We propose a new technique based on the search engine's clicklogs. Two schema elements are matched if the distributions of keyword queries that cause click-throughs on their instances are similar. We present experiments on large commercial datasets that show the new technique has much better accuracy than traditional techniques.
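
The matching criterion can be illustrated with a minimal sketch. The abstract does not say which similarity measure is used, so the Jensen-Shannon divergence below is an assumption chosen for illustration, and the function names, schema element names, and sample queries are all hypothetical. Each schema element is represented by the keyword queries that led to clicks on its instances; a source element is matched to the warehouse element whose query distribution is closest.

```python
from collections import Counter
from math import log2


def query_distribution(clicked_queries):
    """Turn a list of clicked queries into a probability distribution."""
    counts = Counter(clicked_queries)
    total = sum(counts.values())
    return {query: count / total for query, count in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint.

    Assumption: the abstract names no specific measure; this one is
    used here only because it is symmetric and bounded.
    """
    keys = set(p) | set(q)
    mix = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mix(dist):
        return sum(v * log2(v / mix[k]) for k, v in dist.items() if v > 0)

    return 0.5 * kl_to_mix(p) + 0.5 * kl_to_mix(q)


def best_match(source_queries, warehouse_queries):
    """Return the warehouse element whose query distribution is closest."""
    p = query_distribution(source_queries)
    scores = {
        element: js_divergence(p, query_distribution(queries))
        for element, queries in warehouse_queries.items()
    }
    return min(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical clicklog excerpts, keyed by warehouse schema element.
    warehouse = {
        "product_name": ["cheap laptop", "laptop deals"],
        "movie_title": ["new movies", "movie times"],
    }
    feed_element = ["cheap laptop", "laptop deals", "cheap laptop"]
    print(best_match(feed_element, warehouse))
```

A real system would also need a threshold below which no match is reported, and some way to handle elements with few or no clicks; neither is described in the abstract, so both are left out here.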