An increasing number of data sources are now available on the Web, but their contents are often accessible only through query interfaces. For a given domain of interest, there often exist many such sources with varied coverage or querying capabilities. As an important step toward integrating these sources, we consider the integration of their query interfaces. More specifically, we focus on the crucial step of that integration: accurately matching the interfaces. While the integration of query interfaces has received more attention recently, current approaches are not sufficiently general: (a) they all model interfaces with flat schemas; (b) most of them consider only 1:1 mappings of fields across interfaces; (c) they all perform the integration in a black-box fashion, so the whole process has to be restarted from scratch if anything goes wrong; and (d) they often require laborious parameter tuning. In this paper, we propose an interactive, clustering-based approach to matching query interfaces. The hierarchical nature of interfaces is captured with ordered trees. Varied types of complex mappings of fields are examined, and several approaches are proposed to identify these mappings effectively. We put the human integrator back in the loop and propose several novel approaches to the interactive learning of parameters and the resolution of uncertain mappings. Extensive experiments show that our approach is highly effective.
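
As a rough illustration of two ideas named in the abstract, the sketch below models a query interface as an ordered tree and groups field labels from several interfaces by similarity. It is not the paper's algorithm: the `Node` class, the token-based Jaccard measure, the single-pass greedy grouping, the 0.5 threshold, and the two airline interfaces are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Ordered-tree node (hypothetical model): internal nodes are groups, leaves are fields."""
    label: str
    children: List["Node"] = field(default_factory=list)

    def leaves(self) -> List["Node"]:
        # Depth-first, left-to-right traversal keeps the on-page field order.
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity between two labels (an illustrative choice)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def cluster_fields(interfaces: List[Node], threshold: float = 0.5) -> List[List[str]]:
    """Single-pass greedy grouping: a field joins the first cluster that
    contains a label at least `threshold` similar, else starts a new cluster."""
    clusters: List[List[str]] = []
    for tree in interfaces:
        for leaf in tree.leaves():
            for cluster in clusters:
                if any(jaccard(leaf.label, other) >= threshold for other in cluster):
                    cluster.append(leaf.label)
                    break
            else:
                clusters.append([leaf.label])
    return clusters


# Two made-up airline search interfaces; the first one nests its fields in groups.
airline_a = Node("search", [
    Node("route", [Node("departure city"), Node("arrival city")]),
    Node("passengers", [Node("adults"), Node("children")]),
])
airline_b = Node("search", [
    Node("departure city"),
    Node("destination city"),
    Node("number of passengers"),
])

clusters = cluster_fields([airline_a, airline_b])
for cluster in clusters:
    print(cluster)
```

On this toy input, only the two "departure city" fields end up in the same cluster. Label similarity alone misses the synonym pair "arrival city" / "destination city", and it cannot express that "adults" and "children" together correspond to "number of passengers". Cases like these are the kind of complex mapping and uncertain match the abstract has in mind when it argues for going beyond 1:1 mappings and for keeping a human integrator in the loop.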