Many databases have become Web-accessible through form-based search interfaces (i.e., HTML forms) that let users specify complex and precise queries against the underlying databases. Such a Web search interface can be viewed as containing an interface schema with multiple attributes and rich semantic/meta-information; however, the schema is not formally defined in HTML. Many Web applications, such as Web database integration and deep Web crawling, require these schemas to be constructed. In this paper, we first propose a schema model for representing complex search interfaces, and then present a layout-expression-based approach to automatically extract the logical attributes from search interfaces. We also cast the identification of different types of semantic information as a classification problem, and design several Bayesian classifiers to help derive semantic information from the extracted attributes. A system, WISE-iExtractor, has been implemented to automatically construct the schema from any Web search interface. Our experimental results on real search interfaces indicate that this system is highly effective.
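
To make the two ideas in the abstract concrete (an interface schema made of logical attributes, and a Bayesian classifier that assigns semantic information to an extracted attribute), the following is a minimal Python sketch. It is not the WISE-iExtractor implementation: the class names (FormElement, LogicalAttribute, InterfaceSchema, NaiveBayes), the features_of helper, the "currency"/"date" value types, the site name, and the four training rows are all hypothetical and chosen only for illustration. The classifier is a plain categorical naive Bayes with add-one smoothing, used here as a stand-in for the Bayesian classifiers the paper mentions.

```python
import math
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FormElement:
    """One HTML control; the fields are illustrative assumptions."""
    kind: str                                   # e.g. "text", "select"
    name: str
    values: List[str] = field(default_factory=list)


@dataclass
class LogicalAttribute:
    """A label plus the controls that together express one query condition."""
    label: str
    elements: List[FormElement]
    value_type: Optional[str] = None            # filled in by the classifier


@dataclass
class InterfaceSchema:
    site: str
    attributes: List[LogicalAttribute]


class NaiveBayes:
    """Categorical naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, samples):
        self.class_counts = Counter(label for _, label in samples)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for features, label in samples:
            self.feature_counts[label].update(features)
            self.vocab.update(features)
        return self

    def predict(self, features):
        total = sum(self.class_counts.values())

        def score(label):
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            log_p = math.log(self.class_counts[label] / total)
            for f in features:
                log_p += math.log((counts[f] + 1) / denom)
            return log_p

        return max(self.class_counts, key=score)


def features_of(attr):
    """Hypothetical features: label words plus the kinds of the controls."""
    return attr.label.lower().split() + [f"kind={e.kind}" for e in attr.elements]


# Made-up training rows, for illustration only.
training = [
    (["price", "kind=text"], "currency"),
    (["max", "price", "kind=text"], "currency"),
    (["publication", "date", "kind=select"], "date"),
    (["year", "kind=select"], "date"),
]
model = NaiveBayes().fit(training)

schema = InterfaceSchema(
    site="books.example",
    attributes=[LogicalAttribute("Price", [FormElement("text", "p")])],
)
for attr in schema.attributes:
    attr.value_type = model.predict(features_of(attr))
```

With these toy training rows, the single "Price" attribute is labeled with the hypothetical "currency" type. A real system would derive its features from the extracted layout and would need labeled interfaces to train on; neither is specified in the abstract.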