Efficient Web form entry on PDAs
Proceedings of the 10th international conference on World Wide Web
Efficient Web form entry on PDAs
Proceedings of the 10th international conference on World Wide Web
Text Categorization Based on Regularized Linear Classification Methods
Information Retrieval
Proceedings of the 27th International Conference on Very Large Data Bases
Annals of Mathematics and Artificial Intelligence
A maximum entropy approach to named entity recognition
A maximum entropy approach to named entity recognition
A maximum-entropy-inspired parser
NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
An interactive clustering-based approach to integrating source query interfaces on the deep Web
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Understanding Web query interfaces: best-effort parsing with hidden syntax
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Structured databases on the web: observations and implications
ACM SIGMOD Record
Data & Knowledge Engineering
MetaQuerier: querying structured web sources on-the-fly
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Merging Source Query Interfaces onWeb Databases
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Query Selection Techniques for Efficient Crawling of Structured Web Sources
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Combining classifiers to identify online databases
Proceedings of the 16th international conference on World Wide Web
An adaptive crawler for locating hidden-Web entry points
Proceedings of the 16th international conference on World Wide Web
Learning to extract form labels
Proceedings of the VLDB Endowment
Siphon++: a hidden-webcrawler for keyword-based interfaces
Proceedings of the 17th ACM conference on Information and knowledge management
ODE: Ontology-assisted data extraction
ACM Transactions on Database Systems (TODS)
An empirical study on using hidden markov model for search interface segmentation
Proceedings of the 18th ACM conference on Information and knowledge management
A hierarchical approach to model web query interfaces for web source integration
Proceedings of the VLDB Endowment
Understanding deep web search interfaces: a survey
ACM SIGMOD Record
Real understanding of real estate forms
Proceedings of the International Conference on Web Intelligence, Mining and Semantics
Automatic hierarchical classification of structured deep web databases
WISE'06 Proceedings of the 7th international conference on Web Information Systems
Constructing interface schemas for search interfaces of web databases
WISE'05 Proceedings of the 6th international conference on Web Information Systems Engineering
Holistic schema matching for web query interfaces
EDBT'06 Proceedings of the 10th international conference on Advances in Database Technology
OPAL: automated form understanding for the deep web
Proceedings of the 21st international conference on World Wide Web
OPAL: a passe-partout for web forms
Proceedings of the 21st international conference companion on World Wide Web
Optimal algorithms for crawling a hidden database in the web
Proceedings of the VLDB Endowment
Deep Web Query Interface Understanding and Integration
Deep Web Query Interface Understanding and Integration
The ontological key: automatically understanding and integrating forms to access the deep Web
The VLDB Journal — The International Journal on Very Large Data Bases
Hi-index | 0.00 |
Users submit queries to an online database via its query interface. Query interface parsing, which is important for many applications, understands the query capabilities of a query interface. Since most query interfaces are organized hierarchically, we present a novel query interface parsing method, StatParser (Statistical Parser), to automatically extract the hierarchical query capabilities of query interfaces. StatParser automatically learns from a set of parsed query interfaces and parses new query interfaces. StatParser starts from a small grammar and enhances the grammar with a set of probabilities learned from parsed query interfaces under the maximum-entropy principle. Given a new query interface, the probability-enhanced grammar identifies the parse tree with the largest global probability to be the query capabilities of the query interface. Experimental results show that StatParser very accurately extracts the query capabilities and can effectively overcome the problems of existing query interface parsers.