Web search engines work well for finding crawlable pages, but not for finding datasets hidden behind Web search forms. We describe a novel technique for detecting search forms that could form the basis of a next-generation distributed search application. We use automatic feature generation to describe candidate forms and C4.5 decision trees to classify them. On two testbeds, we achieve an accuracy of more than 85% and a precision of more than 87%. One of our decision trees is effective on both testbeds, suggesting that it is a useful general-purpose tree.
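As a rough illustration of the pipeline described here (describe each candidate form by features, then let a decision tree separate search forms from other forms), the following minimal sketch extracts a few features from a form's HTML and scores them with gain ratio, the split criterion C4.5 uses to pick the attribute at each node. The three features, the helper names, and the four toy forms are assumptions made up for this example; they are not the automatically generated features or the testbeds used in the paper, and the sketch stops at choosing a root split rather than building a full tree.

```python
import math
from collections import Counter
from html.parser import HTMLParser


class FormFeatures(HTMLParser):
    """Collects a few simple, hypothetical features from one HTML form."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()
        self.mentions_search = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input":
            # An <input> without a type attribute defaults to a text box.
            self.counts[(attrs.get("type") or "text").lower()] += 1
        elif tag in ("select", "textarea"):
            self.counts[tag] += 1
        # Look for the word "search" in any attribute value of any tag.
        if any("search" in (v or "").lower() for v in attrs.values()):
            self.mentions_search = True


def extract_features(form_html):
    """Turn raw form HTML into a dict of boolean features (illustrative only)."""
    parser = FormFeatures()
    parser.feed(form_html)
    return {
        "has_password": parser.counts["password"] > 0,
        "single_text_box": parser.counts["text"] == 1,
        "mentions_search": parser.mentions_search,
    }


def entropy(values):
    """Shannon entropy, in bits, of a non-empty list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, labels, feature):
    """Information gain from splitting on `feature`, divided by split information."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = entropy([row[feature] for row in rows])
    if split_info == 0:
        return 0.0  # the feature takes a single value, so it cannot split the data
    return (entropy(labels) - remainder) / split_info


# Toy labeled forms, invented for this sketch: True means "search form".
FORMS = [
    ('<form action="/search"><input type="text" name="q">'
     '<input type="submit" value="Search"></form>', True),
    ('<form action="/login"><input type="text" name="user">'
     '<input type="password" name="pw"><input type="submit"></form>', False),
    ('<form action="/find"><input name="keywords"><select name="cat"></select>'
     '<input type="submit" value="Search"></form>', True),
    ('<form action="/subscribe"><input type="text" name="email">'
     '<input type="submit" value="Sign up"></form>', False),
]

rows = [extract_features(html) for html, _ in FORMS]
labels = [label for _, label in FORMS]
scores = {name: gain_ratio(rows, labels, name) for name in rows[0]}
best_feature = max(scores, key=scores.get)
print(best_feature, scores)
```

A full C4.5-style learner would apply the same scoring recursively to each subset produced by the chosen split, and would also handle continuous attributes and pruning, which this sketch omits.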