Probe, count, and classify: categorizing hidden web databases
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Data extraction and label assignment for web databases
WWW '03 Proceedings of the 12th international conference on World Wide Web
Statistical schema matching across web query interfaces
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Concept Hierarchy Based Text Database Categorization in a Metasearch Engine Environment
WISE '00 Proceedings of the First International Conference on Web Information Systems Engineering (WISE'00)-Volume 1 - Volume 1
An interactive clustering-based approach to integrating source query interfaces on the deep Web
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
WebIQ: Learning from the Web to Match Deep-Web Query Interfaces
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Wise-integrator: an automatic integrator of web search interfaces for E-commerce
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Instance-based schema matching for web databases by domain-specific query probing
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Leveraging Webpage Classification for Data Object Recognition
WI '07 Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence
Learning semantic definitions of online information sources
Journal of Artificial Intelligence Research
Automatic discovery of Web Query Interfaces using machine learning techniques
Journal of Intelligent Information Systems
Automatic classification of web databases using domain-dictionaries
MLDM'13 Proceedings of the 9th international conference on Machine Learning and Data Mining in Pattern Recognition
Hi-index | 0.00 |
Web databases provide different types of query interfaces to access the data records stored in the backend databases. While most existing works exploit a complex query interface with multiple input fields to perform schema identification of the Web databases, little attention has been paid on how to identify the schema of web databases by simple query interface (SQI), which has only one single query text input field. This paper proposes a new method of instance-based query probing to identify WDBs' interface and result schema for SQI. The interface schema identification problem is defined as generating the full-condition query of SQI and a novel query probing strategy is proposed. The result schema is also identified based on the result webpages of SQI's full-condition query, and an extended identification of the non-query attributes is proposed to improve the attribute recall rate. Experimental results on web databases of online shopping for book, movie and mobile phone show that our method is effective and efficient.