Web database schema identification through simple query interface

  • Authors:
  • Ling Lin; Lizhu Zhou

  • Affiliations:
  • Department of Computer Science and Technology, Tsinghua University, Beijing, China; Department of Computer Science and Technology, Tsinghua University, Beijing, China

  • Venue:
  • RED'09 Proceedings of the 2nd international conference on Resource discovery
  • Year:
  • 2009

Abstract

Web databases provide different types of query interfaces for accessing the data records stored in their backend databases. While most existing work exploits complex query interfaces with multiple input fields to identify the schema of a Web database, little attention has been paid to identifying the schema through a simple query interface (SQI), which has only a single text input field. This paper proposes a new instance-based query probing method to identify both the interface schema and the result schema of a Web database (WDB) accessed through an SQI. The interface schema identification problem is defined as generating the full-condition query of the SQI, and a novel query probing strategy is proposed. The result schema is then identified from the result pages of the SQI's full-condition query, and an extended identification of non-query attributes is proposed to improve the attribute recall rate. Experimental results on online shopping Web databases for books, movies, and mobile phones show that the method is effective and efficient.