Schema Extraction of Deep Web Query Interface

  • Authors:
  • Ying Wang;Tao Peng;Wanli Zuo;Huifeng Zhu

  • Venue:
  • WISM '09 Proceedings of the 2009 International Conference on Web Information Systems and Mining
  • Year:
  • 2009

Abstract

In integrating web databases, the very first challenge is to understand what a query interface says, that is, what query capabilities a source supports. From a user's point of view, the internal structure of a web page is of no concern; in most cases, semantic blocks are identified through visual elements. This paper therefore presents a novel schema extraction algorithm, based on the visual features of pages, that captures and analyzes the attributes and query controls of a page. First, the query interface region is identified by heuristic rules. Then, the interface region is parsed by a page analysis algorithm. Finally, the parsed region is processed using the visual features of the page to obtain the logical attributes, which are represented as a linked list. Experimental results show that this method dramatically improves the extraction precision of query schemas.
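The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the heuristic rules or the visual features used, so everything below is an assumption made for illustration. The `Element` record with rendered `x`/`y` coordinates, the rule that a region is a query interface when it has at least one control and a submit button, and the rule that a control takes the closest label to its left on the same row or anywhere above it are all hypothetical stand-ins.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Element:
    """A rendered page element (hypothetical model, not from the paper)."""
    kind: str   # "label", "control", or "submit"
    text: str   # label text, or the control's name
    x: float    # left edge in rendered page coordinates
    y: float    # top edge in rendered page coordinates


@dataclass
class AttributeNode:
    """One logical attribute, stored as a node of a singly linked list."""
    label: str
    control: str
    next: Optional["AttributeNode"] = None


def looks_like_query_interface(elements: List[Element]) -> bool:
    # Assumed heuristic: at least one input control plus a submit button.
    has_control = any(e.kind == "control" for e in elements)
    has_submit = any(e.kind == "submit" for e in elements)
    return has_control and has_submit


def nearest_label(control: Element, labels: List[Element],
                  row_tol: float = 5.0) -> Optional[Element]:
    # Assumed visual rule: a label qualifies if it is to the left on the
    # same row, or anywhere above the control; the closest one wins.
    best, best_dist = None, math.inf
    for lab in labels:
        same_row_left = abs(lab.y - control.y) <= row_tol and lab.x < control.x
        above = lab.y < control.y - row_tol
        if not (same_row_left or above):
            continue
        dist = math.hypot(lab.x - control.x, lab.y - control.y)
        if dist < best_dist:
            best, best_dist = lab, dist
    return best


def extract_schema(elements: List[Element],
                   row_tol: float = 5.0) -> Optional[AttributeNode]:
    # Step 1: decide whether the region is a query interface.
    if not looks_like_query_interface(elements):
        return None
    # Step 2: split the parsed region into labels and controls.
    labels = [e for e in elements if e.kind == "label"]
    controls = sorted((e for e in elements if e.kind == "control"),
                      key=lambda e: (e.y, e.x))  # reading order
    # Step 3: pair each control with a label and chain the results.
    head = tail = None
    for ctrl in controls:
        lab = nearest_label(ctrl, labels, row_tol)
        node = AttributeNode(lab.text if lab else "", ctrl.text)
        if head is None:
            head = tail = node
        else:
            tail.next = node
            tail = node
    return head


def to_pairs(head: Optional[AttributeNode]) -> List[Tuple[str, str]]:
    # Walk the linked list and collect (label, control) pairs.
    pairs = []
    while head is not None:
        pairs.append((head.label, head.control))
        head = head.next
    return pairs


form = [
    Element("label", "Title", 10, 10), Element("control", "title_box", 80, 10),
    Element("label", "Author", 10, 40), Element("control", "author_box", 80, 40),
    Element("submit", "Search", 80, 70),
]
print(to_pairs(extract_schema(form)))
```

In this sketch, distance-based pairing stands in for whatever visual features the paper actually uses; a real implementation would need coordinates from a rendering engine rather than hand-written values.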