A study on using two-phase conditional random fields for query interface segmentation

  • Authors:
  • Yongquan Dong;Xiangjun Zhao;Gongjie Zhang

  • Affiliations:
  • School of Computer Science and Technology, Xuzhou Normal University, Xuzhou, China;School of Computer Science and Technology, Xuzhou Normal University, Xuzhou, China;School of Computer Science and Technology, Xuzhou Normal University, Xuzhou, China

  • Venue:
  • WISM'11 Proceedings of the 2011 international conference on Web information systems and mining - Volume Part II
  • Year:
  • 2011

Abstract

Recently, the Web has been rapidly "deepened" by many searchable online databases, whose data are hidden behind query interfaces. Automatic processing of query interfaces is required to access the invisible contents of the deep Web. This entails automatic segmentation, i.e., the task of grouping related components of an interface together. Segmentation is divided into two steps: interface component labeling and interface component grouping. In this paper we present a new approach that performs query interface segmentation using two-phase Conditional Random Fields (CRFs). In the first phase, one CRF model tags each component with a semantic label (attribute-name, operator, operand, or other); in the second phase, another CRF model creates groups of related components. Experiments show that our approach yields high accuracy.
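
The two-phase data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: both trained CRF models are replaced here by hand-written rule-based stand-ins, and the `Component` type, the `OPERATOR_WORDS` list, and the B/I/O group encoding are all assumptions introduced for illustration. Only the four semantic labels come from the abstract.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Component:
    """One element of a query interface (hypothetical representation)."""
    kind: str  # e.g. "text", "select", "textbox", "button"
    text: str  # visible text or option values


# Hypothetical cue words suggesting that a select list holds operators.
OPERATOR_WORDS = ("contains", "starts with", "exact", "before", "after")


def label_components(components: List[Component]) -> List[str]:
    """Phase 1 stand-in: assign one semantic label per component.

    A trained CRF would pick the jointly best label sequence; these
    rules only mimic the shape of its output.
    """
    labels = []
    for c in components:
        if c.kind == "text":
            labels.append("attribute-name")
        elif c.kind == "select" and any(w in c.text.lower() for w in OPERATOR_WORDS):
            labels.append("operator")
        elif c.kind in ("textbox", "select"):
            labels.append("operand")
        else:
            labels.append("other")
    return labels


def tag_groups(labels: List[str]) -> List[str]:
    """Phase 2 stand-in: tag each component as B (begins a group),
    I (inside the current group), or O (outside any group).

    The B/I/O encoding is an assumption, not taken from the paper.
    """
    tags = []
    in_group = False
    for label in labels:
        if label == "other":
            tags.append("O")
            in_group = False
        elif label == "attribute-name" or not in_group:
            tags.append("B")
            in_group = True
        else:
            tags.append("I")
    return tags


def decode_groups(tags: List[str]) -> List[List[int]]:
    """Turn a B/I/O tag sequence into lists of component indices."""
    groups: List[List[int]] = []
    current: Optional[List[int]] = None
    for i, tag in enumerate(tags):
        if tag == "B":
            current = [i]
            groups.append(current)
        elif tag == "I" and current is not None:
            current.append(i)
        else:
            current = None  # "O", or a stray "I" with no open group
    return groups


def segment(components: List[Component]) -> Tuple[List[str], List[List[int]]]:
    """Run both phases: labeling, then grouping."""
    labels = label_components(components)
    return labels, decode_groups(tag_groups(labels))


def demo_form() -> List[Component]:
    """A made-up book-search form used only as sample input."""
    return [
        Component("text", "Title"),
        Component("select", "contains|starts with|exact"),
        Component("textbox", ""),
        Component("text", "Author"),
        Component("textbox", ""),
        Component("button", "Search"),
    ]
```

Calling `segment(demo_form())` labels the six sample components and returns two groups, one for the title field with its operator and operand and one for the author field with its operand; the search button is left ungrouped. In a real system each stand-in would be replaced by a trained sequence model, with the second model able to use the first model's labels as input features.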