Understanding query interfaces by statistical parsing

  • Authors:
  • Weifeng Su; Hejun Wu; Yafei Li; Jing Zhao; Frederick H. Lochovsky; Hongmin Cai; Tianqiang Huang

  • Affiliations:
  • BNU-HKBU United International College and Shenzhen Key Laboratory of Intelligent Media and Speech, PKU-HKUST Shenzhen Hong Kong Institution; Sun Yat-Sen University; BNU-HKBU United International College; BNU-HKBU United International College; The Hong Kong University of Science and Technology; South China University of Technology; Fujian Normal University

  • Venue:
  • ACM Transactions on the Web (TWEB)
  • Year:
  • 2013

Abstract

Users submit queries to an online database via its query interface. Query interface parsing, which is important for many applications, aims to understand the query capabilities of a query interface. Since most query interfaces are organized hierarchically, we present a novel query interface parsing method, StatParser (Statistical Parser), that automatically extracts the hierarchical query capabilities of query interfaces. StatParser learns from a set of already-parsed query interfaces and then parses new ones. It starts from a small grammar and augments that grammar with a set of probabilities learned from the parsed query interfaces under the maximum-entropy principle. Given a new query interface, the probability-enhanced grammar selects the parse tree with the largest global probability as the query capabilities of that interface. Experimental results show that StatParser extracts the query capabilities with high accuracy and can effectively overcome the problems of existing query interface parsers.
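
The idea of choosing the parse tree with the largest global probability can be illustrated with a minimal sketch. This is not StatParser itself: it swaps in a plain probabilistic context-free grammar with fixed, made-up rule probabilities and a CKY-style Viterbi search, whereas the abstract describes probabilities learned under the maximum-entropy principle. The grammar, the symbol names (`Form`, `Field`, `Range`, `Label`, `Input`), the token types, and the function `viterbi_parse` are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy grammar in Chomsky normal form. All probabilities are
# invented for illustration; they are not taken from the paper.
# Terminals are interface element types read left to right.
LEXICAL = {
    "Label": {"text": 1.0},
    "Input": {"textbox": 0.7, "select": 0.3},
}
BINARY = {
    ("Form", "Field", "Field"): 0.3,
    ("Form", "Field", "Form"): 0.5,
    ("Form", "Form", "Field"): 0.2,
    ("Field", "Label", "Input"): 0.6,
    ("Field", "Label", "Range"): 0.4,
    ("Range", "Input", "Input"): 1.0,
}


def viterbi_parse(tokens, start="Form"):
    """Return (probability, tree) of the most probable parse, or None."""
    n = len(tokens)
    if n == 0:
        return None
    # best[(i, j)] maps a nonterminal to its best (probability, tree)
    # over the span tokens[i:j].
    best = {}
    for i, tok in enumerate(tokens):
        cell = {}
        for nt, emissions in LEXICAL.items():
            if tok in emissions:
                cell[nt] = (emissions[tok], (nt, tok))
        best[(i, i + 1)] = cell
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            cell = {}
            for k in range(i + 1, j):  # every split point of the span
                left_cell, right_cell = best[(i, k)], best[(k, j)]
                for (parent, left, right), p_rule in BINARY.items():
                    if left in left_cell and right in right_cell:
                        p_left, t_left = left_cell[left]
                        p_right, t_right = right_cell[right]
                        p = p_rule * p_left * p_right
                        # Keep only the highest-probability derivation.
                        if p > cell.get(parent, (0.0, None))[0]:
                            cell[parent] = (p, (parent, t_left, t_right))
            best[(i, j)] = cell
    return best[(0, n)].get(start)


# Three label/input pairs: the grammar admits a left-branching and a
# right-branching grouping, and the search keeps the more probable one.
tokens = ["text", "textbox", "text", "select", "text", "textbox"]
result = viterbi_parse(tokens)
if result is not None:
    probability, tree = result
    print(probability, tree)
```

Because each table cell keeps only the best derivation per nonterminal, the tree returned for the whole token sequence maximizes the product of rule probabilities over the entire tree, rather than making locally greedy grouping decisions.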