Closing the loop in webpage understanding

  • Authors:
  • Chunyu Yang; Yong Cao; Zaiqing Nie; Jie Zhou; Ji-Rong Wen

  • Affiliations:
  • Tsinghua University, Beijing, China; Microsoft Research Asia, Beijing, China; Microsoft Research Asia, Beijing, China; Tsinghua University, Beijing, China; Microsoft Research Asia, Beijing, China

  • Venue:
  • Proceedings of the 17th ACM Conference on Information and Knowledge Management
  • Year:
  • 2008

Abstract

Little work has been done toward an integrated statistical model for understanding webpage structures and processing the natural language sentences within HTML elements. This paper proposes a novel framework, WebNLP, which enables bidirectional integration of page structure understanding and text understanding in an iterative manner. Experiments show that the WebNLP framework achieves significantly better performance.
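
The iterative, bidirectional integration described in the abstract can be illustrated with a minimal sketch. This is not the authors' statistical model: the two labelers below are toy rule-based stand-ins, and every name in it (`label_structure`, `label_text`, `close_the_loop`, the `business_name` and `ORG` labels, and the block format) is a hypothetical choice made for illustration. The sketch only shows the control flow: structure decisions guide text labeling, text decisions feed back into structure labeling, and the loop repeats until neither output changes.

```python
from typing import Dict, List, Tuple

Block = Dict[str, str]  # hypothetical block format: {"tag": ..., "text": ...}


def label_structure(blocks: List[Block], text_labels: List[List[str]]) -> List[str]:
    """Toy stand-in for page structure understanding."""
    labels = []
    for block, spans in zip(blocks, text_labels):
        # Structural cue: a heading tag. Textual cue (the feedback path):
        # the text labeler found an organization in this block.
        if block["tag"] == "h1" or "ORG" in spans:
            labels.append("business_name")
        else:
            labels.append("other")
    return labels


def label_text(blocks: List[Block], structure_labels: List[str]) -> List[List[str]]:
    """Toy stand-in for text understanding."""
    results = []
    for block, structure_label in zip(blocks, structure_labels):
        tokens = block["text"].split()
        # Structural cue (the top-down path): the block label.
        # Textual cue: the sentence ends with the suffix "Inc.".
        if structure_label == "business_name" or tokens[-1:] == ["Inc."]:
            results.append(["ORG"])
        else:
            results.append([])
    return results


def close_the_loop(
    blocks: List[Block], max_iters: int = 10
) -> Tuple[List[str], List[List[str]], int]:
    """Alternate the two labelers until their outputs stop changing."""
    structure = ["other"] * len(blocks)
    text: List[List[str]] = [[] for _ in blocks]
    iteration = 0
    for iteration in range(1, max_iters + 1):
        new_structure = label_structure(blocks, text)
        new_text = label_text(blocks, new_structure)
        if new_structure == structure and new_text == text:
            break  # fixed point reached: neither side changed its decision
        structure, text = new_structure, new_text
    return structure, text, iteration


page = [
    {"tag": "h1", "text": "Joe's Diner"},
    {"tag": "p", "text": "Operated by Acme Foods Inc."},
    {"tag": "p", "text": "Open daily"},
]
structure, text, iterations = close_the_loop(page)
```

In this toy page, the second block is relabeled only on the second pass, after the text labeler's decision has been fed back to the structure labeler. A purely top-down pipeline that ran each labeler once would leave it as `other`.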