A comparative study on classifying the functions of web page blocks

  • Authors:
  • Xiangye Xiao; Qiong Luo; Xing Xie; Wei-Ying Ma

  • Affiliations:
  • Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong; Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong; Microsoft Research Asia, Beijing, P.R. China; Microsoft Research Asia, Beijing, P.R. China

  • Venue:
  • CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
  • Year:
  • 2006

Abstract

In this paper, we study the problem of learning block classification models to estimate the functions of web page blocks. We distinguish between general models, which are learned across multiple sites, and site-specific models, which are learned within individual sites. We further consider several factors that affect the learning process and model effectiveness: the layout features, the content features, the classifiers, and the term selection methods. We have empirically evaluated the performance of the models as these factors are varied. Our main result is that layout features outperform content features for learning both general and site-specific models.
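
The distinction between general and site-specific models can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three layout features (relative x, relative y, relative width), the block labels, the example site names, and the nearest-centroid rule, which stands in for whichever classifiers the paper actually compares. The only point it makes is where the training data comes from in each case.

```python
from collections import defaultdict
from math import dist

# Hypothetical labelled blocks: (site, layout feature vector, function label).
# Features and labels are illustrative assumptions, not the paper's feature set.
BLOCKS = [
    ("a.example", (0.0, 0.0, 1.0), "navigation"),
    ("a.example", (0.2, 0.3, 0.6), "main_content"),
    ("b.example", (0.0, 0.1, 1.0), "navigation"),
    ("b.example", (0.3, 0.4, 0.5), "main_content"),
]


def train_centroids(samples):
    """Nearest-centroid stand-in for a classifier: average the vectors per label."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*vectors))
        for label, vectors in grouped.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


def train_general(blocks):
    """General model: one model learned from blocks pooled across all sites."""
    return train_centroids([(features, label) for _, features, label in blocks])


def train_site_specific(blocks):
    """Site-specific models: one model per site, learned only from that site."""
    by_site = defaultdict(list)
    for site, features, label in blocks:
        by_site[site].append((features, label))
    return {site: train_centroids(samples) for site, samples in by_site.items()}


general_model = train_general(BLOCKS)
site_models = train_site_specific(BLOCKS)
```

A block from a known site could be classified with `predict(site_models[site], features)`, while a block from an unseen site would have to fall back on `predict(general_model, features)`.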