A SVM method for web page categorization based on weight adjustment and boosting mechanism

  • Authors:
  • Mingyu Lu;Chonghui Guo;Jiantao Sun;Yuchang Lu

  • Affiliations:
  • Institute of Computer Science and Technology, Dalian Maritime University, Dalian, China;Department of Computer Science and Technology, Tsinghua University, Beijing, China;Department of Computer Science and Technology, Tsinghua University, Beijing, China;Department of Computer Science and Technology, Tsinghua University, Beijing, China

  • Venue:
  • FSKD'05 Proceedings of the Second international conference on Fuzzy Systems and Knowledge Discovery - Volume Part II
  • Year:
  • 2005

Abstract

Web page classification is an important research direction in web mining. This paper presents an SVM method for web page classification. It consists of four steps: (1) using an analysis module to extract the core text and structural tags from a web page; (2) adopting an improved VSM model to generate the initial feature vectors from the core text of the page; (3) adjusting the weights of the selected features according to the structural tags in the page to generate the base SVM classifier; (4) combining the base classifiers produced iteratively by a Boosting mechanism to obtain the target SVM classifier. Web page classification experiments show that the presented approach is efficient.
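
The feature-building part of steps (1) to (3) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the improved VSM model or the actual tag weights, so the sketch assumes plain term frequency as the base weighting and uses hypothetical names and values (`TAG_WEIGHTS`, `SKIPPED_TAGS`, `PageAnalyzer`, `weighted_vector`). Training the base SVM classifier and the Boosting loop of step (4) are left out.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical tag weights; the abstract does not give the real values.
TAG_WEIGHTS = {"title": 3.0, "h1": 2.0, "b": 1.5}
# Assumption: text inside these tags is not part of the "core text".
SKIPPED_TAGS = {"script", "style"}


class PageAnalyzer(HTMLParser):
    """Collects (term, weight) pairs from the visible text of a page."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open tags
        self.terms = []      # (term, weight) pairs in document order

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed tags.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        if SKIPPED_TAGS.intersection(self.open_tags):
            return
        # A term takes the largest weight among its enclosing tags.
        weight = max(
            (TAG_WEIGHTS.get(t, 1.0) for t in self.open_tags), default=1.0
        )
        for term in re.findall(r"[a-z]+", data.lower()):
            self.terms.append((term, weight))


def weighted_vector(html):
    """Return a sparse vector: term -> tag-weighted term frequency."""
    analyzer = PageAnalyzer()
    analyzer.feed(html)
    analyzer.close()
    vector = Counter()
    for term, weight in analyzer.terms:
        vector[term] += weight
    return dict(vector)


SAMPLE_PAGE = (
    "<html><head><title>Stock news</title>"
    "<style>p {color: red}</style></head>"
    "<body><h1>Stock market</h1>"
    "<p>The market fell. <b>Stock</b> prices dropped.</p></body></html>"
)
vector = weighted_vector(SAMPLE_PAGE)
# "stock" occurs in <title>, <h1> and <b>: 3.0 + 2.0 + 1.5 = 6.5
```

Under these assumptions, a term in the title or a heading contributes more to the vector than the same term in body text. The resulting vectors could then be passed to any SVM trainer that accepts per-sample weights, which is what a Boosting loop over base classifiers would require.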