Machine learning methods for Chinese web page categorization

  • Authors: Ji He; Ah-Hwee Tan; Chew-Lim Tan
  • Affiliations: National University of Singapore, Singapore; Kent Ridge Digital Labs, Singapore; National University of Singapore, Singapore
  • Venue: CLPW '00 Proceedings of the second workshop on Chinese language processing: held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics - Volume 12
  • Year: 2000

Abstract

This paper reports our evaluation of k-Nearest Neighbor (kNN), Support Vector Machines (SVM), and Adaptive Resonance Associative Map (ARAM) on Chinese web page classification. Benchmark experiments on a Chinese web corpus showed that the predictive performance of the three methods was roughly comparable, although ARAM and kNN slightly outperformed SVM on small categories. In addition, inserting rules into ARAM helped improve performance, especially for small, well-defined categories.