Rough Set-Aided Feature Selection for Automatic Web-Page Classification

  • Authors:
  • Toshiko Wakaki;Hiroyuki Itakura;Masaki Tamura

  • Affiliations:
  • Shibaura Institute of Technology, Japan;Shibaura Institute of Technology, Japan;Japan Advanced Institute of Science and Technology, Japan

  • Venue:
  • WI '04 Proceedings of the 2004 IEEE/WIC/ACM International Conference on Web Intelligence
  • Year:
  • 2004

Abstract

The number of Web pages on the World Wide Web is growing explosively, and portal sites with directory-style search engines, such as Yahoo!, now need to classify Web pages into many categories automatically. This paper investigates how rough set theory can help select relevant features for Web-page classification. Our experimental results show that combining the rough set-aided feature selection method with a Support Vector Machine using a linear kernel is useful in practice for classifying Web pages into many categories: the classifier reaches acceptable accuracy, and a high degree of dimensionality reduction is achieved without depending on arbitrary thresholds for feature selection.
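
The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one common rough set approach, a greedy (QuickReduct-style) search for a reduct, and should not be read as the authors' actual method. It assumes binary term-presence features; the toy pages, the category labels, and the function names `dependency` and `greedy_reduct` are all hypothetical. The point it illustrates is the stopping rule: terms are added until the selected subset discerns the categories as well as the full term set does, so no score threshold has to be chosen.

```python
from collections import defaultdict

def dependency(rows, labels, attrs):
    """Rough set dependency degree: the fraction of objects whose values
    on `attrs` place them in a group containing a single class label."""
    classes_in_group = defaultdict(set)
    size_of_group = defaultdict(int)
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        classes_in_group[key].add(label)
        size_of_group[key] += 1
    consistent = sum(n for key, n in size_of_group.items()
                     if len(classes_in_group[key]) == 1)
    return consistent / len(rows)

def greedy_reduct(rows, labels):
    """Add one attribute at a time, always the one that raises the
    dependency degree most, until it matches that of the full set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    selected = []
    while dependency(rows, labels, selected) < target:
        candidates = [a for a in all_attrs if a not in selected]
        best = max(candidates,
                   key=lambda a: dependency(rows, labels, selected + [a]))
        selected.append(best)
    return sorted(selected)

# Hypothetical toy data: each row is one page, each column one term
# (1 = term present, 0 = term absent).
rows = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
labels = ["sports", "sports", "news", "news", "tech", "tech"]

selected_terms = greedy_reduct(rows, labels)
print(selected_terms)
```

In this toy case the third term (column index 2) appears in pages of every category and carries no discerning power, so it is never selected. The reduced term columns could then be passed to a linear-kernel Support Vector Machine, which this sketch leaves out.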