Automatic Web Page Classification in a Dynamic and Hierarchical Way

  • Authors: Xiaogang Peng; Ben Choi

  • Venue: ICDM '02: Proceedings of the 2002 IEEE International Conference on Data Mining
  • Year: 2002

Abstract

Automatic classification of web pages is an effective way to deal with the difficulty of retrieving information from the Internet. Although many automatic classification algorithms and systems have been proposed, most of them ignore the conflict between a fixed number of categories and the growing number of web pages entering the system. They also require searching through all existing categories to make any classification. We propose a dynamic and hierarchical classification system that is capable of adding new categories as required, organizing the web pages into a tree structure, and classifying web pages by searching through only one path of the tree structure. Our test results show that our proposed single-path search technique reduces the search complexity and increases the accuracy by 6% compared to related algorithms. Our dynamic-category expansion technique also achieves satisfactory results in adding new categories to our system as required.
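
The two ideas in the abstract, single-path search and dynamic category expansion, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how pages and categories are compared or when a new category is created. The sketch assumes cosine similarity over term counts, a hypothetical `threshold` parameter, and the hypothetical names `Category` and `classify`.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class Category:
    """Hypothetical tree node: a name, a term profile, and child categories."""

    def __init__(self, name, terms=()):
        self.name = name
        self.profile = Counter(terms)
        self.children = []

    def add_child(self, child):
        self.children.append(child)
        return child


def classify(root, page_terms, threshold=0.2):
    """Follow a single root-to-leaf path, comparing only against the
    children of the current node. If no child is similar enough
    (assumed rule), add a new category under the current node."""
    page = Counter(page_terms)
    node, path = root, [root.name]
    while node.children:
        best = max(node.children, key=lambda c: cosine(page, c.profile))
        if cosine(page, best.profile) < threshold:
            # Dynamic expansion: no existing child fits, so create one.
            new_name = "new-" + str(len(node.children))
            node.add_child(Category(new_name, page_terms))
            path.append(new_name)
            return path
        node = best
        path.append(node.name)
    return path


# Small illustrative hierarchy (made-up categories and terms).
root = Category("root")
sports = root.add_child(Category("sports", ["football", "match", "team"]))
root.add_child(Category("tech", ["software", "computer", "code"]))
sports.add_child(Category("football", ["football", "goal", "league"]))
sports.add_child(Category("tennis", ["tennis", "racket", "serve"]))

print(classify(root, ["football", "goal", "team"]))
print(classify(root, ["recipe", "oven", "bake"]))
```

In this sketch each level compares the page only with the children of one node, so the number of comparisons grows with the depth and branching of the chosen path rather than with the total number of categories.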