Web document classification using changing training data set

  • Authors: Gilcheol Park; Seoksoo Kim
  • Affiliations: Dept. of Multimedia Engineering, Hannam University, Daejeon, South Korea; Dept. of Multimedia Engineering, Hannam University, Daejeon, South Korea
  • Venue: ICCSA'06 Proceedings of the 2006 international conference on Computational Science and Its Applications - Volume Part V
  • Year: 2006

Abstract

Machine learning methods are commonly used to acquire the knowledge needed for automated document classification. They work well when a large pre-sampled training set is available and the domain does not change rapidly. In the real world, however, a complete training data set is hard to obtain, and the classification knowledge keeps changing from one situation to another. This is known as the maintenance problem or the knowledge acquisition bottleneck. Multiple Classification Ripple-Down Rules (MCRDR), an incremental knowledge acquisition method, was introduced to resolve this problem and has been applied in several commercial expert systems and in a document classification system. Evaluation results for several domains show that our MCRDR-based document classification method can be successfully applied to real-world document classification tasks.
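
To illustrate the incremental idea behind MCRDR, the following is a minimal sketch and not the authors' implementation. It assumes documents are represented as sets of keywords, and the names `Rule`, `add_exception`, and `classify` are hypothetical. Rules form a tree: a rule's conclusion is reported only if none of its child (exception) rules fire, and knowledge is maintained by attaching new exception rules rather than retraining. Validation of new rules against stored cornerstone cases, which a full MCRDR system would perform, is omitted here.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class Rule:
    """One node of a hypothetical MCRDR-style rule tree."""
    condition: Callable[[Set[str]], bool]   # test on the document's keywords
    conclusion: Optional[str]               # class label; None acts as a stopping rule
    children: List["Rule"] = field(default_factory=list)

    def add_exception(self, condition, conclusion):
        """Incrementally attach a refinement (or stopping) rule under this rule."""
        child = Rule(condition, conclusion)
        self.children.append(child)
        return child


def classify(rule: Rule, words: Set[str]) -> Set[str]:
    """Return every conclusion reached along all satisfied paths below `rule`."""
    fired = [c for c in rule.children if c.condition(words)]
    if not fired:
        # No exception applies, so this rule's own conclusion stands.
        return {rule.conclusion} if rule.conclusion is not None else set()
    labels: Set[str] = set()
    for child in fired:
        labels |= classify(child, words)
    return labels


# Initial knowledge base: the root always fires and has no conclusion.
root = Rule(lambda w: True, None)
sports = root.add_exception(lambda w: "football" in w, "Sports")
root.add_exception(lambda w: "election" in w, "Politics")

# A document can receive multiple classifications.
first = classify(root, {"football", "election"})

# Later, an expert corrects a misclassified case by adding a refinement rule
# under "Sports"; the existing rules are left untouched.
sports.add_exception(lambda w: "fantasy" in w, "Games")
second = classify(root, {"fantasy", "football"})
```

In this sketch the earlier document is still classified as before after the refinement is added, which reflects the point made in the abstract: the knowledge base is extended case by case as the domain changes.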