Hierarchical Document Classification Based on a Backtracking Algorithm

  • Authors:
  • Cuiling Zhu;Jun Ma;Dongmei Zhang;XiaoHui Han;Xiaofei Niu

  • Venue:
  • FSKD '08 Proceedings of the 2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery - Volume 02
  • Year:
  • 2008

Abstract

Hierarchical document classification assigns a document to one or more suitable categories from a hierarchical category space. This paper proposes a new hierarchical document classification method based on a backtracking algorithm. Using the relationships between categories in the category tree, a suitable threshold is found for every category to determine whether a document can be classified into that category. The backtracking algorithm in our hierarchical classification approach effectively solves the problem that a misclassification at a higher level directly leads to a misclassification at a lower level. Moreover, the feature set is selected by integrating information gain with hierarchy information, which matches the characteristics of a category tree. Experiments show that the method performs well when enough training documents are given.
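The abstract does not give the algorithm's details, so the following is only a minimal sketch of how threshold-based top-down classification with backtracking might look. The `Category` class, the `classify` function, the per-category scores, and the threshold values are all hypothetical illustrations, not the authors' implementation. For simplicity the sketch returns a single category path, whereas the paper allows one or more categories per document.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Category:
    """A node in a category tree (hypothetical structure for illustration)."""
    name: str
    threshold: float  # minimum score needed to accept a document here
    children: List["Category"] = field(default_factory=list)


def classify(node: Category, scores: Dict[str, float]) -> Optional[List[str]]:
    """Return a path from `node` down to an accepting leaf, or None.

    `scores` maps category names to classifier scores for one document
    (assumed to come from some per-category classifier).
    """
    # Reject the document if it does not reach this category's threshold.
    if scores.get(node.name, 0.0) < node.threshold:
        return None
    # A leaf that passes its threshold is a valid final category.
    if not node.children:
        return [node.name]
    # Try children from highest to lowest score.
    ranked = sorted(node.children,
                    key=lambda c: scores.get(c.name, 0.0),
                    reverse=True)
    for child in ranked:
        path = classify(child, scores)
        if path is not None:
            return [node.name] + path
    # No child subtree accepted the document: backtrack to the caller,
    # which will then try this node's next-best sibling.
    return None


# Hypothetical category tree and document scores.
root = Category("root", 0.0, [
    Category("science", 0.5, [Category("physics", 0.5),
                              Category("biology", 0.5)]),
    Category("sports", 0.5, [Category("football", 0.5),
                             Category("tennis", 0.5)]),
])
scores = {"science": 0.6, "sports": 0.55,
          "physics": 0.2, "biology": 0.1,
          "football": 0.7, "tennis": 0.1}

result = classify(root, scores)
print(result)
```

In this example a purely greedy top-down classifier would commit to `science` (the highest-scoring top-level category) and then fail, because neither of its leaves reaches the threshold. With backtracking, the search returns to the root and tries `sports`, reaching `football`.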