Text Classification for DAG-Structured Categories

  • Authors:
  • Cao D. Nguyen; Tran A. Dung; Tru H. Cao

  • Affiliations:
  • Faculty of Information Technology, Ho Chi Minh City University of Technology, Vietnam;Faculty of Information Technology, Ho Chi Minh City University of Technology, Vietnam;Faculty of Information Technology, Ho Chi Minh City University of Technology, Vietnam

  • Venue:
  • PAKDD'05 Proceedings of the 9th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
  • Year:
  • 2005

Abstract

Hierarchical text classification, which takes the relationships among categories into account, has recently become an active research problem. Most research has focused on tree-structured categories, but in practice categories structured as a directed acyclic graph (DAG), where a child category may have more than one parent category, appear more often. In this paper, we introduce three approaches (flat, tree-based, and DAG-based) for solving the multi-label text classification problem in which categories are organized as a DAG and documents are classified into both leaf and internal categories. We also present experimental results for these methods, using SVMs as classifiers, on the Reuters-21578 collection and on our own data set of research papers in Artificial Intelligence.
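
To make the DAG setting concrete, the following is a minimal Python sketch of a category structure in which one child has two parents. It is not the authors' method, and the abstract does not describe their algorithms in detail. The category names, the `PARENTS` mapping, and the helpers `ancestors` and `expand_labels` are all hypothetical. The sketch also assumes, which the abstract does not state, that a document assigned to a category implicitly belongs to every ancestor of that category.

```python
# Hypothetical category DAG: each child maps to its list of parents.
# "text_classification" has two parents, which a tree cannot express.
PARENTS = {
    "machine_learning": ["artificial_intelligence"],
    "natural_language_processing": ["artificial_intelligence"],
    "text_classification": ["machine_learning", "natural_language_processing"],
}


def ancestors(category, parents):
    """Return every category reachable by following parent links upward."""
    seen = set()
    stack = list(parents.get(category, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen


def expand_labels(labels, parents):
    """Close a label set under the ancestor relation.

    Assumption (not from the paper): membership in a category implies
    membership in all of its ancestors, leaf or internal.
    """
    expanded = set(labels)
    for label in labels:
        expanded |= ancestors(label, parents)
    return expanded
```

Under these assumptions, calling `expand_labels({"text_classification"}, PARENTS)` collects both parents and their shared ancestor, so a multi-label target can cover internal as well as leaf categories.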