A Multi-Label Chinese Text Categorization System Based on Boosting Algorithm

Authors:
Junli Chen; Xuezhong Zhou; Zhaohui Wu
Affiliations:
Zhejiang University; Zhejiang University; Zhejiang University
Venue:
CIT '04: Proceedings of the Fourth International Conference on Computer and Information Technology
Year:
2004

Abstract

This paper presents a multi-label Chinese text categorization system based on Chinese character features and a boosting algorithm. The system has been evaluated successfully on the TCM-MED dataset, provided by the China Academy of Traditional Chinese Medicine (TCM), and on the Reuters-21578 benchmark. We suggest that the TCM-MED dataset can serve as a standard corpus for Chinese text categorization tasks. We also ran experiments comparing the boosting algorithm with two other traditional algorithms on the same datasets. The results indicate that, for a multi-label Chinese text categorization system, the boosting algorithm performs well and outperforms the other two algorithms.
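
The abstract does not say which boosting variant or feature weighting the system uses, so the following is only a minimal sketch of the general idea, under stated assumptions: each document is reduced to the set of Chinese characters it contains, and one binary AdaBoost classifier with single-character decision stumps is trained per label (a one-vs-rest setup). The function names, the toy documents, and the label names are all hypothetical and are not taken from the paper or from TCM-MED.

```python
import math

def char_features(text):
    """Represent a document as the set of non-space characters it contains."""
    return {ch for ch in text if not ch.isspace()}

def stump(ch, polarity, doc):
    """Weak learner: vote `polarity` if `ch` occurs in the document, else the opposite."""
    return polarity if ch in doc else -polarity

def train_adaboost(docs, labels, rounds=10):
    """Binary AdaBoost over character-presence stumps (labels are +1 / -1)."""
    n = len(docs)
    weights = [1.0 / n] * n
    vocab = sorted(set().union(*docs))
    model = []  # list of (character, polarity, alpha)
    for _ in range(rounds):
        # Pick the stump with the lowest weighted error.
        best = None
        for ch in vocab:
            for polarity in (1, -1):
                err = sum(w for w, d, y in zip(weights, docs, labels)
                          if stump(ch, polarity, d) != y)
                if best is None or err < best[0]:
                    best = (err, ch, polarity)
        err, ch, polarity = best
        if err >= 0.5:          # no stump is better than chance
            break
        perfect = err < 1e-10
        err = max(err, 1e-10)   # keep alpha finite for a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((ch, polarity, alpha))
        if perfect:
            break
        # Increase the weight of misclassified documents, then renormalize.
        weights = [w * math.exp(-alpha * y * stump(ch, polarity, d))
                   for w, d, y in zip(weights, docs, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def score(model, doc):
    """Weighted vote of all stumps; a positive score means the label applies."""
    return sum(alpha * stump(ch, polarity, doc) for ch, polarity, alpha in model)

def train_multilabel(texts, label_sets, rounds=10):
    """One-vs-rest: train one binary booster per label."""
    docs = [char_features(t) for t in texts]
    all_labels = sorted(set().union(*label_sets))
    return {lab: train_adaboost(docs, [1 if lab in ls else -1 for ls in label_sets], rounds)
            for lab in all_labels}

def predict(models, text):
    """Return every label whose booster gives a positive score."""
    doc = char_features(text)
    return {lab for lab, model in models.items() if score(model, doc) > 0}

# Toy corpus invented for illustration only.
texts = ["针灸治疗头痛", "中药治疗头痛", "针灸与中药合用", "护理管理研究"]
label_sets = [{"acupuncture"}, {"herbal"}, {"acupuncture", "herbal"}, {"nursing"}]
models = train_multilabel(texts, label_sets)
print(sorted(predict(models, "中药与针灸")))  # ['acupuncture', 'herbal']
```

Because each label gets its own classifier, a document can receive zero, one, or several labels, which is what distinguishes the multi-label setting from ordinary single-label categorization. Working at the character level also sidesteps Chinese word segmentation, at the cost of losing multi-character terms as features.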