2-PS based associative text classification

  • Authors:
  • Tieyun Qian; Yuanzhen Wang; Hao Long; Jianlin Feng

  • Affiliations:
  • Department of Computer Science, Huazhong University of Science and Technology, Wuhan, Hubei, China (all authors)

  • Venue:
  • DaWaK'05 Proceedings of the 7th international conference on Data Warehousing and Knowledge Discovery
  • Year:
  • 2005

Abstract

Recent studies show that associative classification can achieve higher accuracy than traditional approaches. Its main drawback is that it generates a huge number of rules, which makes it difficult to select a subset of rules for accurate classification. In this study, we propose a novel association-based approach that is especially suitable for text classification. The approach first builds a classifier using a 2-PS (Two-Phase) method. The first phase prunes rules locally: rules mined within each category are pruned by a sentence-level constraint, which makes the rules more semantically correlated and less redundant. In the second phase, all the remaining rules are compared and selected from a global view: training examples from different categories are merged to evaluate these rules. Moreover, when labeling a new document, multiple sentence-level appearances of a rule are taken into account. Experimental results on well-known text corpora show that our method can achieve higher accuracy than many well-known methods. In addition, the performance study shows that our method is quite efficient in comparison with other classification methods.
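
The abstract gives only an outline of the two phases, so the following Python sketch is an illustration under stated assumptions rather than the authors' algorithm. It assumes that a rule is a small word set paired with a category, that the sentence-level constraint means all words of a rule must co-occur within one sentence, that the global phase filters rules by confidence on the merged training set, and that a new document is scored by weighting each rule's confidence by the number of sentences it matches. The function names, thresholds, and scoring formula are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# A document is a list of sentences; each sentence is a set of words.

def sentence_hits(itemset, doc):
    """Count the sentences of doc that contain every word of itemset."""
    return sum(1 for sent in doc if itemset <= sent)

def mine_local_rules(docs, min_support, max_len=2):
    """Phase 1 (assumed form): within one category, keep word sets that
    co-occur inside a single sentence in at least min_support documents."""
    doc_counts = defaultdict(int)
    for doc in docs:
        seen = set()
        for sent in doc:
            for k in range(1, max_len + 1):
                for combo in combinations(sorted(sent), k):
                    seen.add(frozenset(combo))
        for itemset in seen:  # count each word set once per document
            doc_counts[itemset] += 1
    return {s for s, c in doc_counts.items() if c >= min_support}

def select_global_rules(training, min_support, min_conf):
    """Phase 2 (assumed form): evaluate every locally mined rule on the
    merged training set and keep those with confidence >= min_conf."""
    by_cat = defaultdict(list)
    for doc, label in training:
        by_cat[label].append(doc)
    rules = {}
    for label, docs in by_cat.items():
        for itemset in mine_local_rules(docs, min_support):
            matched = [lab for doc, lab in training
                       if sentence_hits(itemset, doc) > 0]
            conf = matched.count(label) / len(matched)
            if conf >= min_conf:
                rules[(itemset, label)] = conf
    return rules

def classify(doc, rules):
    """Score categories by rule confidence times the number of matching
    sentences (a hypothetical weighting); return None if nothing matches."""
    scores = defaultdict(float)
    for (itemset, label), conf in rules.items():
        scores[label] += conf * sentence_hits(itemset, doc)
    if not scores or max(scores.values()) == 0:
        return None
    return max(scores, key=scores.get)
```

With a toy training set in which "team" and "goal" appear in the same sentence in only one sport document, the pair is dropped in the first phase at `min_support=2` while the single words survive, which is the kind of pruning the sentence-level constraint is meant to achieve in this sketch.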