Feature selection, rule extraction, and score model: making ATC competitive with SVM

  • Authors:
  • Tieyun Qian;Yuanzhen Wang;Langgang Xiang;WeiHua Gong

  • Affiliations:
  • Department of Computer Science, Huazhong University of Science and Technology, Wuhan, P.R. China (all four authors)

  • Venue:
  • RSKT'06 Proceedings of the First International Conference on Rough Sets and Knowledge Technology
  • Year:
  • 2006

Abstract

Many studies have shown that association-based classification can achieve higher accuracy than traditional rule-based schemes. However, in the text classification domain, high dimensionality, the diversity of text data sets, and class skew make classification tasks more complicated. In this study, we present a new method for associative text categorization. First, we integrate feature selection into the rule pruning process rather than running it as a separate preprocessing step. Second, we combine several techniques to extract rules efficiently. Third, we use a new score model to handle the problem caused by imbalanced class distribution. A series of experiments on various real text corpora indicates that, with these approaches, associative text classification (ATC) can achieve classification performance competitive with that of the well-known support vector machines (SVM).
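
For readers unfamiliar with associative classification, the following minimal Python sketch illustrates the general idea: mine "term set → class" rules that meet support and confidence thresholds, then score each class by the rules a new document matches. It is not the method of this paper, whose pruning criteria and score model are not given in the abstract. The function names `mine_rules` and `classify`, the thresholds, the toy corpus, and the division by class size (a simple stand-in for a skew-aware score) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def mine_rules(docs, labels, min_support=2, min_confidence=0.6, max_len=2):
    """Mine class association rules (term set -> class) from tokenized docs."""
    itemset_count = Counter()  # number of docs containing the term set
    joint_count = Counter()    # number of docs of a given class containing it
    for terms, label in zip(docs, labels):
        unique = sorted(set(terms))
        for k in range(1, max_len + 1):
            for itemset in combinations(unique, k):
                itemset_count[itemset] += 1
                joint_count[(itemset, label)] += 1
    rules = []
    for (itemset, label), n in joint_count.items():
        confidence = n / itemset_count[itemset]
        # Keep only rules that are both frequent and reliable enough.
        if n >= min_support and confidence >= min_confidence:
            rules.append((frozenset(itemset), label, confidence))
    return rules


def classify(terms, rules, class_sizes):
    """Sum the confidence of matching rules per class, scaled by 1/class size.

    The 1/class size factor is a hypothetical way to offset class skew,
    not the score model proposed in the paper.
    """
    present = set(terms)
    scores = defaultdict(float)
    for itemset, label, confidence in rules:
        if itemset <= present:
            scores[label] += confidence / class_sizes[label]
    return max(scores, key=scores.get) if scores else None


# Toy corpus (hypothetical): three finance documents, two sport documents.
docs = [
    ["stock", "market", "rise"],
    ["stock", "market", "fall"],
    ["stock", "trade"],
    ["match", "goal", "team"],
    ["team", "goal", "win"],
]
labels = ["finance", "finance", "finance", "sport", "sport"]

rules = mine_rules(docs, labels)
class_sizes = Counter(labels)
prediction = classify(["stock", "market", "news"], rules, class_sizes)
```

In this sketch the support threshold doubles as a crude feature filter, since terms that never appear in a frequent rule play no part in classification. That loosely mirrors the idea of folding feature selection into rule pruning, without claiming to reproduce the authors' procedure.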