A hybrid text classification system using sentential frequent itemsets

  • Authors:
  • Shizhu Liu;Heping Hu

  • Affiliations:
  • College of Computer Science, Huazhong University of Science and Technology, Wuhan, China;College of Computer Science, Huazhong University of Science and Technology, Wuhan, China

  • Venue:
  • CIS'05 Proceedings of the 2005 international conference on Computational Intelligence and Security - Volume Part I
  • Year:
  • 2005

Abstract

Most text classification techniques rely on single-term analysis of the document data set, although many concepts, especially specific ones, are conveyed by sets of terms. To build a more accurate text classifier, more informative features, such as words that frequently co-occur in the same sentence together with their weights, are particularly important. In this paper, we propose a novel approach to text classification using sentential frequent itemsets, a concept derived from association rule mining. The approach treats a sentence, rather than a document, as a transaction and uses a method based on variable precision rough sets to evaluate each sentential frequent itemset’s contribution to the classification. Experiments on the Reuters corpus validate the practicability of the proposed system.
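
To make the idea of treating a sentence as a transaction concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a naive regular-expression sentence splitter and tokenizer, brute-force enumeration of term sets up to a small size instead of a full Apriori-style miner, and an absolute support threshold. The function names `sentence_transactions` and `frequent_itemsets` are hypothetical, and the weighting step based on variable precision rough sets is not shown because the abstract gives no details of it.

```python
import re
from collections import Counter
from itertools import combinations


def sentence_transactions(document):
    """Split a document into sentences; each sentence becomes a set of lowercase terms.

    Assumption: sentences end at '.', '!' or '?'; terms are runs of ASCII letters.
    """
    transactions = []
    for sentence in re.split(r"[.!?]+", document):
        terms = set(re.findall(r"[a-z]+", sentence.lower()))
        if terms:
            transactions.append(terms)
    return transactions


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return term sets (as sorted tuples) that occur in at least min_support sentences.

    Assumption: brute-force counting of all term sets up to max_size, which is
    only practical for short sentences and small max_size.
    """
    counts = Counter()
    for terms in transactions:
        for k in range(1, max_size + 1):
            for itemset in combinations(sorted(terms), k):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    doc = "Oil prices rose. Crude oil prices fell. The bank raised rates."
    for itemset, support in sorted(frequent_itemsets(sentence_transactions(doc), 2).items()):
        print(itemset, support)
```

In this sketch, the pair of terms "oil" and "prices" is counted once per sentence in which both appear, so co-occurrence is measured at the sentence level rather than across a whole document.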