Practical application of associative classifier for document classification

Authors:
Yongwook Yoon;Gary Geunbae Lee
Affiliations:
Department of Computer Science & Engineering, Pohang University of Science & Technology, Pohang, South Korea;Department of Computer Science & Engineering, Pohang University of Science & Technology, Pohang, South Korea
Venue:
AIRS'05 Proceedings of the Second Asia conference on Asia Information Retrieval Technology
Year:
2005

Abstract

In practical text classification tasks, the ability to interpret a classification result is as important as the ability to classify accurately. The associative classifier has several favorable characteristics: rapid training, good classification accuracy, and excellent interpretability. However, it faces some obstacles when applied to text classification. First, training an associative classifier produces a huge number of classification rules, which makes prediction for a new document ineffective. We resolve this by pruning the rules according to their contribution to correct classifications. Second, since the target text collection generally has high dimensionality, training may take a very long time. We propose mutual information between the word and class variables as a feature selection measure to reduce the dimensionality of the feature space. Experimental classification results on the 20-newsgroups dataset show the benefits of associative classification in both training and prediction.
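
As an illustration of the feature selection step mentioned in the abstract, the following is a minimal sketch of scoring words by their mutual information with the class variable. It is not the authors' implementation. It assumes each document is represented as a set of words and that the word variable is binary (present or absent); the function names `mutual_information` and `select_features` and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(docs, labels, word):
    """Return I(W; C) in bits, where W is True if `word` occurs in a document.

    Assumption: each document is a set of words (binary presence model).
    """
    n = len(docs)
    # Joint counts over (word present?, class label).
    joint = Counter((word in doc, label) for doc, label in zip(docs, labels))
    # Marginal counts for the word variable and the class variable.
    count_w, count_c = Counter(), Counter()
    for (w, c), count in joint.items():
        count_w[w] += count
        count_c[c] += count
    mi = 0.0
    for (w, c), count in joint.items():
        p_wc = count / n
        mi += p_wc * math.log2(p_wc / ((count_w[w] / n) * (count_c[c] / n)))
    return mi


def select_features(docs, labels, k):
    """Keep the k words with the highest mutual information (ties broken alphabetically)."""
    vocab = set().union(*docs)
    scores = {w: mutual_information(docs, labels, w) for w in vocab}
    return sorted(vocab, key=lambda w: (-scores[w], w))[:k]


# Hypothetical toy collection: two classes, four documents.
docs = [{"goal", "team"}, {"team", "match"}, {"cpu", "disk"}, {"disk", "team"}]
labels = ["sport", "sport", "comp", "comp"]
top_words = select_features(docs, labels, 1)
```

In this toy collection the word "disk" occurs in exactly the documents of one class, so it carries the maximum possible one bit of information about the class and is ranked first. A word that appears in every document, or in none, scores zero and would be discarded early.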