In practical text classification tasks, the ability to interpret a classification result is as important as the ability to classify accurately. The associative classifier has several favorable characteristics: rapid training, good classification accuracy, and excellent interpretability. However, it faces some obstacles when applied to text classification. First, training an associative classifier produces a huge number of classification rules, which makes prediction for a new document inefficient. We resolve this by pruning the rules according to their contribution to correct classifications. Second, because the target text collection generally has high dimensionality, training can take a very long time. We propose using the mutual information between the word and class variables as a feature selection measure to reduce the dimensionality of the feature space. Experimental results on the 20-newsgroups dataset show the benefits of associative classification in both training and prediction.
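As an illustration of the feature selection step, the following is a minimal sketch of ranking words by the mutual information between a word variable and the class variable. It rests on assumptions not stated in the abstract: each word is treated as a binary presence/absence variable per document, probabilities are estimated from raw document counts without smoothing, and the function names `mutual_information` and `select_features` are hypothetical. It is not necessarily the exact formulation used by the authors.

```python
import math
from collections import Counter


def mutual_information(docs, labels):
    """Return {word: I(W; C)} in bits.

    Assumption: W is the binary presence/absence of a word in a document,
    and probabilities are plain relative frequencies (no smoothing).
    """
    n = len(docs)
    class_count = Counter(labels)   # class -> number of documents
    word_count = Counter()          # word -> number of documents containing it
    word_class = Counter()          # (word, class) -> documents of that class containing the word
    for tokens, c in zip(docs, labels):
        for w in set(tokens):
            word_count[w] += 1
            word_class[(w, c)] += 1

    scores = {}
    for w, n_w in word_count.items():
        mi = 0.0
        for c, n_c in class_count.items():
            n_wc = word_class[(w, c)]
            # Two cells per class: word present, word absent.
            for joint, marginal_w in ((n_wc, n_w), (n_c - n_wc, n - n_w)):
                if joint > 0:  # a zero cell contributes nothing to the sum
                    mi += (joint / n) * math.log2(joint * n / (marginal_w * n_c))
        scores[w] = mi
    return scores


def select_features(docs, labels, k):
    """Keep the k highest-scoring words; ties are broken alphabetically."""
    scores = mutual_information(docs, labels)
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


# Toy corpus (invented for illustration only).
docs = [["ball", "game"], ["ball", "team"], ["cpu", "game"], ["cpu", "disk"]]
labels = ["sport", "sport", "tech", "tech"]
selected = select_features(docs, labels, 2)
```

In this toy corpus, "ball" and "cpu" each occur in exactly one class and so receive the highest scores, while "game" occurs equally in both classes and scores zero. Restricting rule mining to the selected words is one way the dimensionality reduction described above could shorten training.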