Many studies have shown that association-based classification can achieve higher accuracy than traditional rule-based schemes. However, when it is applied to the text classification domain, the high dimensionality and diversity of text data sets, together with class skew, make classification tasks more complicated. In this study, we present a new method for associative text categorization tasks. First, we integrate feature selection into the rule pruning process rather than running it as a separate preprocessing procedure. Second, we combine several techniques to extract rules efficiently. Third, we use a new score model to handle the problem caused by imbalanced class distribution. A series of experiments on various real text corpora indicates that, with these approaches, associative text classification (ATC) can achieve classification performance competitive with that of well-known support vector machines (SVMs).
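
To make the idea of rule-based scoring under class skew concrete, the following is a minimal sketch of a generic associative classifier. It is not the score model of this paper, which the abstract does not specify. The rules, their confidences, the class priors, and the choice to divide each class score by its prior are all illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical class-association rules: (antecedent terms, class label, confidence).
# These values are made up for illustration; they are not mined from any corpus.
RULES = [
    (frozenset({"goal", "match"}), "sports", 0.9),
    (frozenset({"match"}), "sports", 0.6),
    (frozenset({"stock"}), "finance", 0.8),
    (frozenset({"market", "stock"}), "finance", 0.95),
]

# Hypothetical, heavily skewed class priors (fraction of training documents per class).
PRIORS = {"sports": 0.9, "finance": 0.1}


def classify(doc_terms, rules, class_prior):
    """Return the class with the highest prior-normalized rule score, or None.

    Assumption: a class score is the summed confidence of all rules whose
    antecedent is contained in the document, divided by the class prior so
    that frequent classes do not dominate purely because of their size.
    """
    terms = set(doc_terms)
    scores = defaultdict(float)
    for antecedent, label, confidence in rules:
        if antecedent <= terms:  # the rule fires if all its terms occur in the document
            scores[label] += confidence
    if not scores:
        return None  # no rule matched the document
    normalized = {label: s / class_prior[label] for label, s in scores.items()}
    return max(normalized, key=normalized.get)
```

With these assumed values, a document containing both "match" and "stock" is assigned to the minority class "finance", because its single matching rule is weighted up by the small prior. A real system would mine the rules from training data and would tune the normalization.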