Recent studies show that associative classification can achieve higher accuracy than traditional approaches. Its main drawback is that it generates a huge number of rules, which makes it difficult to select a subset of rules for accurate classification. In this study, we propose a novel association-based approach that is especially suitable for text classification. The approach first builds a classifier through a 2-PS (Two-Phase) method. The first phase prunes rules locally: rules mined within each category are pruned by a sentence-level constraint, which makes the rules more semantically correlated and less redundant. In the second phase, all remaining rules are compared and selected from a global view: training examples from different categories are merged to evaluate these rules. Moreover, when labeling a new document, the multiple sentence-level appearances of a rule are taken into account. Experimental results on well-known text corpora show that our method can achieve higher accuracy than many well-known methods. In addition, a performance study shows that our method is quite efficient in comparison with other classification methods.
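The two phases and the sentence-aware labeling step can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not give the exact constraint, thresholds, or scoring function, so everything below is an assumption made for illustration. In particular, the function names (`sentence_itemsets`, `phase_one`, `phase_two`, `classify`), the use of per-category document support and global confidence as the pruning criteria, the threshold values, and the toy corpus are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Assumed representation: a document is a list of sentences,
# and a sentence is a set of terms.

def sentence_itemsets(doc, max_size=2):
    """Itemsets whose terms all co-occur within a single sentence of doc."""
    found = set()
    for sentence in doc:
        terms = sorted(sentence)
        for size in range(1, max_size + 1):
            found.update(combinations(terms, size))
    return found

def phase_one(docs, labels, min_support):
    """Local pruning (assumed form): within each category, keep only
    itemsets that occur inside one sentence in enough documents."""
    by_category = defaultdict(list)
    for doc, label in zip(docs, labels):
        by_category[label].append(doc)
    candidates = set()
    for category, cat_docs in by_category.items():
        counts = Counter()
        for doc in cat_docs:
            counts.update(sentence_itemsets(doc))
        for itemset, n in counts.items():
            if n / len(cat_docs) >= min_support:
                candidates.add((itemset, category))
    return candidates

def phase_two(candidates, docs, labels, min_confidence):
    """Global selection (assumed form): score every surviving rule on the
    merged training set and keep those with high enough confidence."""
    per_doc = [sentence_itemsets(doc) for doc in docs]
    rules = {}
    for itemset, category in candidates:
        covered = [lab for items, lab in zip(per_doc, labels) if itemset in items]
        confidence = covered.count(category) / len(covered)
        if confidence >= min_confidence:
            rules[(itemset, category)] = confidence
    return rules

def classify(doc, rules):
    """Label a document, counting every sentence in which a rule fires."""
    scores = Counter()
    for (itemset, category), confidence in rules.items():
        hits = sum(1 for sentence in doc if set(itemset) <= sentence)
        if hits:
            scores[category] += confidence * hits
    return max(scores, key=scores.get) if scores else None

# Toy corpus, invented for this example.
docs = [
    [{"stock", "market"}, {"market", "rises"}],
    [{"stock", "market", "falls"}],
    [{"team", "wins", "match"}],
    [{"team", "loses"}, {"match", "market"}],
]
labels = ["finance", "finance", "sport", "sport"]

candidates = phase_one(docs, labels, min_support=0.6)
rules = phase_two(candidates, docs, labels, min_confidence=0.8)
new_doc = [{"stock", "market", "opens"}, {"stock", "slides"}, {"team"}]
prediction = classify(new_doc, rules)
```

In this toy run, the pair `("match", "team")` is dropped in the first phase because the two terms share a sentence in only one of the two sport documents, and the single term `("market",)` is dropped in the second phase because it also appears in a sport document once the categories are merged. The new document is labeled `finance` because the term `stock` fires in two of its sentences.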