In text categorization, the distribution and quality of the sample set strongly influence the categorization result. The association-rule-based classifier ARC-BC is effective under common circumstances, but its accuracy falls noticeably when the feature words of the training samples are unevenly distributed. This paper proposes a Chinese text classification approach based on sample-weighted association rules (SW-ARC). The approach substantially improves classification performance by adaptively adjusting sample weights. Experimental results show that SW-ARC can remedy the quality drop caused by the uneven distribution of feature words: in the open test, macro-average recall increases from 50% with ARC-BC to 70% with SW-ARC, and macro-average precision increases from 28% with ARC-BC to 70% with SW-ARC.
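The reported figures are macro-averages, which weight every category equally regardless of how many test documents it contains. As a minimal sketch of how such scores are commonly computed for single-label classification (the function name `macro_precision_recall` and the category labels used with it are hypothetical, and this is not taken from the paper's evaluation code):

```python
from collections import defaultdict


def macro_precision_recall(y_true, y_pred):
    """Return (macro_precision, macro_recall) for single-label predictions.

    Assumption: each document has exactly one true and one predicted
    category, and the average runs over the categories present in y_true.
    """
    tp = defaultdict(int)  # correct predictions per category
    fp = defaultdict(int)  # documents wrongly assigned to a category
    fn = defaultdict(int)  # documents of a category that were missed

    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    categories = sorted(set(y_true))
    precisions, recalls = [], []
    for c in categories:
        predicted = tp[c] + fp[c]
        actual = tp[c] + fn[c]
        # A category that is never predicted gets precision 0 by convention.
        precisions.append(tp[c] / predicted if predicted else 0.0)
        recalls.append(tp[c] / actual if actual else 0.0)

    n = len(categories)
    return sum(precisions) / n, sum(recalls) / n
```

Because each category contributes equally to the average, a classifier that performs poorly on small or unevenly represented categories is penalized more heavily than it would be under micro-averaging.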