Association classification based on sample weighting

Authors:
Jin Zhang;Xiaoyun Chen;Yi Chen;Yunfa Hu
Affiliations:
Department of Computer & Information Technology, Fudan University, Shanghai, China;Department of Computer & Information Technology, Fudan University, Shanghai, China;Department of Computer & Information Technology, Fudan University, Shanghai, China;Department of Computer & Information Technology, Fudan University, Shanghai, China
Venue:
FSKD'05 Proceedings of the Second international conference on Fuzzy Systems and Knowledge Discovery - Volume Part II
Year:
2005

Abstract

In text categorization, the distribution and quality of the sample set strongly influence the categorization result. The association-rule-based categorization method ARC-BC is effective under common circumstances, but its accuracy falls markedly when the feature words of the training samples are unevenly distributed. This paper proposes a Chinese text classification approach based on sample-weighted association rules (SW-ARC). The approach substantially improves classification efficiency by adjusting sample weights self-adaptively. Experimental results show that SW-ARC can counter the quality loss caused by an uneven distribution of feature words: in the open test, macro-average recall increases from 50% with ARC-BC to 70% with SW-ARC, and macro-average precision increases from 28% with ARC-BC to 70% with SW-ARC.
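
The abstract reports its results as macro-average recall and precision. As a reminder of what those figures measure, the following is a minimal sketch of the standard macro-averaging computation, assuming single-label classification (each document has exactly one true category and one predicted category). The function name and the toy labels are illustrative assumptions and do not come from the paper; the paper's own evaluation setup may differ.

```python
from collections import defaultdict


def macro_precision_recall(y_true, y_pred):
    """Return (macro_precision, macro_recall) for single-label predictions.

    Hypothetical helper for illustration; not taken from the paper.
    Macro-averaging computes precision and recall per category and then
    takes the unweighted mean, so small categories count as much as large ones.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = defaultdict(int)  # true positives per category
    fp = defaultdict(int)  # false positives per category
    fn = defaultdict(int)  # false negatives per category
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    categories = sorted(set(y_true) | set(y_pred))
    if not categories:
        raise ValueError("no samples given")

    precisions, recalls = [], []
    for c in categories:
        predicted = tp[c] + fp[c]  # documents assigned to c
        actual = tp[c] + fn[c]     # documents that truly belong to c
        # A category with no predictions (or no true members) scores 0.0 here;
        # this convention is an assumption of the sketch.
        precisions.append(tp[c] / predicted if predicted else 0.0)
        recalls.append(tp[c] / actual if actual else 0.0)

    n = len(categories)
    return sum(precisions) / n, sum(recalls) / n


# Toy example with two categories (made-up labels, not data from the paper).
y_true = ["sports", "sports", "finance", "finance"]
y_pred = ["sports", "finance", "finance", "finance"]
precision, recall = macro_precision_recall(y_true, y_pred)
```

Because every category contributes equally to the mean, a classifier that performs poorly on a few unevenly represented categories is penalized heavily under this measure, which is consistent with the abstract's focus on uneven feature-word distributions.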