Using hypothesis margin to boost centroid text classifier

Authors:
Songbo Tan;Xueqi Cheng
Affiliations:
ICT, Beijing, China;ICT, Beijing, China
Venue:
Proceedings of the 2007 ACM symposium on Applied computing
Year:
2007

Abstract

The Centroid Classifier is a simple yet efficient method for text categorization. However, it often suffers from inductive bias, or model misfit, incurred by its underlying assumption. To address this issue, both training-set errors and training-set margins are used as training criteria. Based on these two criteria, an overall (global) objective function over all training examples is constructed and optimized to produce a refined Centroid classification model. An empirical assessment on four benchmark collections shows that the proposed method performs comparably to a state-of-the-art SVM classifier in classification performance and beats it in running time.
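
To make the two training criteria concrete, the following is a minimal sketch of a plain centroid classifier together with a per-example margin. It is not the authors' objective function or optimization procedure, which the abstract does not specify. Several choices here are assumptions made only for illustration: documents are sparse term-weight dictionaries, similarity is cosine, the margin is taken as the similarity to the true-class centroid minus the highest similarity to any other centroid, and all function names and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def normalize(vec):
    """Scale a sparse vector (dict of term -> weight) to unit length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else dict(vec)


def cosine(a, b):
    """Dot product of two unit-length sparse vectors (their cosine)."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def fit_centroids(docs, labels):
    """Sum the normalized documents of each class, then normalize the sum."""
    sums = defaultdict(lambda: defaultdict(float))
    for doc, label in zip(docs, labels):
        for t, w in normalize(doc).items():
            sums[label][t] += w
    return {label: normalize(dict(vec)) for label, vec in sums.items()}


def predict(centroids, doc):
    """Assign the class whose centroid is most similar to the document."""
    d = normalize(doc)
    return max(centroids, key=lambda c: cosine(d, centroids[c]))


def margin(centroids, doc, label):
    """Assumed margin: true-class similarity minus best rival similarity.

    A negative value means the example is misclassified (a training-set
    error); a small positive value means it is correct but close to the
    boundary. Requires at least two classes.
    """
    d = normalize(doc)
    own = cosine(d, centroids[label])
    rival = max(cosine(d, c) for k, c in centroids.items() if k != label)
    return own - rival


# Hypothetical toy data: term counts for four short documents.
docs = [
    {"goal": 2, "match": 1},
    {"match": 2, "team": 1},
    {"stock": 2, "market": 1},
    {"market": 2, "bank": 1},
]
labels = ["sport", "sport", "finance", "finance"]

centroids = fit_centroids(docs, labels)
predictions = [predict(centroids, d) for d in docs]
margins = [margin(centroids, d, y) for d, y in zip(docs, labels)]
```

Under these assumptions, a refinement step would adjust the centroids so that misclassified examples (negative margin) and low-margin examples move toward the correct side; how the paper combines the two criteria into a single objective is left open here.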