Among conventional methods for text categorization, the centroid classifier is simple and efficient. However, it often suffers from inductive bias (or model misfit) incurred by its assumption. DragPushing is a simple yet efficient method for addressing this inductive bias problem. However, DragPushing uses only one criterion, training-set error, as its objective function, and this alone cannot guarantee generalization capability. In this paper, we propose a generalized DragPushing strategy for the centroid classifier, which we call "Large Margin DragPushing" (LMDP). Experiments on three benchmark evaluation collections show that LMDP achieved about a one percent improvement over DragPushing and performed nearly as well as a state-of-the-art SVM without incurring significant computational cost.
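The abstract does not state the update rules, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes cosine similarity against class centroids, and a refinement step that adds a training document to its true class centroid ("drag") and subtracts it from the strongest rival centroid ("push"). With `margin=0.0` the step fires only on outright training errors; with `margin>0` it also fires when the true class wins by less than the margin. The function names, the `rate` and `margin` parameters, and the exact form of the update are illustrative assumptions, not the paper's formulation.

```python
import math

def unit(v):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def train_centroids(docs, labels):
    """Sum the unit-length document vectors of each class into one centroid per class."""
    dim = len(docs[0])
    centroids = {}
    for doc, label in zip(docs, labels):
        c = centroids.setdefault(label, [0.0] * dim)
        for i, x in enumerate(unit(doc)):
            c[i] += x
    return centroids

def scores(centroids, doc):
    """Cosine similarity between the document and every class centroid."""
    d = unit(doc)
    return {label: sum(a * b for a, b in zip(unit(c), d))
            for label, c in centroids.items()}

def predict(centroids, doc):
    """Return the class whose centroid is most similar to the document."""
    s = scores(centroids, doc)
    return max(s, key=s.get)

def refine(centroids, docs, labels, margin=0.0, rate=0.1, epochs=10):
    """Assumed drag/push refinement; modifies and returns `centroids`.

    margin=0.0 updates only on outright training errors.
    margin>0.0 also updates when the true class wins by less than `margin`.
    Requires at least two classes.
    """
    for _ in range(epochs):
        for doc, label in zip(docs, labels):
            s = scores(centroids, doc)
            rival = max((k for k in s if k != label), key=s.get)
            if s[label] - s[rival] < margin:
                for i, x in enumerate(unit(doc)):
                    centroids[label][i] += rate * x   # drag the true centroid closer
                    centroids[rival][i] -= rate * x   # push the rival centroid away
    return centroids
```

Calling `refine(train_centroids(docs, labels), docs, labels)` gives the error-driven variant, while passing a positive `margin` gives the large-margin variant. Both are learning-rate-based sketches, so `rate`, `margin`, and `epochs` would need tuning on held-out data.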