AdaBoost.MH is a popular supervised learning algorithm for building multi-label (also known as n-of-m) text classifiers. It belongs to the family of “boosting” algorithms and works by iteratively building a committee of “decision stump” classifiers, where each classifier is trained to concentrate on the document-class pairs that the previously generated classifiers found harder to classify correctly. Each decision stump hinges on a specific “pivot term” and checks for its presence or absence in the test document to make its classification decision. In this paper we propose an improved version of AdaBoost.MH, called MP-Boost, obtained by selecting, at each iteration of the boosting process, not one but several pivot terms, one for each category. The rationale is that this gives each category highly individualized treatment, since each iteration generates the best possible decision stump for each category. We present the results of experiments showing that MP-Boost is much more effective than AdaBoost.MH. The improvement in effectiveness is spectacular when few boosting iterations are performed, and still high when many iterations are performed. It is especially significant for macroaveraged effectiveness, which shows that MP-Boost is particularly good at handling hard, infrequent categories.
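The difference between choosing one pivot term per iteration and one pivot term per category can be illustrated with a minimal sketch. The abstract does not state the selection criterion, so the code below assumes the normalization-factor criterion from Schapire and Singer's confidence-rated boosting: for term t and category c, Z[t][c] = 2 · Σ_x sqrt(W₊ · W₋), where x ranges over term absent/present and W₊, W₋ are the total weights of positive and negative document-class pairs in that block. The function names (`stump_scores`, `single_pivot`, `multi_pivot`) and the toy data are hypothetical, and the weight update and stump outputs are omitted.

```python
from math import sqrt

def stump_scores(X, Y, D):
    """Return Z[t][c], the assumed per-category cost of a stump pivoting on term t.

    X[i][t] is 1 if term t occurs in document i, else 0.
    Y[i][c] is +1 if document i belongs to category c, else -1.
    D[i][c] is the current boosting weight of the pair (document i, category c).
    """
    n_docs, n_terms, n_cats = len(X), len(X[0]), len(Y[0])
    Z = [[0.0] * n_cats for _ in range(n_terms)]
    for t in range(n_terms):
        for c in range(n_cats):
            # w[x][0] / w[x][1]: weight of negative / positive pairs
            # among documents where the term has value x (0 = absent, 1 = present).
            w = [[0.0, 0.0], [0.0, 0.0]]
            for i in range(n_docs):
                w[X[i][t]][1 if Y[i][c] > 0 else 0] += D[i][c]
            Z[t][c] = 2.0 * sum(sqrt(w[x][0] * w[x][1]) for x in (0, 1))
    return Z

def single_pivot(Z):
    """AdaBoost.MH-style choice: one pivot term shared by all categories."""
    return min(range(len(Z)), key=lambda t: sum(Z[t]))

def multi_pivot(Z):
    """MP-Boost-style choice: the best pivot term for each category separately."""
    n_cats = len(Z[0])
    return [min(range(len(Z)), key=lambda t: Z[t][c]) for c in range(n_cats)]

# Toy data (hypothetical): 4 documents, 3 terms, 2 categories.
X = [[1, 1, 1],
     [1, 0, 1],
     [0, 1, 0],
     [0, 0, 1]]
Y = [[+1, +1],
     [+1, -1],
     [-1, +1],
     [-1, -1]]
D = [[1 / 8, 1 / 8] for _ in X]  # uniform initial weights over document-class pairs

Z = stump_scores(X, Y, D)
shared = single_pivot(Z)
per_category = multi_pivot(Z)
shared_cost = sum(Z[shared])
per_category_cost = sum(Z[t][c] for c, t in enumerate(per_category))
```

On this toy data, term 0 separates category 0 perfectly and term 1 separates category 1 perfectly, so the per-category choice reaches a total cost of zero, while any single shared pivot leaves one category poorly served. Under the stated assumption, the per-category cost can never exceed the shared-pivot cost, because each category's minimum is taken independently.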