Inductive learning algorithms and representations for text categorization
Proceedings of the seventh international conference on Information and knowledge management
Foundations of statistical natural language processing
Foundations of statistical natural language processing
Text Categorization with Support Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
On the Learnability and Design of Output Codes for Multiclass Problems
COLT '00 Proceedings of the Thirteenth Annual Conference on Computational Learning Theory
Authorship Attribution with Support Vector Machines
Applied Intelligence
A New Multi-Class SVM Based on a Uniform Convergence Result
IJCNN '00 Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN'00), Volume 4
Solving multiclass learning problems via error-correcting output codes
Journal of Artificial Intelligence Research
Industrial Conference on Data Mining: Advances in Data Mining, Applications in E-Commerce, Medicine, and Knowledge Management
An automatic diagnosis system based on thyroid gland: ADSTG
Expert Systems with Applications: An International Journal
Expert Systems with Applications: An International Journal
A New Expert System for Diagnosis of Lung Cancer: GDA--LS_SVM
Journal of Medical Systems
We extend a multi-class categorization scheme proposed by Dietterich and Bakiri (1995) for binary classifiers, using error-correcting codes. The extension comprises the computation of the codes by a simulated annealing algorithm and the optimization of Kullback-Leibler (KL) category distances within the code words. For the first time, we apply the scheme to text categorization with support vector machines (SVMs) on several large text corpora with more than 100 categories. The results are compared to 1-of-N coding (i.e., one SVM for each text category). We also investigate codes with optimized KL distance between the text categories that are merged in the code words. We find that error-correcting codes perform better than 1-of-N coding as the code length increases. For very long codes, the performance is in some cases further improved by KL-distance optimization.
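The coding scheme summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the annealing objective (maximizing the minimum Hamming distance between code words), the linear cooling schedule, and the names `anneal_code`, `min_row_distance`, and `decode` are all assumptions made for illustration, and the KL-distance optimization is omitted. Each column of the code matrix would define one binary problem, for example one SVM separating the categories with bit 1 from those with bit 0; a document is then assigned to the category whose code word is nearest to the predicted bit vector.

```python
import math
import random


def hamming(a, b):
    """Number of positions in which two bit sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def min_row_distance(code):
    """Smallest Hamming distance between any two code words (rows)."""
    return min(hamming(code[i], code[j])
               for i in range(len(code))
               for j in range(i + 1, len(code)))


def anneal_code(n_classes, n_bits, steps=5000, t0=1.0, seed=0):
    """Search for a code matrix by simulated annealing (assumed objective:
    maximize the minimum row distance; assumed linear cooling)."""
    rng = random.Random(seed)
    code = [[rng.randint(0, 1) for _ in range(n_bits)]
            for _ in range(n_classes)]
    score = min_row_distance(code)
    best, best_score = [row[:] for row in code], score
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-9
        i, j = rng.randrange(n_classes), rng.randrange(n_bits)
        code[i][j] ^= 1  # propose flipping a single bit
        new_score = min_row_distance(code)
        # Accept improvements always, worse moves with Boltzmann probability.
        if new_score >= score or rng.random() < math.exp((new_score - score) / temp):
            score = new_score
            if score > best_score:
                best, best_score = [row[:] for row in code], score
        else:
            code[i][j] ^= 1  # reject the move and restore the bit
    return best


def decode(code, bits):
    """Index of the class whose code word is closest to the predicted bits."""
    return min(range(len(code)), key=lambda k: hamming(code[k], bits))
```

In this sketch, 1-of-N coding corresponds to an identity code matrix, whose minimum row distance is 2, so a single wrong binary decision can already cause a misclassification. A code with minimum distance d tolerates up to (d - 1) // 2 wrong binary decisions, which is one way to read the reported gains for longer codes.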