We present specializing, a method for combining classifiers for multi-class classification. Specializing trains one specialist classifier per class and utilizes each specialist to distinguish that class from all others in a one-versus-all manner. It then supplements the specialist classifiers with a catch-all classifier that performs multi-class classification across all classes. We refer to the resulting combined classifier as a specializing classifier. We develop specializing to classify 16 diseases based on discharge summaries. For each discharge summary, we aim to predict whether each disease is present, absent, or questionable in the patient, or unmentioned in the discharge summary. We treat the classification of each disease as an independent multi-class classification task. For each disease, we develop one specialist classifier for each of the present, absent, questionable, and unmentioned classes; we supplement these specialist classifiers with a catch-all classifier that encompasses all of the classes for that disease. We evaluate specializing on each of the 16 diseases and show that it improves significantly over voting and stacking when used for multi-class classification on our data.
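The structure described above (one one-versus-all specialist per class, plus a multi-class catch-all) can be illustrated with a minimal sketch. The abstract does not say how the specialists and the catch-all are combined, so the rule used here is an assumption made only for illustration: if exactly one specialist claims the document, its class is returned; otherwise the catch-all decides. The keyword-based classifiers and the names `specializing_predict`, `make_toy_specialists`, and `make_toy_catch_all` are hypothetical stand-ins, not the trained models or the method of the paper.

```python
from typing import Callable, Dict

# The four classes predicted for each disease, as listed in the abstract.
CLASSES = ["present", "absent", "questionable", "unmentioned"]

Specialist = Callable[[str], bool]  # one-vs-all: True if the text belongs to its class
CatchAll = Callable[[str], str]     # multi-class: returns one of CLASSES


def specializing_predict(text: str,
                         specialists: Dict[str, Specialist],
                         catch_all: CatchAll) -> str:
    """Combine specialists with a catch-all (ASSUMED rule, not from the paper).

    If exactly one specialist claims the text, return that class.
    If none or several claim it, defer to the catch-all classifier.
    """
    claims = [label for label, is_member in specialists.items() if is_member(text)]
    if len(claims) == 1:
        return claims[0]
    return catch_all(text)


def make_toy_specialists(disease: str) -> Dict[str, Specialist]:
    """Hypothetical keyword rules standing in for trained specialists."""
    d = disease.lower()
    return {
        "present": lambda t: f"history of {d}" in t.lower(),
        "absent": lambda t: f"no {d}" in t.lower() or f"denies {d}" in t.lower(),
        "questionable": lambda t: f"possible {d}" in t.lower(),
        "unmentioned": lambda t: d not in t.lower(),
    }


def make_toy_catch_all(disease: str) -> CatchAll:
    """Hypothetical multi-class fallback: any mention counts as 'present'."""
    d = disease.lower()

    def predict(t: str) -> str:
        return "present" if d in t.lower() else "unmentioned"

    return predict


if __name__ == "__main__":
    specialists = make_toy_specialists("asthma")
    catch_all = make_toy_catch_all("asthma")
    for summary in [
        "Patient has a history of asthma.",
        "Patient denies asthma.",
        "Admitted for chest pain.",
        "Asthma, well controlled on inhaler.",
    ]:
        print(summary, "->", specializing_predict(summary, specialists, catch_all))
```

Because each disease is treated as an independent task, a full system under this sketch would build one such set of specialists and one catch-all per disease, 16 sets in total.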