Original Contribution: Stacked generalization
Neural Networks
Machine Learning
Combining labeled and unlabeled data with co-training
COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
An adaptive version of the boost by majority algorithm
COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
Foundations of statistical natural language processing
Foundations of statistical natural language processing
Hierarchical classification of Web content
SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Proceedings of the tenth international conference on Information and knowledge management
Methods and metrics for cold-start recommendations
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
A Tutorial on Support Vector Machines for Pattern Recognition
Data Mining and Knowledge Discovery
Text Categorization with Suport Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Estimating the Generalization Performance of an SVM Efficiently
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Heterogeneous Learner for Web Page Classification
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
An extensible meta-learning approach for scalable and accurate inductive learning
An extensible meta-learning approach for scalable and accurate inductive learning
Pattern Classification (2nd Edition)
Pattern Classification (2nd Edition)
Mining concept-drifting data streams using ensemble classifiers
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
The database research group at the Max-Planck Institute for Informatics
ACM SIGMOD Record
Meta methods for model sharing in personal information systems
ACM Transactions on Information Systems (TOIS)
Using restrictive classification and meta classification for junk elimination
ECIR'05 Proceedings of the 27th European conference on Advances in Information Retrieval Research
Automatic document organization in a p2p environment
ECIR'06 Proceedings of the 28th European conference on Advances in Information Retrieval
Hi-index | 0.00 |
Automatic text classification methods come with various calibration parameters such as thresholds for probabilities in Bayesian classifiers or for hyperplane distances in SVM classifiers. In a given application context these parameters should be set so as to meet the relative importance of various result quality metrics such as precision versus recall. In this paper we consider classifiers that can accept a document for a topic, reject it, or abstain. We aim to meet the application's goals in terms of accuracy (i.e., avoid false acceptances or rejections) and loss (i.e., limit the fraction of documents for which no decision is made). To this end we investigate restrictive forms of Support Vector Machine classifiers and we develop meta methods that split the training data into subsets for independently trained classifiers and then combine the results of these classifiers. These techniques tend to improve accuracy at the expense of document loss. We develop estimators that help to predict the accuracy and loss for a given setting of the methods' tuning parameters, and a methodology for efficiently deriving a setting that meets the application's goals. Our experiments confirm the practical viability of the approach.