The automatic construction of large-scale corpora for summarization research
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Support vector machine active learning with applications to text classification
The Journal of Machine Learning Research
A family of additive online algorithms for category ranking
The Journal of Machine Learning Research
Active learning of label ranking functions
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Margin-sparsity trade-off for the set covering machine
ECML'05 Proceedings of the 16th European conference on Machine Learning
Generalization error bounds using unlabeled data
COLT'05 Proceedings of the 18th annual conference on Learning Theory
Optimizing estimated loss reduction for active sampling in rank learning
Proceedings of the 25th international conference on Machine learning
Active Sampling for Rank Learning via Optimizing the Area under the ROC Curve
ECIR '09 Proceedings of the 31th European Conference on IR Research on Advances in Information Retrieval
Relevant knowledge helps in choosing right teacher: active query selection for ranking adaptation
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
We propose a novel active learning strategy, based on the compression framework of [9], for label ranking functions which, given an input instance, predict a total order over a predefined set of alternatives. Our approach is theoretically motivated by an extension to ranking and active learning of Kääriäinen's generalization bounds using unlabeled data [7], which were initially developed in the context of classification. The bounds we obtain suggest a selective sampling strategy, provided that the initial labeled dataset is sufficiently, yet reasonably, large. Experiments on Information Retrieval corpora from automatic text summarization and question answering show that the proposed approach substantially reduces the labeling effort compared with random and heuristic-based sampling strategies.