Learning with support vector machines for query-by-multiple-examples

Authors:
Dell Zhang;Wee Sun Lee
Affiliations:
Birkbeck, University of London, London, United Kngdm;National University of Singapore, Singapore, Singapore
Venue:
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2008

Citing 5
Cited 0

Combining labeled and unlabeled data with co-training

COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
A support vector method for multivariate performance measures

ICML '05 Proceedings of the 22nd international conference on Machine learning
Training linear SVMs in linear time

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data (Data-Centric Systems and Applications)

Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data (Data-Centric Systems and Applications)
Introduction to Information Retrieval

Introduction to Information Retrieval

Quantified Score

Hi-index	0.01

Visualization

Abstract

We explore an alternative Information Retrieval paradigm called Query-By-Multiple-Examples (QBME) where the information need is described not by a set of terms but by a set of documents. Intuitive ideas for QBME include using the centroid of these documents or the well-known Rocchio algorithm to construct the query vector. We consider this problem from the perspective of text classification, and find that a better query vector can be obtained through learning with Support Vector Machines (SVMs). For online queries, we show how SVMs can be learned from one-class examples in linear time. For offline queries, we show how SVMs can be learned from positive and unlabeled examples together in linear or polynomial time. The effectiveness and efficiency of the proposed approaches have been confirmed by our experiments on four real-world datasets.