Supervised Semantic Indexing

Authors:
Bing Bai;Jason Weston;Ronan Collobert;David Grangier
Affiliations:
NEC Labs America, Princeton, USA 08540;NEC Labs America, Princeton, USA 08540;NEC Labs America, Princeton, USA 08540;NEC Labs America, Princeton, USA 08540
Venue:
ECIR '09 Proceedings of the 31st European Conference on IR Research on Advances in Information Retrieval
Year:
2009

Abstract

We present a class of models that are discriminatively trained to directly map from the word content in a query-document or document-document pair to a ranking score. Like Latent Semantic Indexing (LSI), our models take account of correlations between words (synonymy, polysemy). However, unlike LSI, our models are trained with a supervised signal directly on the task of interest, which we argue is the reason for our superior results. We provide an empirical study on Wikipedia documents, using the links to define document-document or query-document pairs, where we obtain state-of-the-art performance using our method.
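
The abstract does not spell out the form of the scoring function or the training objective. As a rough illustration only, the sketch below assumes a low-rank bilinear score over bag-of-words vectors, q^T (U^T V + I) d, trained with a margin ranking loss on (query, relevant document, irrelevant document) triples. The names `score` and `sgd_step`, the learning rate, the margin, and the toy vocabulary are all hypothetical choices made for this example, not details taken from the text above.

```python
import numpy as np

def score(q, d, U, V):
    """Assumed low-rank bilinear score: q^T (U^T V + I) d.

    U and V (shape k x vocab) capture cross-word correlations;
    the identity term keeps plain word-overlap matching.
    """
    return float((U @ q) @ (V @ d) + q @ d)

def sgd_step(q, d_pos, d_neg, U, V, lr=0.1, margin=1.0):
    """One stochastic update of a margin ranking loss (assumed objective).

    Updates U and V in place and returns the hinge loss before the update.
    """
    loss = margin - score(q, d_pos, U, V) + score(q, d_neg, U, V)
    if loss > 0:
        diff = d_pos - d_neg
        # Gradients of score(q, d_pos) - score(q, d_neg) w.r.t. U and V,
        # both computed from the parameters before this update.
        grad_U = np.outer(V @ diff, q)
        grad_V = np.outer(U @ q, diff)
        U += lr * grad_U
        V += lr * grad_V
    return max(loss, 0.0)

# Toy usage: the query uses word 0, the relevant document uses word 1
# (no overlap with the query), the irrelevant document uses word 2.
rng = np.random.default_rng(0)
vocab, k = 6, 2
U = 0.1 * rng.standard_normal((k, vocab))
V = 0.1 * rng.standard_normal((k, vocab))
q, d_pos, d_neg = np.eye(vocab)[0], np.eye(vocab)[1], np.eye(vocab)[2]
for _ in range(500):
    sgd_step(q, d_pos, d_neg, U, V)
print(score(q, d_pos, U, V), score(q, d_neg, U, V))
```

In this toy setting the supervised signal is the only source of the association between word 0 and word 1, which is the contrast with unsupervised LSI that the abstract draws.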