Learning to rank with (a lot of) word features

Authors:
Bing Bai;Jason Weston;David Grangier;Ronan Collobert;Kunihiko Sadamasa;Yanjun Qi;Olivier Chapelle;Kilian Weinberger
Affiliations:
NEC Labs America, Princeton, USA;NEC Labs America, Princeton, USA;NEC Labs America, Princeton, USA;NEC Labs America, Princeton, USA;NEC Labs America, Princeton, USA;NEC Labs America, Princeton, USA;Yahoo! Research, Santa Clara, USA;Yahoo! Research, Santa Clara, USA
Venue:
Information Retrieval
Year:
2010

Abstract

In this article we present Supervised Semantic Indexing (SSI), which defines a class of nonlinear (quadratic) models that are discriminatively trained to map directly from the word content of a query-document or document-document pair to a ranking score. Like Latent Semantic Indexing (LSI), our models take account of correlations between words (synonymy, polysemy). However, unlike LSI, our models are trained from a supervised signal directly on the ranking task of interest, which we argue is the reason for our superior results. As the query and target texts are modeled separately, our approach is easily generalized to different retrieval tasks, such as cross-language retrieval or online advertising placement. Dealing with models on all pairs of word features is computationally challenging. We propose several improvements to our basic model for addressing this issue, including low-rank (but diagonal-preserving) representations, correlated feature hashing and sparsification. We provide an empirical study of all these methods on retrieval tasks based on Wikipedia documents as well as an Internet advertisement task. We obtain state-of-the-art performance while providing realistically scalable methods.
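The abstract describes a quadratic model over pairs of word features and a low-rank, diagonal-preserving variant, but gives no formulas. The sketch below illustrates one way such a model could be written. It assumes a score of the form f(q, d) = qᵀ(UᵀV + I)d over bag-of-words vectors, trained with a pairwise margin ranking loss. The parameterization, the loss, and the names `ssi_score`, `margin_rank_step`, `U`, `V`, `lr`, and `margin` are illustrative assumptions, not details taken from this page.

```python
import numpy as np


def ssi_score(q, d, U, V):
    """Assumed score f(q, d) = q^T (U^T V + I) d.

    U and V have shape (k, D) with k << D, so the D x D word-pair
    matrix is never formed. The "+ I" term keeps the diagonal, i.e.
    the plain word-overlap score q . d.
    """
    return float((U @ q) @ (V @ d) + q @ d)


def margin_rank_step(q, d_pos, d_neg, U, V, lr=0.1, margin=1.0):
    """One SGD step on the hinge loss max(0, margin - f(q, d+) + f(q, d-)).

    Updates U and V in place and returns the loss before the update
    (0.0 when the margin is already satisfied, in which case nothing changes).
    """
    loss = margin - ssi_score(q, d_pos, U, V) + ssi_score(q, d_neg, U, V)
    if loss <= 0:
        return 0.0
    diff = d_pos - d_neg
    # Compute both projections before touching U or V.
    Uq = U @ q
    Vdiff = V @ diff
    # Gradients of f(q, d+) - f(q, d-) with respect to U and V.
    U += lr * np.outer(Vdiff, q)
    V += lr * np.outer(Uq, diff)
    return loss
```

Under these assumptions, scoring costs O(kD) for dense vectors rather than O(D²), which is the kind of saving a low-rank representation is meant to provide. A training loop would call `margin_rank_step` repeatedly on sampled (query, relevant document, irrelevant document) triples.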