In this article we propose Supervised Semantic Indexing (SSI), an algorithm that is trained on (query, document) pairs of text documents to predict the quality of their match. Like Latent Semantic Indexing (LSI), our models take account of correlations between words (synonymy, polysemy). Unlike LSI, however, our models are trained with a supervised signal directly on the ranking task of interest, which we argue is the reason for our superior results. Because the query and target texts are modeled separately, our approach generalizes easily to different retrieval tasks, such as online advertising placement. Modeling all pairs of word features is computationally challenging. We propose several improvements to our basic model to address this issue, including low-rank (but diagonal-preserving) representations and correlated feature hashing (CFH). We provide an empirical study of all these methods on retrieval tasks based on Wikipedia documents as well as an Internet advertisement task. We obtain state-of-the-art performance while providing realistically scalable methods.
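To make the idea of a low-rank but diagonal-preserving model over all word pairs more concrete, the following is a minimal sketch and not the authors' implementation. It assumes bag-of-words vectors q and d, a bilinear score f(q, d) = q^T W d, and the parameterization W = U^T V + I. The function names, the toy dimensions, and the pairwise hinge loss with margin 1 are illustrative assumptions that the abstract does not specify.

```python
import numpy as np

def score_full(q, d, W):
    """Bilinear score f(q, d) = q^T W d with an explicit vocab x vocab matrix."""
    return float(q @ W @ d)

def score_low_rank(q, d, U, V):
    """Same score with W = U^T V + I, computed without materializing W.

    q^T (U^T V + I) d = (U q) . (V d) + q . d
    The identity term keeps exact word matches (the diagonal of W).
    """
    return float((U @ q) @ (V @ d) + q @ d)

def pairwise_hinge_loss(f_pos, f_neg, margin=1.0):
    """Assumed ranking loss: zero once the relevant document
    outscores the irrelevant one by at least `margin`."""
    return max(0.0, margin - f_pos + f_neg)

# Toy setup (hypothetical sizes): 50-word vocabulary, rank-5 factors.
rng = np.random.default_rng(0)
vocab_size, rank = 50, 5
U = rng.normal(scale=0.1, size=(rank, vocab_size))
V = rng.normal(scale=0.1, size=(rank, vocab_size))
q = rng.random(vocab_size)       # query term weights
d_pos = rng.random(vocab_size)   # a document assumed relevant
d_neg = rng.random(vocab_size)   # a document assumed irrelevant

# The factored score agrees with the explicit full-matrix score.
W = U.T @ V + np.eye(vocab_size)
full = score_full(q, d_pos, W)
low = score_low_rank(q, d_pos, U, V)

loss = pairwise_hinge_loss(low, score_low_rank(q, d_neg, U, V))
```

Under these assumptions the factored form stores 2 × rank × vocab_size parameters instead of vocab_size², which is the kind of saving that makes a model over all word pairs tractable for realistic vocabularies.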