Robust Word Similarity Estimation Using Perturbation Kernels

Authors:
Kevyn Collins-Thompson
Affiliations:
Microsoft Research, Redmond 98052
Venue:
ICTIR '09 Proceedings of the 2nd International Conference on Theory of Information Retrieval: Advances in Information Retrieval Theory
Year:
2009

Citing 7
Cited 1

Exploiting generative models in discriminative classifiers

Proceedings of the 1998 conference on Advances in neural information processing systems II
The Canonical Distortion Measure for Vector Quantization and Function Approximation

ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
The Leave-One-Out Kernel

ICANN '02 Proceedings of the International Conference on Artificial Neural Networks
Probability Product Kernels

The Journal of Machine Learning Research
A generative theory of relevance

A generative theory of relevance
A web-based kernel function for measuring the similarity of short text snippets

Proceedings of the 15th international conference on World Wide Web
A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data

The Journal of Machine Learning Research

Measuring the variability in effectiveness of a retrieval system

IRFC'10 Proceedings of the First International Information Retrieval Facility Conference on Advances in Multidisciplinary Retrieval

Abstract

We introduce perturbation kernels, a new class of similarity measures for information retrieval that casts word similarity in terms of multi-task learning. Perturbation kernels model uncertainty in the user's query by choosing a small number of variations in the relative weights of the query terms to build a more complete picture of the query context, which is then used to compute a form of expected distance between words. Our approach has a principled mathematical foundation and a simple analytical form, and it makes few assumptions about the underlying retrieval model, making it easy to apply in a broad family of existing query expansion and model estimation algorithms.
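
The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's actual formulation. It assumes that the query variations are leave-one-out reweightings of distinct query terms, that each variation has already been turned into a word-probability model by some retrieval step not shown here, and that the "expected distance" is a plain mean squared difference taken uniformly over variations. The function names `leave_one_out_variants` and `expected_squared_distance` are hypothetical.

```python
from typing import Dict, List


def leave_one_out_variants(query_terms: List[str]) -> List[Dict[str, float]]:
    """Build one reweighted query per term (assumed perturbation scheme).

    Each variant sets one term's weight to zero and spreads the weight
    uniformly over the remaining terms. Terms are assumed to be distinct.
    """
    variants = []
    for left_out in query_terms:
        kept = [t for t in query_terms if t != left_out]
        if not kept:
            continue  # a single-term query has no leave-one-out variant
        weight = 1.0 / len(kept)
        variants.append(
            {t: (0.0 if t == left_out else weight) for t in query_terms}
        )
    return variants


def expected_squared_distance(
    word_a: str, word_b: str, variant_models: List[Dict[str, float]]
) -> float:
    """Mean squared difference of two words' probabilities across variants.

    `variant_models[i]` maps words to their probability under the model
    estimated from the i-th query variant (assumed to be given).
    Words missing from a model are treated as having probability 0.
    """
    if not variant_models:
        raise ValueError("at least one variant model is required")
    squared_diffs = [
        (m.get(word_a, 0.0) - m.get(word_b, 0.0)) ** 2 for m in variant_models
    ]
    return sum(squared_diffs) / len(squared_diffs)
```

Under these assumptions, two words whose probabilities move together across every query variation get a distance near zero, while words that respond differently to the reweighting end up far apart.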