Consistent phrase relevance measures

  • Authors:
  • Wen-tau Yih; Christopher Meek

  • Affiliations:
  • Microsoft Research, One Microsoft Way, Redmond, WA; Microsoft Research, One Microsoft Way, Redmond, WA

  • Venue:
  • Proceedings of the 2nd International Workshop on Data Mining and Audience Intelligence for Advertising
  • Year:
  • 2008

Abstract

Measuring the relevance between a document and a phrase is fundamental to many information retrieval and matching tasks, including on-line advertising. In this paper, we explore two approaches for measuring the relevance between a document and a phrase, aiming to provide consistent relevance scores for both in-document and out-of-document phrases. The first approach is a similarity-based method that represents both the document and the phrase as term vectors to derive a real-valued relevance score. The second approach takes as input the relevance estimates of some in-document phrases and uses Gaussian Process Regression to predict the score of a target out-of-document phrase. While both approaches work well, the best result is given by the Gaussian Process Regression model, which is significantly better than the similarity-based approach and 10% better than a baseline similarity method that uses bag-of-words vectors.
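
To make the bag-of-words baseline mentioned in the abstract concrete, here is a minimal sketch of scoring a phrase against a document by the cosine similarity of their term-count vectors. This is an illustration only, not the authors' implementation: the whitespace tokenizer, the raw-count weighting, the function names (term_vector, cosine, relevance) and the sample strings are all assumptions made for this example.

```python
import math
from collections import Counter


def term_vector(text):
    """Lowercased bag-of-words counts (assumed whitespace tokenizer)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector has no defined direction
    return dot / (norm_u * norm_v)


def relevance(document, phrase):
    """Hypothetical helper: real-valued relevance of a phrase to a document."""
    return cosine(term_vector(document), term_vector(phrase))


# Made-up sample data for illustration.
doc = "cheap flights to seattle and hotel deals in seattle"
in_doc_score = relevance(doc, "seattle hotel")     # both terms occur in doc
out_doc_score = relevance(doc, "airline tickets")  # no term occurs in doc
print(in_doc_score, out_doc_score)
```

With plain term overlap, a phrase that shares no words with the document scores exactly 0.0 however related it is in meaning, which shows why out-of-document phrases need a richer vector representation or a separate prediction step such as the regression approach described above.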