Is the unigram relevance model term independent?: classifying term dependencies in query expansion

  • Authors:
  • Mike Symonds; Peter Bruza; Guido Zuccon; Laurianne Sitbon; Ian Turner

  • Affiliations:
  • Queensland University of Technology, Brisbane, Australia; Queensland University of Technology, Brisbane, Australia; Australian e-Health Research Centre, CSIRO, Brisbane, Australia; Queensland University of Technology, Brisbane, Australia; Queensland University of Technology, Brisbane, Australia

  • Venue:
  • Proceedings of the Seventeenth Australasian Document Computing Symposium
  • Year:
  • 2012

Abstract

This paper develops a framework for classifying term dependencies in query expansion with respect to the role terms play in structural linguistic associations. The framework is used to classify and compare the query expansion terms produced by the unigram and positional relevance models. Because the unigram relevance model does not explicitly model term dependencies in its estimation process, it is often thought to ignore the dependencies that exist between words in natural language. The framework presented in this paper is underpinned by two types of linguistic association: syntagmatic and paradigmatic. Syntagmatic associations were found to be the more prevalent form of linguistic association used in query expansion. Paradoxically, the unigram model exhibited this association more than the positional relevance model did. This surprising finding has two potential implications for information retrieval models: (1) if linguistic associations underpin query expansion, then a probabilistic term dependence assumption based on position is inadequate for capturing them; (2) the unigram relevance model captures more term dependency information than its underlying theoretical model suggests, so its normative position as a baseline that ignores term dependencies should perhaps be reviewed.