Assessing the impact of thesaurus-based expansion techniques in QA-centric IR

  • Authors:
  • Luís Sarmento;Jorge Teixeira;Eugénio Oliveira

  • Affiliations:
  • Faculdade de Engenharia da Universidade do Porto, Laboratorio de Inteligência Artificial e Ciências de Computadores, Porto, Portugal;Faculdade de Engenharia da Universidade do Porto, Laboratorio de Inteligência Artificial e Ciências de Computadores, Porto, Portugal;Faculdade de Engenharia da Universidade do Porto, Laboratorio de Inteligência Artificial e Ciências de Computadores, Porto, Portugal

  • Venue:
  • CLEF'08 Proceedings of the 9th Cross-language evaluation forum conference on Evaluating systems for multilingual and multimodal information access
  • Year:
  • 2008

Abstract

We study the impact of using thesaurus-based query expansion methods at the Information Retrieval (IR) stage of a Question Answering (QA) system. We focus on expanding queries for questions regarding actions and events, where verbs play a central role. Two different thesauri are used: the OpenOffice thesaurus and an automatically generated verb thesaurus. The performance of thesaurus-based methods is compared against that obtained by (i) performing no expansion and (ii) applying a simple query generalization method. Results show that thesaurus-based approaches help improve recall at retrieval while keeping satisfactory precision. However, we confirm that the positive impact on final QA performance is mostly due to the increase in recall, which can also be obtained by using simpler methods. Nevertheless, because of its better relative precision, thesaurus-based expansion is effective in selectively reducing the number of irrelevant text passages retrieved, thus reducing the computational load in the answer extraction stage.
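
To make the three retrieval conditions concrete, the following is a minimal sketch, not the authors' implementation. The toy thesaurus entries, the function names (`expand_query`, `generalize_query`, `matches`) and the passages are all hypothetical. In particular, the abstract does not say how the simple generalization method works; it is assumed here to mean dropping the verb constraint from the query.

```python
from typing import Dict, List, Set

# Toy verb thesaurus: hypothetical entries, not taken from the OpenOffice
# thesaurus or from the automatically generated thesaurus used in the paper.
TOY_THESAURUS: Dict[str, List[str]] = {
    "invent": ["create", "devise"],
}


def no_expansion(terms: List[str]) -> List[List[str]]:
    """Baseline: every query term must appear literally."""
    return [[t] for t in terms]


def expand_query(terms: List[str], verbs: Set[str],
                 thesaurus: Dict[str, List[str]]) -> List[List[str]]:
    """Thesaurus-based expansion: each verb may be replaced by a synonym."""
    groups = []
    for term in terms:
        alternatives = [term]
        if term in verbs:
            for synonym in thesaurus.get(term, []):
                if synonym not in alternatives:
                    alternatives.append(synonym)
        groups.append(alternatives)
    return groups


def generalize_query(terms: List[str], verbs: Set[str]) -> List[List[str]]:
    """Assumed generalization: drop the verbs, keep the remaining terms."""
    return [[t] for t in terms if t not in verbs]


def matches(passage: str, groups: List[List[str]]) -> bool:
    """A passage matches if it contains at least one alternative per group."""
    tokens = set(passage.lower().split())
    return all(any(alt in tokens for alt in group) for group in groups)


terms = ["invent", "telephone"]
verbs = {"invent"}
passages = [
    "bell did invent the telephone",         # literal match
    "bell managed to devise the telephone",  # synonym of the verb
    "the telephone rang twice",              # irrelevant passage
]

for name, query in [
    ("no expansion", no_expansion(terms)),
    ("thesaurus", expand_query(terms, verbs, TOY_THESAURUS)),
    ("generalization", generalize_query(terms, verbs)),
]:
    retrieved = [p for p in passages if matches(p, query)]
    print(name, len(retrieved))
```

On this toy data, generalization retrieves every passage, including the irrelevant one, while thesaurus expansion retrieves only the two passages that express the action. This mirrors the trade-off described in the abstract: both approaches raise recall over the baseline, but expansion passes fewer irrelevant passages on to answer extraction.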