Improving the efficiency of the Bayesian network retrieval model by reducing relationships between terms

  • Authors:
  • Luis M. de Campos;Juan M. Fernández-Luna;Juan F. Huete

  • Affiliations:
  • Departamento de Ciencias de la Computación e Inteligencia Artificial, E.T.S.I. Informática, Universidad de Granada, 18071, Granada, Spain (all three authors)

  • Venue:
  • International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems - Intelligent information systems
  • Year:
  • 2003

Abstract

The Bayesian Network Retrieval Model represents the main (in)dependence relationships between the terms of a document collection by means of a specific type of Bayesian network, namely a polytree. Although the learning and propagation algorithms designed for this topology are very efficient, both tasks can still be very time-consuming in collections with a very large number of terms. This paper shows that by reducing the size of the polytree so that it comprises only a subset of terms, selected according to their retrieval quality, the performance of the model is maintained while the effort needed to learn the model and later propagate in it is considerably reduced. A method for selecting the best terms, based on their inverse document frequency and term discrimination value, is also presented.
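
The abstract does not say how the two term-quality measures are computed or combined, so the following is only an illustrative sketch and not the authors' method. It assumes the usual inverse document frequency, log(N / df), and the classical vector-space notion of term discrimination value, taken here as the change in average document-to-centroid cosine similarity when a term is removed. The function names (`idf_scores`, `discrimination_values`, `select_terms`) and the combination rule (keep terms with positive discrimination value, rank them by idf) are hypothetical choices made for the example.

```python
import math
from collections import Counter


def idf_scores(docs):
    """Inverse document frequency log(N / df) for every term in docs."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return {t: math.log(n / df_t) for t, df_t in df.items()}


def _cosine(u, v):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def _density(vectors, skip=None):
    """Average similarity of each document to the centroid, ignoring `skip`."""
    vecs = [{t: w for t, w in v.items() if t != skip} for v in vectors]
    centroid = Counter()
    for v in vecs:
        centroid.update(v)  # sums weights; cosine is scale-invariant
    return sum(_cosine(v, centroid) for v in vecs) / len(vecs)


def discrimination_values(docs):
    """Density without the term minus density with it.

    A positive value means that removing the term makes documents look
    more alike, i.e. the term helps tell documents apart.
    """
    vectors = [Counter(d) for d in docs]
    base = _density(vectors)
    vocab = {t for d in docs for t in d}
    return {t: _density(vectors, skip=t) - base for t in vocab}


def select_terms(docs, k):
    """Assumed rule: keep positive discriminators, rank by idf, take k."""
    idf = idf_scores(docs)
    dv = discrimination_values(docs)
    candidates = [t for t in idf if dv[t] > 0]
    candidates.sort(key=lambda t: (-idf[t], -dv[t], t))
    return candidates[:k]


# Toy collection: each document is a list of already-tokenised terms.
docs = [
    ["bayes", "network", "the"],
    ["retrieval", "model", "the"],
    ["polytree", "network", "the"],
]
selected = select_terms(docs, k=2)
```

In this toy collection the term "the" occurs in every document, so its idf is zero and its discrimination value is negative, and it is excluded from the reduced vocabulary. Only the selected terms would then become nodes of the smaller polytree.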