Document relevance assessment via term distribution analysis using Fourier series expansion

  • Authors:
  • Patricio Galeas;Ralph Kretschmer;Bernd Freisleben

  • Affiliations:
  • University of Marburg, Marburg, Germany;Kretschmer Software, Siegen, Germany;University of Marburg, Marburg, Germany

  • Venue:
  • Proceedings of the 9th ACM/IEEE-CS joint conference on Digital libraries
  • Year:
  • 2009

Abstract

In addition to the frequency of terms in a document collection, the distribution of terms plays an important role in determining the relevance of documents for a given search query. In this paper, term distribution analysis using Fourier series expansion is introduced as a novel approach for calculating an abstract representation of term positions in a document corpus. Based on this approach, two methods for improving the evaluation of document relevance are proposed: (a) a function-based ranking optimization representing a user-defined document region, and (b) a query expansion technique based on overlapping the term distributions in the top-ranked documents. Experimental results demonstrate the effectiveness of the proposed approach in providing new possibilities for optimizing the retrieval process.
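
The abstract does not give the formulas, so the following is only a minimal sketch of how term positions might be summarized by a truncated Fourier series. It assumes that each occurrence of a term is treated as a unit impulse, that positions are mapped linearly onto [0, 2π), and that the series is cut off at a small order. The function names `fourier_coefficients` and `evaluate`, and the parameter `order`, are hypothetical and not taken from the paper.

```python
import math

def fourier_coefficients(positions, doc_length, order=4):
    """Summarize term positions as truncated Fourier coefficients.

    Assumption (not from the paper): each occurrence is a unit impulse,
    and word position p maps to x = 2*pi*p / doc_length.
    Returns (a0, a, b), where a0 is the constant (mean) term.
    """
    xs = [2 * math.pi * p / doc_length for p in positions]
    a0 = len(xs) / (2 * math.pi)
    a = [sum(math.cos(k * x) for x in xs) / math.pi for k in range(1, order + 1)]
    b = [sum(math.sin(k * x) for x in xs) / math.pi for k in range(1, order + 1)]
    return a0, a, b

def evaluate(coeffs, x):
    """Evaluate the truncated series at x in [0, 2*pi)."""
    a0, a, b = coeffs
    return a0 + sum(
        a[k - 1] * math.cos(k * x) + b[k - 1] * math.sin(k * x)
        for k in range(1, len(a) + 1)
    )

# A term that occurs only near the start of a 100-word document.
coeffs = fourier_coefficients([0, 1, 2], doc_length=100)
near_start = evaluate(coeffs, 0.0)
mid_document = evaluate(coeffs, math.pi)
```

Under these assumptions, the smoothed distribution is larger near the start of the document than in the middle, which is the kind of positional signal a region-based ranking function or a distribution-overlap measure could draw on.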