The BNB distribution for text modeling

  • Authors:
  • Stéphane Clinchant;Eric Gaussier

  • Affiliations:
  • Xerox Research Centre Europe, Meylan, France;University Joseph Fourier, LIG, Grenoble cedex 9, France

  • Venue:
  • ECIR'08 Proceedings of the IR research, 30th European conference on Advances in information retrieval
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

We first review in this paper the burstiness and aftereffect of future sampling phenomena, and propose a formal, operational criterion to characterize distributions according to these phenomena. We then introduce the Beta negative binomial distribution for text modeling, and show its relations to several models (in particular to the Laplace law of succession and to the tf-itf model used in the Divergence from Randomness framework of [2]). We finally illustrate the behavior of this distribution on text categorization and information retrieval experiments.