A quadratic lower bound for Rocchio's similarity-based relevance feedback algorithm

  • Authors:
  • Zhixiang Chen; Bin Fu

  • Affiliations:
  • Department of Computer Science, University of Texas-Pan American, Edinburg, TX; Department of Computer Science, University of New Orleans, New Orleans, LA

  • Venue:
  • COCOON'05 Proceedings of the 11th annual international conference on Computing and Combinatorics
  • Year:
  • 2005

Abstract

It is shown in [4] that Rocchio's similarity-based relevance feedback algorithm makes Ω(n) mistakes in searching for a collection of documents represented by a monotone disjunction of at most k relevant features (or terms) over the n-dimensional binary vector space {0, 1}^n. In practice, Rocchio's algorithm often uses a fixed query updating factor and a fixed classification threshold. For this case, this paper strengthens the result in [4] and proves that Rocchio's algorithm makes Ω(k(n − k)) mistakes in searching for the same collection of documents over the binary vector space {0, 1}^n. A quadratic lower bound is obtained when k is proportional to n. An O(k(n − k)^2) upper bound is also obtained.
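
The setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' construction, and every detail below is an assumption rather than something taken from the paper: similarity is taken to be the inner product, the initial query is the zero vector, the query is updated by adding or subtracting `alpha` times the misclassified document, and a document is judged relevant when its similarity reaches `theta`. The function name `rocchio_mistakes` and its parameters are hypothetical. The sketch counts mistakes for one fixed presentation order of the documents, whereas the Ω(k(n − k)) bound concerns worst-case sequences, so the counts it reports are not expected to match the bound.

```python
from itertools import product


def rocchio_mistakes(n, relevant_features, alpha=1.0, theta=1.0, max_passes=1000):
    """Count mistakes of a fixed-factor, fixed-threshold Rocchio-style learner.

    Assumed model (not taken from the paper): inner-product similarity,
    zero initial query, update q <- q +/- alpha * d on each mistake.
    The target concept is the monotone disjunction of `relevant_features`.
    """
    target = set(relevant_features)
    docs = list(product((0, 1), repeat=n))  # all of {0, 1}^n, lexicographic order
    q = [0.0] * n
    mistakes = 0
    for _ in range(max_passes):
        changed = False
        for d in docs:
            # A document is truly relevant if it contains any relevant feature.
            label = any(d[i] for i in target)
            similarity = sum(qi * di for qi, di in zip(q, d))
            predicted = similarity >= theta
            if predicted != label:
                mistakes += 1
                sign = 1.0 if label else -1.0
                # Promote the query on a missed relevant document,
                # demote it on a falsely accepted irrelevant one.
                q = [qi + sign * alpha * di for qi, di in zip(q, d)]
                changed = True
        if not changed:
            # A full pass without mistakes: q classifies every document correctly.
            return mistakes, q
    raise RuntimeError("no consistent query found within max_passes")


if __name__ == "__main__":
    print(rocchio_mistakes(3, [1, 2]))
```

Sweeping over all documents repeatedly is only a convenient way to produce a sequence of examples; an adversarial ordering, which is what a lower-bound argument relies on, can force many more mistakes than this fixed order does.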