On the complexity of Rocchio's similarity-based relevance feedback algorithm

Authors:
Zhixiang Chen; Bin Fu
Affiliations:
Department of Computer Science, University of Texas-Pan American, Edinburg, TX; Department of Computer Science, University of New Orleans, New Orleans, LA
Venue:
ISAAC'05 Proceedings of the 16th international conference on Algorithms and Computation
Year:
2005

Abstract

In this paper, we prove for the first time that the learning complexity of Rocchio's algorithm is $O(d + d^2(\log d + \log n))$ over the discretized vector space $\{0,\ldots,n-1\}^d$ when the inner product similarity measure is used. The upper bound on the learning complexity for searching for documents represented by a monotone linear classifier $(q,0)$ over $\{0,\ldots,n-1\}^d$ can be improved to $O(d + 2k(n-1)(\log d + \log(n-1)))$, where $k$ is the number of nonzero components in $q$. An $\Omega(\binom{d}{2}\log n)$ lower bound on the learning complexity is also obtained for Rocchio's algorithm over $\{0,\ldots,n-1\}^d$. In practice, Rocchio's algorithm often uses fixed query updating factors. When this is the case, the lower bound is strengthened to $2^{\Omega(d)}$ over the binary vector space $\{0,1\}^d$. In general, if the query updating factors are bounded by $O(n^c)$ for some constant $c \ge 0$, an $\Omega(n^{d-1-c}/(n-1))$ lower bound is obtained over $\{0,\ldots,n-1\}^d$.
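As a rough illustration of the setting, the following is a minimal sketch of a Rocchio-style feedback loop with inner product similarity over a small discretized space. It is not the paper's exact formulation. The zero initial query, the threshold of 0, the fixed updating factors `alpha` and `beta`, the rule of correcting the first misclassified document, and all function names are assumptions made for this example. The number of query updates it counts is meant only to suggest the kind of quantity that the bounds above concern.

```python
from itertools import product


def inner(u, v):
    """Inner product similarity between two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def rocchio_feedback(docs, is_relevant, threshold=0, alpha=1, beta=1,
                     max_rounds=10_000):
    """Simplified Rocchio-style loop (assumed formulation, see lead-in).

    A document x is predicted relevant when inner(query, x) > threshold.
    On each round the first misclassified document is used as feedback:
    a missed relevant document is added to the query (scaled by alpha),
    a wrongly retrieved irrelevant one is subtracted (scaled by beta).
    Returns the final query and the number of updates performed.
    """
    query = [0] * len(docs[0])  # assumed starting point: the zero query
    updates = 0
    for _ in range(max_rounds):
        mistake = next(
            (x for x in docs
             if (inner(query, x) > threshold) != is_relevant(x)),
            None,
        )
        if mistake is None:  # query now agrees with the target on all docs
            return query, updates
        factor = alpha if is_relevant(mistake) else -beta
        query = [q + factor * xi for q, xi in zip(query, mistake)]
        updates += 1
    raise RuntimeError("no consistent query found within max_rounds")


# Hypothetical target: the monotone linear classifier ((1, 1, 0), 0)
# over {0, 1, 2}^3, i.e. a document is relevant when x[0] + x[1] > 0.
docs = list(product(range(3), repeat=3))
query, updates = rocchio_feedback(docs, lambda x: x[0] + x[1] > 0)
print(query, updates)
```

Here the target has k = 2 nonzero components, and this particular run needs only a handful of updates. The worst-case bounds in the abstract are taken over all targets and all choices of feedback documents, which this sketch does not explore.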