Rocchio's similarity-based relevance feedback algorithm, one of the most important query reformation methods in information retrieval, is essentially an adaptive learning algorithm from examples in searching for documents represented by a linear classifier. Despite its popularity in various applications, there is little rigorous analysis of its learning complexity in the literature. In this article, the authors prove for the first time that the learning complexity of Rocchio's algorithm is $O(d + d^2(\log d + \log n))$ over the discretized vector space $\{0, \ldots, n - 1\}^d$ when the inner product similarity measure is used. The upper bound on the learning complexity for searching for documents represented by a monotone linear classifier $(\vec{q}, 0)$ over $\{0, \ldots, n - 1\}^d$ can be improved to, at most, $1 + 2k(n - 1)(\log d - \log(n - 1))$, where $k$ is the number of nonzero components in $\vec{q}$. Several lower bounds on the learning complexity are also obtained for Rocchio's algorithm. For example, the authors prove that Rocchio's algorithm has a lower bound of $\Omega\left(\binom{d}{2} \log n\right)$ on its learning complexity over the Boolean vector space $\{0, 1\}^d$. © 2007 Wiley Periodicals, Inc.
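To make the setting concrete, the following is a minimal sketch of similarity-based relevance feedback viewed as learning a linear classifier from counterexamples over $\{0, \ldots, n - 1\}^d$ with the inner product as the similarity measure. It is an illustration under stated assumptions, not the exact procedure analyzed in the article: the update coefficients are fixed to 1, the threshold is learned as a bias term, and the names `rocchio_feedback`, `target_q`, and `target_theta` are hypothetical.

```python
from itertools import product


def inner(u, v):
    """Inner product similarity between two vectors."""
    return sum(a * b for a, b in zip(u, v))


def rocchio_feedback(target_q, target_theta, d, n, max_rounds=10_000):
    """Learn a linear classifier over {0,...,n-1}^d from counterexamples.

    The hidden target marks x as relevant iff inner(target_q, x) >= target_theta.
    Returns (query, threshold, number_of_feedback_examples).
    """
    docs = list(product(range(n), repeat=d))
    relevant = {x: inner(target_q, x) >= target_theta for x in docs}

    q = [0] * d   # current query vector
    theta = 0     # current threshold (learned as a bias term: an assumption)
    examples = 0  # number of feedback examples consumed so far
    for _ in range(max_rounds):
        # Ask for any document that the current query misclassifies.
        counterexample = next(
            (x for x in docs if (inner(q, x) >= theta) != relevant[x]), None
        )
        if counterexample is None:
            return q, theta, examples
        # Move the query toward relevant documents, away from non-relevant ones
        # (unit coefficients: an assumption made for this sketch).
        sign = 1 if relevant[counterexample] else -1
        q = [qi + sign * xi for qi, xi in zip(q, counterexample)]
        theta -= sign
        examples += 1
    raise RuntimeError("no consistent query found within max_rounds")


if __name__ == "__main__":
    q, theta, examples = rocchio_feedback((1, 0, 2), 2, d=3, n=3)
    print(q, theta, examples)
```

In this simplified setting, the returned count of feedback examples plays the role of the learning complexity that the bounds above describe. The bounds themselves are asymptotic, so the count from a single small run is not expected to match them numerically.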