A quadratic lower bound for Rocchio's similarity-based relevance feedback algorithm
COCOON'05 Proceedings of the 11th annual international conference on Computing and Combinatorics
In this paper, we prove for the first time that the learning complexity of Rocchio's algorithm is $O(d + d^2(\log d + \log n))$ over the discretized vector space $\{0,\ldots,n-1\}^d$ when the inner product similarity measure is used. The upper bound on the learning complexity for searching for documents represented by a monotone linear classifier $(q,0)$ over $\{0,\ldots,n-1\}^d$ can be improved to $O(d + 2k(n-1)(\log d + \log(n-1)))$, where $k$ is the number of nonzero components in $q$. An $\Omega(\binom{d}{2}\log n)$ lower bound on the learning complexity is also obtained for Rocchio's algorithm over $\{0,\ldots,n-1\}^d$. In practice, Rocchio's algorithm often uses fixed query updating factors. When this is the case, the lower bound is strengthened to $2^{\Omega(d)}$ over the binary vector space $\{0,1\}^d$. In general, if the query updating factors are bounded by $O(n^c)$ for some constant $c \geq 0$, an $\Omega(n^{d-1-c}/(n-1))$ lower bound is obtained over $\{0,\ldots,n-1\}^d$.
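To make the setting concrete, the following is a minimal sketch of similarity-based relevance feedback with a fixed query updating factor and the inner product similarity measure. It is an illustration under stated assumptions, not the formal model analyzed in the paper. The abstract does not specify the update rule, the initial query, or how counterexamples are chosen. This sketch assumes the query starts at the zero vector, a document is judged relevant when its inner product with the query is strictly positive, and the first misclassified document in enumeration order is returned as feedback. The names `rocchio_learn`, `inner`, `alpha`, and `max_updates` are hypothetical.

```python
from itertools import product


def inner(a, b):
    """Inner product similarity between two integer vectors."""
    return sum(x * y for x, y in zip(a, b))


def rocchio_learn(target_q, n, alpha=1, max_updates=10_000):
    """Learn a classifier (q, 0) over {0,...,n-1}^d from counterexamples.

    Assumptions (not taken from the paper): the query starts at zero,
    a document x is relevant iff inner(q, x) > 0, and alpha is a fixed
    query updating factor.
    """
    d = len(target_q)
    docs = list(product(range(n), repeat=d))
    # Ground-truth relevance judgments defined by the hidden target query.
    relevant = {x: inner(target_q, x) > 0 for x in docs}

    q = [0] * d
    updates = 0
    while updates < max_updates:
        # The first document on which the current query disagrees with the user.
        counterexample = next(
            (x for x in docs if (inner(q, x) > 0) != relevant[x]), None
        )
        if counterexample is None:
            return q, updates  # consistent with every document
        # Move the query toward relevant and away from irrelevant documents.
        sign = 1 if relevant[counterexample] else -1
        q = [qi + sign * alpha * xi for qi, xi in zip(q, counterexample)]
        updates += 1
    raise RuntimeError("no consistent query found within max_updates")


q, updates = rocchio_learn((1, 0, 1), n=2)
print(q, updates)  # [1, 0, 1] 2
```

The number of updates returned here plays the role of the learning complexity on one particular target and feedback order. The bounds in the abstract concern the worst case over targets and counterexample sequences, which this toy run does not measure. The `max_updates` cap is there because this sketch does not guarantee convergence.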