AdaBoost has proven to be a successful statistical learning method for concept detection, offering strong discrimination and generalization. However, training a concept detector with boosting is computationally expensive, especially on large-scale datasets. The bottleneck of the training phase is selecting the best learner from a massive pool of candidates: traditional approaches to weak classifier selection run in O(NT) time, with N examples and T learners. In this paper, we treat best-learner selection as a Nearest Neighbor Search problem in function space rather than feature space. With the help of the Locality Sensitive Hashing (LSH) algorithm, the search for the best learner can be sped up to O(NL) time, where L is the number of buckets in LSH; in our experiments, L (~600) is much smaller than T (~500,000). In addition, by studying the distribution of weak learners and candidate query points, we present an efficient method that partitions the weak-learner points and the feasible region of query points as uniformly as possible, achieving significant improvements in both recall and precision over the random projection used in the traditional LSH algorithm. Experimental results show that our method significantly reduces training time while remaining comparable in performance to state-of-the-art methods.
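To make the idea concrete, the following is a minimal sketch (not the paper's implementation) of recasting weak-learner selection as nearest-neighbor search in function space: each weak learner is represented by its ±1 prediction vector on the N examples, and the query is the weighted label vector under the current AdaBoost weights. The sketch uses standard random-hyperplane LSH — i.e., the traditional baseline the paper improves upon, not the paper's uniform-partitioning projections — and all sizes and variable names (`N`, `T`, `K`, `H`, `buckets`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (illustrative only; the paper reports T ~ 500,000 learners).
N, T, K = 64, 1000, 8  # examples, weak learners, LSH hash bits

# Each weak learner h_t is represented in function space by its
# +/-1 prediction vector on the N training examples.
H = rng.choice([-1.0, 1.0], size=(T, N))

# Traditional random-hyperplane LSH: K signed projections -> K-bit bucket key.
P = rng.standard_normal((K, N))

def bucket_key(v):
    """Hash a function-space vector to its K-bit LSH bucket key."""
    return tuple((P @ v > 0).astype(int))

# Index all weak learners into buckets once, before querying.
buckets = {}
for t in range(T):
    buckets.setdefault(bucket_key(H[t]), []).append(t)

# Query point: the weighted label vector w_i * y_i -- the direction in
# function space a good weak learner should align with this round.
y = rng.choice([-1.0, 1.0], size=N)
w = rng.dirichlet(np.ones(N))  # current AdaBoost example weights
q = w * y

# Probe only the query's bucket instead of scanning all T learners;
# fall back to a full scan if the bucket happens to be empty.
cand = buckets.get(bucket_key(q), list(range(T)))
best = max(cand, key=lambda t: H[t] @ q)  # maximize weighted correlation
print("candidates searched:", len(cand), "of", T)
```

Since only one bucket is probed per query, the per-round selection cost scales with the bucket population rather than with T, which is the source of the O(NT) to O(NL) speedup claimed above.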