IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Fast similarity search is a key technique in many large-scale learning and data mining applications. Recently, hashing-based methods, which create compact and efficient codes that preserve the data distribution, have received considerable attention due to their promising theoretical and empirical results. An ideal hashing method 1) extends naturally to out-of-sample data; 2) has very low computational complexity; and 3) improves significantly on linear search in the original space in terms of accuracy. However, most existing hashing methods fail to satisfy all three requirements. In this paper, we propose a new method, called Error-Correcting Output Hashing (ECOH), that meets all three. ECOH first groups the samples into clusters using a conventional clustering algorithm. Each cluster is assigned an Error-Correcting Output Code (ECOC), and the linear mappings from the sample vectors to the ECOC are then learned with linear regression models. In this way, ECOH learns both the binary code for each sample and the function that links an input vector to its output code. Experimental results on real-world data sets demonstrate the effectiveness and efficiency of the proposed approach.
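The pipeline the abstract describes — cluster the data, assign each cluster a codeword, and fit a linear map from inputs to codewords — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy data, the cluster count, the code length, and the use of random ±1 codewords (in place of a designed error-correcting code) and plain least squares (in place of the paper's regression models) are all assumptions made for brevity.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Toy data: three well-separated Gaussian blobs (a stand-in for a real dataset).
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((50, 2)) for c in centers])

k, code_len = 3, 8  # number of clusters and ECOC code length (illustrative choices)

# Step 1: group all samples into clusters with a conventional clustering algorithm.
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)

# Step 2: assign each cluster a +/-1 codeword. Random codes are a simple
# stand-in here; a real ECOC would be designed for large Hamming separation.
codes = rng.choice([-1.0, 1.0], size=(k, code_len))
Y = codes[labels]  # target codeword for every training sample

# Step 3: learn the linear mapping from sample vectors to codewords by
# least squares, with a bias column appended to the inputs.
Xb = np.hstack([X, np.ones((X.shape[0], 1))])
W, *_ = np.linalg.lstsq(Xb, Y, rcond=None)

def hash_fn(x):
    """Binary code for a (possibly unseen) sample: sign of the learned map."""
    xb = np.append(np.asarray(x, dtype=float), 1.0)
    return np.sign(xb @ W)

# Out-of-sample extension comes for free: hash a point that was never clustered.
b = hash_fn([5.1, 4.9])
assert b.shape == (code_len,)
assert set(np.unique(b)) <= {-1.0, 1.0}
```

Because the hash is just a sign-thresholded linear map, encoding a query costs one matrix-vector product, which is where the low computational complexity claimed above comes from.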