Data management of wireless sensor networks
CCNC'09 Proceedings of the 6th IEEE Conference on Consumer Communications and Networking Conference
Locality Sensitive Hashing (LSH) is a method for probabilistic dimension reduction of high-dimensional data and a popular technique for approximate nearest neighbor search. However, LSH needs a large amount of memory and a long processing time to achieve good performance when searching a massive dataset. In addition, it is not effective at locating similar data in a very high-dimensional dataset. This paper proposes a new LSH-based similarity search scheme, SMLSH, which combines a consistent hash function and min-wise independent permutations with LSH. SMLSH classifies information according to similarity while requiring less memory, and it can quickly locate similar data in a massive dataset. Experimental results show that SMLSH is more efficient than LSH in both time and space, and that it significantly improves the effectiveness of similarity search over LSH in a massive dataset.
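To make the min-wise hashing ingredient concrete, the following is a minimal Python sketch of a generic MinHash signature. It is not the SMLSH algorithm, whose details the abstract does not give. It assumes records are represented as sets of non-negative integers and approximates min-wise independent permutations with random linear hash functions modulo a prime. All names (`make_hash_params`, `minhash_signature`, `estimate_jaccard`) are hypothetical and chosen only for illustration.

```python
import random

# Assumption: all item identifiers are non-negative integers below PRIME.
PRIME = 2_147_483_647  # the Mersenne prime 2^31 - 1


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs for hash functions h(x) = (a * x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(items, params):
    """Return one minimum hash value per hash function for a non-empty set."""
    return [min((a * x + b) % PRIME for x in items) for a, b in params]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree; estimates Jaccard similarity."""
    matches = sum(1 for u, v in zip(sig_a, sig_b) if u == v)
    return matches / len(sig_a)


params = make_hash_params(num_hashes=128, seed=42)
doc_a = set(range(0, 100))
doc_b = set(range(50, 150))    # shares half of its items with doc_a
doc_c = set(range(200, 300))   # shares no items with doc_a

sig_a = minhash_signature(doc_a, params)
sig_b = minhash_signature(doc_b, params)
sig_c = minhash_signature(doc_c, params)

print("a vs b:", estimate_jaccard(sig_a, sig_b))
print("a vs c:", estimate_jaccard(sig_a, sig_c))
```

Each signature has a fixed length (128 values here) regardless of how large the original set is, which is the kind of space saving that min-wise hashing offers. An index would then typically group records whose signatures, or parts of them, collide, so that a query is compared only against a small candidate set.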