Composite hashing with multiple information sources

Authors:
Dan Zhang; Fei Wang; Luo Si
Affiliations:
Purdue University, West Lafayette, IN, USA; IBM T. J. Watson Research Lab, Hawthorne, NY, USA; Purdue University, West Lafayette, IN, USA
Venue:
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Year:
2011

Abstract

Similarity search applications that handle large amounts of text and image data demand efficient and effective solutions. One useful strategy is to represent the examples in a database as compact binary codes through semantic hashing, which has attracted much attention because of its fast query speed and drastically reduced storage requirements. All current semantic hashing methods handle only the case in which each example is represented by a single type of feature. In many real-world applications, however, examples are described by several different information sources. For example, the characteristics of a webpage can be derived from both its content and its associated links. To address the problem of learning good hashing codes in this setting, we propose a novel research problem: Composite Hashing with Multiple Information Sources (CHMIS). The focus of this problem is to design an algorithm that incorporates the features from different information sources into the binary hashing codes efficiently and effectively. In particular, we propose an algorithm, CHMIS-AW (CHMIS with Adjusted Weights), for learning the codes. The algorithm integrates information from several sources into the binary hashing codes by adjusting the weight on each individual source to maximize coding performance, and it enables fast conversion of query examples into their binary hashing codes. Experimental results on five different datasets demonstrate the superior performance of the proposed method over several other state-of-the-art semantic hashing techniques.
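
The general idea of folding several information sources into one binary code can be pictured with a small sketch. This is not the CHMIS-AW algorithm: the abstract does not give its formulation, so the code below assumes random linear projections in place of learned hash functions and fixed, hand-picked source weights in place of the adjusted weights the paper learns. All function names and values (`make_projection`, `composite_code`, `hamming`, the 0.7/0.3 weights) are hypothetical.

```python
import random


def make_projection(dim, n_bits, seed):
    """Random n_bits x dim matrix; a stand-in for a learned per-source hash function."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def composite_code(features, projections, weights):
    """Binary code from a weighted sum of per-source linear projections.

    features:    one feature vector per information source
    projections: one (n_bits x dim) matrix per source
    weights:     one non-negative weight per source (fixed here by assumption;
                 the paper adjusts them during learning)
    """
    n_bits = len(projections[0])
    code = []
    for b in range(n_bits):
        score = 0.0
        for x, proj, w in zip(features, projections, weights):
            score += w * sum(p * v for p, v in zip(proj[b], x))
        # Threshold the combined score at zero to obtain one bit.
        code.append(1 if score > 0 else 0)
    return code


def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return sum(x != y for x, y in zip(a, b))


# Two sources per webpage: content features (4-dim) and link features (3-dim).
projections = [make_projection(4, 8, seed=0), make_projection(3, 8, seed=1)]
weights = [0.7, 0.3]  # hypothetical source weights

page_a = [[1.0, 0.0, 2.0, 0.5], [1.0, 0.0, 1.0]]
page_b = [[1.0, 0.1, 2.0, 0.5], [0.0, 1.0, 1.0]]

code_a = composite_code(page_a, projections, weights)
code_b = composite_code(page_b, projections, weights)
print(code_a, code_b, hamming(code_a, code_b))
```

Converting a new query to its code costs one weighted projection per source, and comparing codes is a Hamming distance, which is where the speed and storage benefits mentioned in the abstract come from.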