Cluster-based language models for distributed retrieval. In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval.
A language modeling framework for resource selection and results merging. In Proceedings of the Eleventh International Conference on Information and Knowledge Management.
Comparing the performance of collection selection algorithms. ACM Transactions on Information Systems (TOIS).
The Journal of Machine Learning Research.
Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision.
Scene Classification Using a Hybrid Generative/Discriminative Approach. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Central-rank-based collection selection in uncooperative distributed information retrieval. In ECIR '07: Proceedings of the 29th European Conference on IR Research.
Document allocation policies for selective searching of distributed indexes. In CIKM '10: Proceedings of the 19th ACM International Conference on Information and Knowledge Management.
Geometric Latent Dirichlet Allocation on a Matching Graph for Large-scale Image Datasets. International Journal of Computer Vision.
To improve query throughput, distributed image retrieval is widely used for large-scale visual search. In text retrieval, state-of-the-art approaches partition a textual database into multiple collections offline and allocate each collection to a server node. For each incoming query, only a few relevant collections are searched, without seriously sacrificing retrieval accuracy, which enables server nodes to process multiple queries concurrently. Unlike text retrieval, distributed visual search poses challenges in optimally allocating images and selecting image collections, because the Bag of Words (BoW) representation lacks semantic meaning. In this paper, we propose a novel Semantics Related Distributed Visual Search (SRDVS) model. We employ Latent Dirichlet Allocation (LDA) [2] to discover latent concepts as an intermediate semantic representation over a large-scale image database. We aim to learn an optimal image allocation for each server node and to accurately perform collection selection for each query. Experimental results on a million-scale image database demonstrate encouraging performance compared with state-of-the-art approaches: on average only 6% of the collections are selected, yet retrieval performance remains comparable to exhaustive search over the whole database.
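The paper's exact allocation and selection objectives are not reproduced here; as a minimal sketch of the general idea, assume each image and query already has a precomputed LDA topic distribution, and that each server node is summarized by a topic centroid. Images can then be allocated to the node with the most similar centroid, and at query time only the top-ranked collections are searched. All function and variable names below are illustrative, not from the paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two topic distributions."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def allocate_images(image_topics, centroids):
    """Offline step: assign each image to the server node whose
    topic centroid is most similar to the image's LDA distribution."""
    collections = {i: [] for i in range(len(centroids))}
    for img_id, theta in image_topics.items():
        best = max(range(len(centroids)),
                   key=lambda i: cosine(theta, centroids[i]))
        collections[best].append(img_id)
    return collections

def select_collections(query_topics, centroids, top_k=1):
    """Online step: rank collections by topic similarity to the
    query and search only the top_k most relevant ones."""
    ranked = sorted(range(len(centroids)),
                    key=lambda i: cosine(query_topics, centroids[i]),
                    reverse=True)
    return ranked[:top_k]

# Toy example with 3 latent topics and 2 server nodes (made-up numbers).
centroids = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]
image_topics = {"img1": [0.7, 0.2, 0.1], "img2": [0.05, 0.9, 0.05]}
collections = allocate_images(image_topics, centroids)   # {0: ['img1'], 1: ['img2']}
selected = select_collections([0.75, 0.15, 0.1], centroids, top_k=1)  # [0]
```

In the paper the query is routed to roughly 6% of the collections on average; the `top_k` parameter here plays that role, trading retrieval accuracy against the number of nodes each query touches.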