RASIM: a rank-aware separate index method for answering top-k spatial keyword queries

  • Authors: Hyuk-Yoon Kwon; Kyu-Young Whang; Il-Yeol Song; Haixun Wang

  • Affiliations: Department of Computer Science, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea; Department of Computer Science, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea; College of Information Science and Technology, Drexel University, Philadelphia, USA; Microsoft Research Asia, Beijing, China

  • Venue: World Wide Web
  • Year: 2013

Abstract

A top-k spatial keyword query returns k objects having the highest (or lowest) scores with regard to spatial proximity as well as text relevancy. Approaches for answering top-k spatial keyword queries can be classified into two categories: the separate index approach and the hybrid index approach. The separate index approach maintains the spatial index and the text index independently and can accommodate new data types. However, it is difficult to support top-k pruning and merging efficiently at the same time since it requires two different orders for clustering the objects: the first based on scores for top-k pruning and the second based on object IDs for efficient merging. In this paper, we propose a new separate index method called Rank-Aware Separate Index Method (RASIM) for top-k spatial keyword queries. RASIM supports both top-k pruning and efficient merging at the same time by clustering each separate index in two different orders through the partitioning technique. Specifically, RASIM partitions the set of objects in each index into rank-aware (RA) groups that contain the objects with similar scores and applies the first order to these groups according to their scores and the second order to the objects within each group according to their object IDs. Based on the RA groups, we propose two query processing algorithms: (i) External Threshold Algorithm (External TA) that supports top-k pruning in the unit of RA groups and (ii) Generalized External TA that enhances the performance of External TA by exploiting special properties of the RA groups. RASIM is the first research work that supports top-k pruning based on the separate index approach. Naturally, it keeps the advantages of the separate index approach. In addition, in terms of storage and query processing time, RASIM is more efficient than the IR-tree method, which is the prevailing method to support top-k pruning to date and is based on the hybrid index approach. 
Experimental results show that, compared with the IR-tree method, the index size of RASIM is reduced by up to 1.85 times, and the query performance is improved by up to 3.22 times.
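
The two-level ordering described in the abstract can be illustrated with a small Python sketch. It is not the paper's External TA or Generalized External TA, whose details the abstract does not give. It only shows the general idea of threshold-style pruning at the granularity of rank-aware groups. Several things here are assumptions made for illustration: the names `build_ra_groups` and `top_k_by_groups`, fixed-size groups, scores normalized so that higher is better, a weighted-sum combination controlled by `alpha`, and every object appearing in both indexes. A plain dictionary lookup stands in for the ID-ordered merge that a real separate-index implementation would perform between groups.

```python
from heapq import nlargest


def build_ra_groups(scores, group_size):
    """Partition {object_id: score} into rank-aware groups (illustrative).

    First order: groups are arranged by descending score.
    Second order: objects inside each group are sorted by object ID.
    """
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    groups = []
    for i in range(0, len(ranked), group_size):
        chunk = ranked[i:i + group_size]
        groups.append({
            "min_score": chunk[-1][1],   # lowest score stored in this group
            "objects": sorted(chunk),    # (object_id, score) pairs, by ID
        })
    return groups


def top_k_by_groups(spatial, text, k, alpha=0.5, group_size=2):
    """Return (top-k list, number of group pairs read).

    Assumes both dicts hold the same object IDs and that the combined
    score is alpha * spatial + (1 - alpha) * text (an assumption, not
    taken from the paper).
    """
    s_groups = build_ra_groups(spatial, group_size)
    t_groups = build_ra_groups(text, group_size)
    combined = {}
    groups_read = 0

    def current_best():
        # Ties are broken in favor of the smaller object ID.
        return nlargest(k, combined.items(), key=lambda kv: (kv[1], -kv[0]))

    for s_grp, t_grp in zip(s_groups, t_groups):
        groups_read += 1
        for grp in (s_grp, t_grp):
            for obj_id, _ in grp["objects"]:
                if obj_id not in combined:
                    # Dict lookup stands in for an ID-ordered merge.
                    combined[obj_id] = (alpha * spatial[obj_id]
                                        + (1 - alpha) * text[obj_id])
        # No unseen object can score above this bound, because every
        # later group holds scores no higher than the current minimums.
        threshold = (alpha * s_grp["min_score"]
                     + (1 - alpha) * t_grp["min_score"])
        best = current_best()
        if len(best) >= k and best[-1][1] >= threshold:
            return best, groups_read
    return current_best(), groups_read


# Hypothetical example data: object ID -> normalized score.
spatial = {1: 0.9, 2: 0.8, 3: 0.3, 4: 0.2, 5: 0.1, 6: 0.05}
text = {1: 0.7, 2: 0.9, 3: 0.4, 4: 0.1, 5: 0.2, 6: 0.05}

result, groups_read = top_k_by_groups(spatial, text, k=2)
ids = [obj_id for obj_id, _ in result]
print(ids, groups_read)
```

With this example data, the top-2 answer is settled after reading only the first group from each index, since both objects found there already score at least as high as the bound on anything stored in later groups. Choosing the group size trades pruning precision (small groups) against the cost of handling many groups (large groups); the abstract does not say how RASIM makes that choice.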