Efficient distributed multi-dimensional index for big data management

  • Authors:
  • Xin Zhou;Xiao Zhang;Yanhao Wang;Rui Li;Shan Wang

  • Affiliations:
  • School of Information, Renmin University of China, Beijing, China;School of Information, Renmin University of China, Beijing, China;School of Information, Renmin University of China, Beijing, China;School of Information, Renmin University of China, Beijing, China;School of Information, Renmin University of China, Beijing, China

  • Venue:
  • WAIM'13 Proceedings of the 14th international conference on Web-Age Information Management
  • Year:
  • 2013

Abstract

With the advent of the big data era, applications increasingly depend on distributed multi-dimensional indexes, and the topic has attracted considerable research interest. Designing a distributed multi-dimensional index that is efficient, scalable and flexible nevertheless poses new challenges. We therefore present a new distributed multi-dimensional index method, EDMI. EDMI has two layers: the global layer employs a K-d tree to partition the entire space into many subspaces, and the local layer contains a group of Z-order prefix R-trees, each associated with one subspace. The Z-order prefix R-tree (ZPR-tree) is a new R-tree variant that leverages Z-order prefixes to avoid overlap among the MBRs (minimum bounding rectangles) of R-tree nodes for multi-dimensional point data. The ZPR-tree matches the construction speed of Packed R-trees and achieves better query performance than other Packed R-trees and the R*-tree. EDMI efficiently supports many kinds of multi-dimensional queries. We experimentally evaluated a prototype implementation of EDMI based on HBase. The results show that EDMI outperforms state-of-the-art HBase-based indexing techniques on point, range and KNN queries. Further experiments verify that the ZPR-tree achieves better overall performance than other R-tree variants. Overall, EDMI serves as an efficient, scalable and flexible distributed multi-dimensional index framework.
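
To illustrate why grouping points by a shared Z-order prefix can avoid MBR overlap, the following is a minimal Python sketch. It is not the ZPR-tree construction from the paper, which the abstract does not specify. The function names (`z_value`, `z_prefix`, `group_by_prefix`, `mbr`, `overlaps`), the restriction to 2-D integer coordinates, and the 4-bit coordinate width are all assumptions made for illustration. The idea shown is only that points sharing the leading bits of their Z-value (Morton code) fall in the same grid-aligned cell, so the bounding rectangles of different groups cannot intersect.

```python
def z_value(x, y, bits=4):
    """Interleave the bits of x and y into one Z-order (Morton) code.

    Bit i of x goes to position 2*i + 1 and bit i of y to position 2*i,
    so the result is 2*bits wide. (Illustrative helper, not from the paper.)
    """
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i + 1)
        z |= ((y >> i) & 1) << (2 * i)
    return z


def z_prefix(z, prefix_bits, bits=4):
    """Return the leading prefix_bits of a Z-value that is 2*bits wide."""
    return z >> (2 * bits - prefix_bits)


def group_by_prefix(points, prefix_bits, bits=4):
    """Sort points by Z-value and group those sharing the same Z-order prefix."""
    groups = {}
    for p in sorted(points, key=lambda p: z_value(p[0], p[1], bits)):
        key = z_prefix(z_value(p[0], p[1], bits), prefix_bits, bits)
        groups.setdefault(key, []).append(p)
    return groups


def mbr(points):
    """Minimum bounding rectangle as (min_x, min_y, max_x, max_y)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def overlaps(a, b):
    """True if two rectangles (min_x, min_y, max_x, max_y) intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


if __name__ == "__main__":
    pts = [(1, 1), (2, 3), (9, 2), (12, 6), (3, 12), (14, 13)]
    groups = group_by_prefix(pts, prefix_bits=2)
    for key, members in groups.items():
        print(format(key, "02b"), members, mbr(members))
```

With a 2-bit prefix, each group corresponds to one quadrant of the 16 x 16 grid, so the printed MBRs are pairwise disjoint; a longer prefix yields smaller cells in the same way.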