CCIndex: a complemental clustering index on distributed ordered tables for multi-dimensional range queries

  • Authors:
  • Yongqiang Zou;Jia Liu;Shicai Wang;Li Zha;Zhiwei Xu

  • Affiliations:
  • Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

  • Venue:
  • NPC'10 Proceedings of the 2010 IFIP international conference on Network and parallel computing
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Massive scale distributed database like Google's BigTable and Yahoo!' s PNUTS can be modeled as Distributed Ordered Table, or DOT, which partitions data regions and supports range queries on key. Multi-dimensional range queries on DOTs are fundamental requirements; however, none of existing schemes work well while considering three critical issues: high performance, low space overhead, and high reliability. This paper introduces CCIndex scheme, short for Complemental Clustering Index, to solve all three issues. CCIndex creates several Complemental Clustering Index Tables for performance, leverages region-to-server information to estimate result size, and supports incremental data recovery. This paper builds a prototype on Apache HBase. Theoretical analysis and micro-benchmarks show that CCIndex consumes 5.3% ∼ 29.3% more space, has the same reliability, and gains 11.4 times range queries throughput of secondary index scheme. Synthetic application benchmark shows that CCIndex query throughput is 1.9 ∼ 2.1 times of MySQL Cluster.