Storage and Querying of High Dimensional Sparsely Populated Data in Compressed Representation

  • Authors:
  • Abu Sayed Md. Latiful Hoque

  • Affiliations:
  • -

  • Venue:
  • EurAsia-ICT '02 Proceedings of the First EurAsian Conference on Information and Communication Technology
  • Year:
  • 2002

Abstract

Storing and querying high-dimensional, sparsely populated data poses new challenges for the conventional horizontal model. Such data requires support for a large number of columns and frequent alteration of the database schema, and its sparsity degrades performance in both time and space. A 3-ary vertical representation [5] can be used instead, but the cardinality of the vertical table grows exponentially as the density of non-null values increases. It is also difficult to support multiple data types using a single vertical table. In this paper, we present a compressed 1-ary vertical representation in which schema evolution is easy and size grows linearly with non-null density. Queries can be processed on the compressed form of the data without decompression; decompression is performed only when the result is needed. We consider three alternative representations: 3-ary uncompressed vertical, 1-ary compressed bit-array, and 1-ary compressed offset. Experimental results show the superiority of the 1-ary offset representation in both space and time.
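
The three representations named in the abstract can be illustrated with a small sketch. The abstract does not specify the paper's actual storage layout, so everything below is an assumption made for illustration: the function names, the use of Python lists, the reading of "offset" as a sorted list of non-null row positions, and the reading of "bit-array" as one presence bit per row. It shows how a per-column (1-ary) store might keep only non-null cells and answer a lookup or an equality selection without rebuilding the full rows.

```python
from bisect import bisect_left

def to_vertical_triples(rows, columns):
    """3-ary vertical form: one (row_id, column, value) triple per non-null cell."""
    return [(rid, col, row[col])
            for rid, row in enumerate(rows)
            for col in columns
            if row.get(col) is not None]

def to_offset_columns(rows, columns):
    """Assumed 1-ary offset form: per column, sorted non-null row offsets plus values."""
    store = {col: ([], []) for col in columns}
    for rid, row in enumerate(rows):
        for col in columns:
            value = row.get(col)
            if value is not None:
                offsets, values = store[col]
                offsets.append(rid)   # rows are visited in order, so offsets stay sorted
                values.append(value)
    return store

def to_bit_array(offsets, n_rows):
    """Assumed 1-ary bit-array form: one presence bit per row for a single column."""
    bits = [0] * n_rows
    for off in offsets:
        bits[off] = 1
    return bits

def lookup(store, col, rid):
    """Fetch one cell by binary search on the offsets; None means null."""
    offsets, values = store[col]
    i = bisect_left(offsets, rid)
    if i < len(offsets) and offsets[i] == rid:
        return values[i]
    return None

def select_equal(store, col, target):
    """Row ids where col == target, scanning only the stored non-null values."""
    offsets, values = store[col]
    return [off for off, v in zip(offsets, values) if v == target]

# Hypothetical sparse table: 4 rows, 3 columns, 5 non-null cells.
columns = ["a", "b", "c"]
rows = [{"a": 1, "c": "x"}, {"b": 2.5}, {}, {"a": 1, "b": 7.0}]

triples = to_vertical_triples(rows, columns)
store = to_offset_columns(rows, columns)
bits_a = to_bit_array(store["a"][0], len(rows))
```

In this sketch each column keeps its own value list, so columns of different types (integer, float, string) sit side by side, and adding a column only adds one more entry to the store. Whether the paper's representation works this way is not stated in the abstract.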