Efficiency of prefix and non-prefix codes in string matching over compressed databases on handheld devices

  • Authors: Abdelghani Bellaachia; Iehab AL Rassan

  • Affiliations: The George Washington University, Washington DC (both authors)

  • Venue: Proceedings of the 2005 ACM Symposium on Applied Computing
  • Year: 2005

Abstract

This paper evaluates the efficiency of prefix and non-prefix codes for searching over compressed databases on handheld devices. The compression techniques evaluated are Byte Pair Encoding (BPE), Tagged Suboptimal Code (TSC), and Huffman encoding. Compressing handheld databases and searching the compressed text directly, without expanding it, allows more data to be stored and more applications to be used on the device. Experimental results show that about 33% more space is gained in compressed handheld databases when using either Searching over Compressed Text using BPE (SCTB) or Searching over Compressed Text using TSC (SCTT). Moreover, both solutions are 6.6 times faster than decompressing the database and then performing a linear search, across all database sizes tested. The performance results indicate that SCTB is the recommended solution for rarely updated databases with large records, while SCTT is recommended for frequently updated databases or those with small records. TSC and BPE compression could also be used to speed up wireless connectivity, web clipping, or database transfer between handheld devices and computers, since these databases are usually small.
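
To make the idea of searching compressed text without expanding it more concrete, here is a minimal Python sketch. It is not the paper's TSC, BPE, or Huffman scheme; it assumes a generic word-based byte code in which the first byte of every codeword carries a tag bit (the high bit) and continuation bytes do not. The function names `build_codebook`, `compress`, and `find_word` are hypothetical and chosen only for illustration. Because this toy code is not a prefix code (one codeword may be the beginning of a longer one), the search has to confirm that a match ends on a codeword boundary.

```python
from collections import Counter


def build_codebook(words):
    """Map each distinct word to a byte codeword (illustrative scheme only).

    More frequent words get smaller ranks and therefore shorter codewords.
    The first byte of a codeword has its high bit set (the tag); all
    continuation bytes have the high bit clear.
    """
    ranked = [w for w, _ in Counter(words).most_common()]
    codebook = {}
    for rank, word in enumerate(ranked):
        digits = []
        n = rank
        while True:
            digits.append(n & 0x7F)  # 7 payload bits per byte
            n >>= 7
            if n == 0:
                break
        digits.reverse()
        digits[0] |= 0x80  # tag bit marks the start of a codeword
        codebook[word] = bytes(digits)
    return codebook


def compress(words, codebook):
    """Concatenate the codewords of a word sequence."""
    return b"".join(codebook[w] for w in words)


def find_word(compressed, codeword):
    """Return byte offsets where `codeword` occurs as a whole codeword.

    The search runs on the compressed bytes directly. A raw match always
    starts at a codeword start, because only start bytes carry the tag.
    It is accepted only if the next byte is also a start byte (or the
    data ends), which rules out matches that are prefixes of longer codewords.
    """
    hits = []
    start = compressed.find(codeword)
    while start != -1:
        end = start + len(codeword)
        if end == len(compressed) or compressed[end] & 0x80:
            hits.append(start)
        start = compressed.find(codeword, start + 1)
    return hits


if __name__ == "__main__":
    text = "the cat sat on the mat the end".split()
    book = build_codebook(text)
    data = compress(text, book)
    print(find_word(data, book["the"]))  # [0, 4, 6]
```

In this sketch the pattern is compressed once with the same codebook and then located with an ordinary byte search, so the database itself is never decompressed. A real scheme would also have to store the codebook and handle separators and updates, which are outside the scope of this illustration.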