Distributed suffix trees and their application to large-scale genomic analysis

  • Authors: Raphaël Clifford; Marek Sergot

  • Affiliations: Department of Computing, Imperial College, London (both authors)

  • Venue: ICCMSE '03 Proceedings of the international conference on Computational methods in sciences and engineering
  • Year: 2003

Abstract

We have recently presented a variant of the suffix tree which allows much larger genome sequence databases to be analysed efficiently. The new data structure, termed the distributed suffix tree (DST), is designed for distributed-memory parallel computing environments (e.g., Beowulf clusters). It tackles the memory bottleneck by constructing subtrees of the full suffix tree independently, so that no single node has to hold the whole tree. The standard suffix tree operations of biological importance translate readily to this new data structure. None of these operations on the DST requires inter-process communication, and many have optimal expected parallel running times.
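
As a rough illustration of building subtrees independently, the Python sketch below groups the suffixes of a text by their first k characters and builds one structure per group, with no group needing data from another. It is a minimal sketch under stated assumptions, not the authors' construction: the prefix-based partitioning rule, the parameter k, and the function names (partition_suffixes, build_subtree, find) are hypothetical, and a naive uncompressed trie stands in for a real suffix tree. The subtrees are built in a serial loop here; on a cluster each one could be assigned to a separate process.

```python
from collections import defaultdict


def partition_suffixes(text, k):
    """Group suffix start positions by the first k characters of each suffix.

    Suffixes shorter than k are keyed by their full (shorter) string.
    """
    buckets = defaultdict(list)
    for i in range(len(text)):
        buckets[text[i:i + k]].append(i)
    return dict(buckets)


def build_subtree(text, positions):
    """Build a naive trie over the suffixes starting at `positions`.

    Assumption: an uncompressed trie replaces a true suffix tree for brevity.
    Each node is a dict of children keyed by character; the key None holds
    the start positions of all suffixes passing through that node.
    """
    root = {}
    for p in positions:
        node = root
        for ch in text[p:]:
            node = node.setdefault(ch, {})
            node.setdefault(None, []).append(p)
    return root


def find(subtrees, pattern, k):
    """Return the sorted start positions of `pattern`.

    Each subtree is queried on its own; results are only concatenated.
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    hits = []
    for prefix, tree in subtrees.items():
        # Skip subtrees whose prefix cannot be extended to the pattern.
        if not prefix.startswith(pattern[:k]):
            continue
        node = tree
        for ch in pattern:
            node = node.get(ch)
            if node is None:
                break
        else:
            hits.extend(node[None])
    return sorted(hits)


if __name__ == "__main__":
    text = "GATTACAGATTACA$"
    k = 2
    # Each subtree depends only on the text and its own bucket of positions.
    subtrees = {
        prefix: build_subtree(text, positions)
        for prefix, positions in partition_suffixes(text, k).items()
    }
    print(find(subtrees, "TTA", k))  # prints [2, 9]
```

In this sketch a pattern of length at least k is answered by exactly one subtree, while a shorter pattern is answered by every subtree whose prefix begins with it. In both cases each subtree is searched separately and the per-subtree results are simply concatenated.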