Constructing large suffix trees on a computational grid

  • Authors:
  • Chunxi Chen;Bertil Schmidt

  • Affiliations:
  • School of Computer Engineering, Nanyang Technological University, Singapore;School of Computer Engineering, Nanyang Technological University, Singapore

  • Venue:
  • Journal of Parallel and Distributed Computing
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

The suffix tree is a key data structure for biological sequence analysis, since it permits efficient solutions to many string-based problems. Constructing large suffix trees is challenging because of high memory overheads and poor memory locality. Even though efficient suffix tree construction algorithms exist, their run-time is still very high for long DNA sequences such as whole human chromosomes. In this paper, we are using a hierarchical grid system as a computational platform in order to reduce this run-time significantly. To achieve an efficient mapping onto this type of architecture we introduce a parallel suffix tree construction algorithm that makes use of a new data structure called the common prefix suffix tree. Using this algorithm together with a dynamic load balancing strategy we show that our distributed grid implementation leads to significant run-time savings.