Accelerating the neighbor-joining algorithm using the adaptive bucket data structure

  • Authors:
  • Leonid Zaslavsky;Tatiana A. Tatusova

  • Affiliations:
  • National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD;National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD

  • Venue:
  • ISBRA'08 Proceedings of the 4th international conference on Bioinformatics research and applications
  • Year:
  • 2008

Abstract

The complexity of the neighbor-joining method is determined by the complexity of the search for an optimal pair ("neighbors to join") performed globally at each iteration. Accelerating the neighbor-joining method requires performing a smarter search for an optimal pair of neighbors, avoiding re-evaluation of all possible pairs of points at each iteration. We developed an acceleration technique for the neighbor-joining method that significantly decreases complexity for important applications without any change in the neighbor-joining method. This technique utilizes the bucket data structure. The pairs of nodes are arranged in buckets according to values of the goal function δ_ij = u_i + u_j - d_ij. Buckets are adaptively re-arranged after each neighbor-joining step. While the pairs of nodes in the top bucket are re-evaluated at every iteration, pairs in lower buckets are accessed more rarely, when the algorithm determines that the elements of the bucket need to be re-evaluated based on new values of δ_ij. As a result, only a small portion of candidate pairs of nodes is examined at each iteration. The algorithm is cache efficient, since the bucket data structures are able to exploit locality and adjust to cache properties.
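
The bucketing idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the bucket boundaries or the adaptive re-arrangement rule, so the sketch only distributes pairs into equal-width buckets once and searches the top bucket. It assumes the usual neighbor-joining definition u_i = (sum over k of d_ik) / (n - 2) and that the optimal pair maximizes δ_ij. The names `row_averages`, `build_buckets`, `best_pair`, and the parameter `num_buckets` are hypothetical.

```python
from itertools import combinations

def row_averages(d):
    """u_i = sum_k d[i][k] / (n - 2): assumed standard NJ correction term."""
    n = len(d)
    return [sum(row) / (n - 2) for row in d]

def delta(d, u, i, j):
    """Goal function from the abstract: delta_ij = u_i + u_j - d_ij."""
    return u[i] + u[j] - d[i][j]

def build_buckets(d, u, num_buckets=4):
    """Place every pair in an equal-width bucket by delta (bucket 0 is the top).

    Equal-width buckets are an assumption made for illustration only.
    """
    values = {p: delta(d, u, *p) for p in combinations(range(len(d)), 2)}
    hi, lo = max(values.values()), min(values.values())
    width = (hi - lo) / num_buckets or 1.0  # avoid zero width if all equal
    buckets = [[] for _ in range(num_buckets)]
    for pair, value in values.items():
        k = min(int((hi - value) / width), num_buckets - 1)
        buckets[k].append(pair)
    return buckets

def best_pair(d, u, buckets):
    """Scan only the first non-empty bucket instead of all pairs."""
    for bucket in buckets:
        if bucket:
            return max(bucket, key=lambda p: delta(d, u, *p))
    raise ValueError("no candidate pairs")

# Small five-taxon distance matrix used purely as example data.
d = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
u = row_averages(d)
buckets = build_buckets(d, u)
print(best_pair(d, u, buckets))  # (0, 1)
```

In this example only two of the ten candidate pairs land in the top bucket, so the search inspects a small fraction of the pairs. A full implementation would also have to update u_i and re-bucket affected pairs after each join, which this sketch leaves out.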