Succinct de Bruijn graphs

  • Authors:
  • Alexander Bowe;Taku Onodera;Kunihiko Sadakane;Tetsuo Shibuya

  • Affiliations:
  • National Institute of Informatics, Chiyoda-ku, Tokyo, Japan;Human Genome Center, Institute of Medical Science, University of Tokyo, Minato-ku, Tokyo, Japan;National Institute of Informatics, Chiyoda-ku, Tokyo, Japan;Human Genome Center, Institute of Medical Science, University of Tokyo, Minato-ku, Tokyo, Japan

  • Venue:
  • WABI'12 Proceedings of the 12th international conference on Algorithms in Bioinformatics
  • Year:
  • 2012

Abstract

We propose a new succinct de Bruijn graph representation. If the de Bruijn graph of k-mers in a DNA sequence of length N has m edges, it can be represented in $4m+o(m)$ bits, which is much smaller than existing representations. The numbers of outgoing and incoming edges of a node are computed in constant time, and the outgoing and incoming edges with a given label are found in constant time and $\mathcal{O}(k)$ time, respectively. The data structure is constructed in $\mathcal{O}(Nk \log m/\log\log m)$ time using no additional space.
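
To make the quantities in the abstract concrete, the following is a minimal sketch of a plain (non-succinct) de Bruijn graph built from the k-mers of a sequence. It is not the representation proposed in the paper. It only illustrates what m, the edge labels, and the degree queries refer to. The function names, and the convention that nodes are (k-1)-mers with one edge per distinct k-mer, are assumptions made for this illustration.

```python
from collections import defaultdict


def build_de_bruijn(seq, k):
    """Build a plain de Bruijn graph from the k-mers of seq.

    Assumed convention (not taken from the paper): nodes are (k-1)-mers,
    and each distinct k-mer is one edge from its prefix to its suffix.
    An outgoing edge is labelled by the k-mer's last character, an
    incoming edge by its first character.
    """
    out_edges = defaultdict(dict)  # node -> {label: successor node}
    in_edges = defaultdict(dict)   # node -> {label: predecessor node}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        u, v = kmer[:-1], kmer[1:]
        out_edges[u][kmer[-1]] = v
        in_edges[v][kmer[0]] = u
    return out_edges, in_edges


def edge_count(out_edges):
    """Number of edges m, i.e. the number of distinct k-mers."""
    return sum(len(succ) for succ in out_edges.values())


def out_degree(out_edges, node):
    return len(out_edges.get(node, {}))


def in_degree(in_edges, node):
    return len(in_edges.get(node, {}))


if __name__ == "__main__":
    out_e, in_e = build_de_bruijn("ACGTACGA", 3)
    m = edge_count(out_e)
    # Leading term of the space bound quoted in the abstract: 4m bits.
    print(m, 4 * m, out_degree(out_e, "CG"), in_degree(in_e, "AC"))
```

For the sequence ACGTACGA with k = 3 there are five distinct 3-mers, so m = 5 and the leading term of the bound is 20 bits. A hash-map version like this one uses far more space than that; the point of the succinct representation is to answer the same queries within roughly 4 bits per edge.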