Unified view of backward backtracking in short read mapping

  • Authors:
  • Veli Mäkinen;Niko Välimäki;Antti Laaksonen;Riku Katainen

  • Affiliations:
  • Department of Computer Science, University of Helsinki, Finland;Department of Computer Science, University of Helsinki, Finland;Department of Computer Science, University of Helsinki, Finland;Department of Computer Science, University of Helsinki, Finland

  • Venue:
  • Algorithms and Applications
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Mapping short DNA reads to the reference genome is the core task in the recent high-throughput technologies to study e.g. protein-DNA interactions (ChIP-seq) and alternative splicing (RNA-seq). Several tools for the task (bowtie, bwa, SOAP2, TopHat) have been developed that exploit Burrows-Wheeler transform and the backward backtracking technique on it, to map the reads to their best approximate occurrences in the genome. These tools use different tailored mechanisms for small error-levels to prune the search phase significantly. We propose a new pruning mechanism that can be seen a generalization of the tailored mechanisms used so far. It uses a novel idea of storing all cyclic rotations of fixed length substrings of the reference sequence with a compressed index that is able to exploit the repetitions created to level out the growth of the input set. For RNA-seq we propose a new method that combines dynamic programming with backtracking to map efficiently and correctly all reads that span two exons. Same mechanism can also be used for mapping mate-pair reads.