Next-generation sequencing technologies have redefined the way genome sequencing is performed. They can produce tens of millions of short sequences (reads) in a single experiment, at a much lower cost than previously possible. Because of the dramatic increase in the amount of data generated, mapping (aligning) a set of reads to a reference genome has become a challenging task. In this paper, we study a different version of this problem: mapping the reads to a dynamically changing genomic sequence. We propose a new practical algorithm that employs a data structure designed to account for potential dynamic effects (replacements, insertions, deletions) on the genomic sequence. The presented experimental results demonstrate that the proposed approach can be extended and applied to the problem of mapping short reads to multiple related genomes.
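To make the idea of mapping reads to a changing sequence concrete, the following is a minimal sketch and not the algorithm or data structure proposed in the paper, which the abstract does not describe. It assumes a plain hash index of k-mers and handles only single-base replacements; the class name `DynamicKmerIndex` and its methods are hypothetical. Insertions and deletions shift all downstream coordinates and would need a more elaborate structure.

```python
from collections import defaultdict


class DynamicKmerIndex:
    """Hypothetical k-mer hash index that stays valid under base replacements."""

    def __init__(self, genome, k):
        self.k = k
        self.genome = list(genome)
        self.index = defaultdict(set)  # k-mer string -> set of start positions
        for i in range(len(self.genome) - k + 1):
            self.index[self._kmer(i)].add(i)

    def _kmer(self, i):
        return "".join(self.genome[i:i + self.k])

    def _window(self, pos):
        # Start positions of every k-mer that overlaps position `pos`.
        lo = max(0, pos - self.k + 1)
        hi = min(pos, len(self.genome) - self.k)
        return range(lo, hi + 1)

    def replace(self, pos, base):
        # Only the (at most k) k-mers covering `pos` change, so only those
        # index entries are removed and re-added.
        for i in self._window(pos):
            self.index[self._kmer(i)].discard(i)
        self.genome[pos] = base
        for i in self._window(pos):
            self.index[self._kmer(i)].add(i)

    def map_read(self, read):
        # Exact matching only: look up the first k-mer of the read as a seed,
        # then verify the whole read at each candidate position.
        if len(read) < self.k:
            return []
        text = "".join(self.genome)
        hits = []
        for start in sorted(self.index.get(read[:self.k], ())):
            if text[start:start + len(read)] == read:
                hits.append(start)
        return hits


idx = DynamicKmerIndex("ACGTACGTAC", k=3)
before = idx.map_read("GTAC")  # occurrences in the original sequence
idx.replace(4, "G")            # the sequence becomes ACGTGCGTAC
after = idx.map_read("GTAC")   # occurrences after the replacement
```

The point of the sketch is that a replacement touches at most k index entries, so the index does not have to be rebuilt after each change. Real read mappers also tolerate mismatches and gaps, which this exact-match lookup does not.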