Fast and robust algorithms and aligners have been developed to help researchers analyze genomic data, whose size has increased dramatically in the last decade due to technological advances in DNA sequencing. Not only the size but also the characteristics of the data have changed; one current concern is that read lengths are increasing. Although existing algorithms can still be used to process this new data, its size and changing structure call for new and more efficient approaches. In this work, we address the problem of accurate sequence alignment on GPUs and propose a new tool, Masher, which processes long (and short) reads efficiently and accurately. The algorithm employs a novel indexing technique that produces an index for the 3,137 Mbp hg19 with a memory footprint small enough to be stored in a restricted-memory device such as a GPU. The results show that Masher is faster than state-of-the-art tools and achieves good accuracy and sensitivity on sequencing data with various characteristics.