Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies calls for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for aligning longer reads, where indels may occur frequently. The speed of MAQ is also a concern when alignment is scaled up to the resequencing of hundreds of individuals.

Results: We implemented the Burrows–Wheeler Alignment tool (BWA), a new read alignment package based on backward search with the Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is ~10–20× faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignments in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be performed with the open source SAMtools software package.

Availability: http://maq.sourceforge.net

Contact: rd@sanger.ac.uk
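To make "backward search with BWT" concrete, the following is a minimal sketch of exact-match backward search over a Burrows–Wheeler index. It is an illustration only and not BWA's implementation: the function names `build_index` and `backward_search` are hypothetical, the suffix array is built by naive sorting (impractical for a genome-sized reference), and mismatches and gaps, which BWA allows, are not handled here.

```python
def build_index(text):
    """Build a naive suffix array, C table and occurrence table for text.

    Assumption: '$' does not occur in text and sorts before every
    character that does (true for A, C, G, T in ASCII).
    """
    text = text + "$"
    # Naive suffix array: sort suffix start positions lexicographically.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to '$').
    bwt = "".join(text[i - 1] for i in sa)

    # C[c] = number of characters in text strictly smaller than c.
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    C, total = {}, 0
    for ch in sorted(counts):
        C[ch] = total
        total += counts[ch]

    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in occ:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return sa, C, occ


def backward_search(pattern, sa, C, occ):
    """Return sorted start positions of exact occurrences of pattern."""
    lo, hi = 0, len(sa)  # half-open suffix array interval [lo, hi)
    # Extend the match one character at a time, from the pattern's end.
    for ch in reversed(pattern):
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


if __name__ == "__main__":
    sa, C, occ = build_index("GATTACAGATTACA")
    print(backward_search("ATTA", sa, C, occ))  # [1, 8]
```

Each pattern character costs two table lookups regardless of the reference length, which is the property that makes this style of search attractive for large references. A practical index would sample the occurrence table and suffix array rather than store them in full, as this sketch does.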