WHAM: A High-Throughput Sequence Alignment Method

Authors:
Yinan Li; Jignesh M. Patel; Allison Terrell
Affiliations:
University of Wisconsin-Madison; University of Wisconsin-Madison; University of Wisconsin-Madison
Venue:
ACM Transactions on Database Systems (TODS)
Year:
2012

Abstract

Over the last decade, the cost of producing genomic sequences has dropped dramatically thanks to so-called next-generation sequencing methods. However, these methods depend critically on fast and sophisticated data processing for aligning a set of query sequences to a reference genome under rich string-matching models. This work focuses on the design, development, and evaluation of a data processing system for this crucial “short read alignment” problem. Our system, called WHAM, employs hash-based indexing methods and bitwise operations for sequence alignment. It supports rich match models and is significantly faster than existing state-of-the-art methods. In addition, its relative speedup over existing methods is poised to grow as read sequence lengths increase.
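
The abstract names two ingredients, hash-based indexing and bitwise operations, without showing how they fit together. The sketch below is one way they might combine, and it is not WHAM's actual implementation. It assumes a substitution-only (Hamming) match model, a 2-bit base encoding, and a plain dictionary as the hash index. All function names and parameters (`encode`, `mismatches`, `build_index`, `align`, `seed_len`, `max_mismatches`) are hypothetical.

```python
from collections import defaultdict

# Assumed 2-bit encoding; any fixed assignment of the four bases would work.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}

def encode(seq):
    """Pack a DNA string into one integer, 2 bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value

def mismatches(a, b, length):
    """Count differing bases between two packed sequences of `length` bases."""
    diff = a ^ b                          # non-zero bit pairs mark differing bases
    low_mask = int("01" * length, 2)      # selects the low bit of every pair
    per_base = (diff | (diff >> 1)) & low_mask  # fold each pair into its low bit
    return bin(per_base).count("1")       # population count

def build_index(reference, seed_len):
    """Hash index mapping every seed_len-mer to its reference positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - seed_len + 1):
        index[reference[pos:pos + seed_len]].append(pos)
    return index

def align(read, reference, index, seed_len, max_mismatches):
    """Reference positions where `read` matches with at most
    `max_mismatches` substitutions.

    Uses disjoint seeds: if len(read) >= (max_mismatches + 1) * seed_len,
    at least one seed of any valid alignment must match exactly
    (pigeonhole), so no valid position is missed.
    """
    hits = set()
    packed_read = encode(read)
    for offset in range(0, len(read) - seed_len + 1, seed_len):
        seed = read[offset:offset + seed_len]
        for pos in index.get(seed, []):
            start = pos - offset
            if start < 0 or start + len(read) > len(reference):
                continue
            window = encode(reference[start:start + len(read)])
            if mismatches(packed_read, window, len(read)) <= max_mismatches:
                hits.add(start)
    return sorted(hits)
```

For example, with `reference = "ACGTACGTTTGACCA"` and `index = build_index(reference, 4)`, calling `align("ACGTTAGA", reference, index, 4, 1)` would report position 4, where the read differs from the reference by a single base. Verification costs one XOR and one population count per candidate, so the work per candidate is a handful of word operations rather than a base-by-base comparison.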