Search and clustering orders of magnitude faster than BLAST

Authors: Robert C. Edgar
Affiliations: -
Venue: Bioinformatics
Year: 2010

Citing: 0
Cited by: 5

CRiSPy-CUDA: computing species richness in 16S rRNA pyrosequencing datasets with CUDA
PRIB'11 Proceedings of the 6th IAPR international conference on Pattern recognition in bioinformatics

Towards systolic hardware acceleration for local complexity analysis of massive genomic data
Proceedings of the great lakes symposium on VLSI

DACIDR: deterministic annealed clustering with interpolative dimension reduction using a large collection of 16S rRNA sequences
Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine

Predicting v(d)j recombination using conditional random fields
PRIB'12 Proceedings of the 7th IAPR international conference on Pattern Recognition in Bioinformatics

FPGA-based hardware acceleration for local complexity analysis of massive genomic data
Integration, the VLSI Journal

Quantified Score

Hi-index: 3.84

Abstract

Motivation: Biological sequence data is accumulating rapidly, motivating the development of improved high-throughput methods for sequence classification.

Results: UBLAST and USEARCH are new algorithms enabling sensitive local and global search of large sequence databases at exceptionally high speeds. They are often orders of magnitude faster than BLAST in practical applications, though sensitivity to distant protein relationships is lower. UCLUST is a new clustering method that exploits USEARCH to assign sequences to clusters. UCLUST offers several advantages over the widely used program CD-HIT, including higher speed, lower memory use, improved sensitivity, clustering at lower identities and classification of much larger datasets.

Availability: Binaries are available at no charge for non-commercial use at http://www.drive5.com/usearch

Contact: robert@drive5.com

Supplementary information: Supplementary data are available at Bioinformatics online.
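
The abstract says that UCLUST uses a search step to assign sequences to clusters. As a rough illustration of that general idea, the sketch below implements a simple greedy centroid-based clustering in Python. It is not the author's implementation: the function names `identity` and `greedy_cluster`, the default threshold, and the position-by-position identity measure (no alignment, no indexed search) are all assumptions made for illustration only.

```python
def identity(a: str, b: str) -> float:
    """Toy identity (assumption): fraction of matching positions, no alignment."""
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def greedy_cluster(seqs, threshold=0.9):
    """Greedy centroid-based clustering (illustrative sketch).

    Each sequence joins the first existing centroid whose identity meets
    the threshold; otherwise it becomes the centroid of a new cluster.
    Returns a dict mapping each centroid to its member sequences.
    """
    centroids = []   # centroids in order of creation
    clusters = {}    # centroid -> list of members (centroid included)
    for seq in seqs:
        for c in centroids:
            if identity(seq, c) >= threshold:
                clusters[c].append(seq)
                break
        else:
            # No centroid was similar enough: start a new cluster.
            centroids.append(seq)
            clusters[seq] = [seq]
    return clusters


if __name__ == "__main__":
    example = ["ACGTACGTAC", "ACGTACGTAA", "TTTTTTTTTT", "TTTTTTTTTA"]
    for centroid, members in greedy_cluster(example, threshold=0.8).items():
        print(centroid, members)
```

The linear scan over centroids is the part a fast database search would replace in practice; the sketch keeps it naive so that the assignment rule stays easy to follow.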