Next-generation de novo short-read assemblers typically use a two-step strategy: (1) assemble unpaired reads into contigs using heuristics; (2) order the contigs using paired-read information to produce scaffolds. We propose to unify these two steps by introducing localized assembly: the direct construction of scaffolds from reads. To this end, we introduce the paired string graph structure, along with a formal framework for building scaffolds as paths of reads. This framework leads to a novel greedy algorithm for memory-efficient, parallel assembly of paired reads. A prototype implementation of the algorithm has been developed and applied to the assembly of simulated and experimental short reads. Our experiments show that our method yields longer scaffolds than recent assemblers and is capable of assembling diploid genomes significantly better than other greedy methods.
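To make the notion of greedy assembly concrete, the following is a minimal sketch of generic greedy read extension by suffix-prefix overlap. It is not the paired string graph algorithm proposed here: it ignores paired-read information entirely, and the function names (`overlap`, `greedy_extend`), the `min_len` threshold, and the toy reads are assumptions made purely for illustration.

```python
def overlap(a, b, min_len):
    """Return the length of the longest suffix of `a` that equals a prefix
    of `b`, or 0 if no such overlap of at least `min_len` characters exists."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_extend(seed, reads, min_len=3):
    """Extend `seed` to the right by repeatedly appending the unused read
    with the largest overlap (hypothetical helper, illustration only)."""
    contig = seed
    unused = [r for r in reads if r != seed]
    while True:
        best_len, best_read = 0, None
        for r in unused:
            k = overlap(contig, r, min_len)
            # Skip reads fully contained in the contig end (k == len(r)):
            # they would not extend the sequence.
            if best_len < k < len(r):
                best_len, best_read = k, r
        if best_read is None:
            return contig  # no read extends the contig any further
        contig += best_read[best_len:]  # append only the non-overlapping tail
        unused.remove(best_read)


# Toy reads (assumed data): each overlaps the next by three characters.
reads = ["ACGTAC", "TACGGA", "GGATTC"]
result = greedy_extend("ACGTAC", reads)
print(result)
```

A greedy strategy of this kind commits to the locally best overlap at each step, so it only needs the reads near the current contig end rather than a global graph. That locality is what makes greedy approaches attractive for memory-efficient and parallel assembly, at the cost of possible mis-joins in repeated regions.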