One of the key advances in genome assembly, and one that has significantly improved contig lengths, is the use of paired reads (mate-pairs). Most assemblers use mate-pair information in a post-processing step, whereas the recently proposed Paired de Bruijn Graph (PDBG) approach incorporates it directly into the structure of the assembly graph. However, the PDBG approach runs into difficulties when the variation in insert sizes is high. To address this problem, we first transform mate-pairs into edge-pair histograms, which allow a better estimate of the distance between edges in the assembly graph that represent regions linked by multiple mate-pairs. We then combine the ideas of mate-pair transformation and PDBGs to construct new data structures for genome assembly: pathsets and pathset graphs.
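The abstract does not spell out how an edge-pair histogram is built, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes each mate-pair has already been mapped to a pair of graph edges with a read-start offset inside each edge, and that a single nominal insert size is known. The function names, the tuple layout, and the use of the histogram mode as the distance estimate are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def edge_pair_histograms(mate_pairs, insert_size):
    """Group mate-pairs by the pair of edges they link (illustrative only).

    mate_pairs: iterable of (edge_a, offset_a, edge_b, offset_b), where each
        offset is the start of the read inside its edge (assumed input format).
    insert_size: nominal distance between the two read starts (assumption).

    If the reads start at start_a + offset_a and start_b + offset_b and lie
    insert_size apart, the implied distance between the edge starts is
    insert_size + offset_a - offset_b. Each mate-pair casts one vote.
    """
    histograms = defaultdict(Counter)
    for edge_a, offset_a, edge_b, offset_b in mate_pairs:
        implied_distance = insert_size + offset_a - offset_b
        histograms[(edge_a, edge_b)][implied_distance] += 1
    return histograms


def estimate_distance(histogram):
    """Return the most frequent implied distance (the mode) of one histogram."""
    return histogram.most_common(1)[0][0]


# Toy input: three mate-pairs link e1 to e2, one links e1 to e3.
pairs = [
    ("e1", 10, "e2", 5),
    ("e1", 20, "e2", 16),
    ("e1", 30, "e2", 25),
    ("e1", 0, "e3", 40),
]
hists = edge_pair_histograms(pairs, insert_size=100)
print(dict(hists[("e1", "e2")]))              # votes for the e1 -> e2 distance
print(estimate_distance(hists[("e1", "e2")]))
```

Pooling several mate-pairs per edge pair is what makes the estimate less sensitive to the insert-size variation of any single pair; a real implementation would likely smooth the histogram or use a more robust statistic than the raw mode.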