The rapidly diminishing cost of genome sequencing is driving renewed interest in large-scale genome sequencing programs such as Genome 10K (G10K). Despite this interest, the assembly of large genomes from short reads remains an extremely resource-intensive process. This work presents a scalable algorithm for creating scaffolds, that is, ordered and oriented sets of assembled contigs, which is one part of a practical assembly. The scaffolds are computed using integer linear programming (ILP). To process large mammalian genomes, we employ non-serial dynamic programming (NSDP) and a hierarchical strategy. Both existing and novel quantitative metrics are used to compare scaffolding tools and to gain deeper insight into the challenges of scaffolding. The code is available at: https://bitbucket.org/jrl03001/silp
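
To make the idea of "oriented" contigs concrete, the toy sketch below assigns each contig a 0/1 orientation so that the total weight of satisfied paired-end links is maximized. It is only an illustration and not the formulation used in this work, which the abstract does not specify: the function name `best_orientation`, the link tuple layout `(i, j, same_strand, weight)`, and the example weights are all hypothetical, and exhaustive search stands in for an ILP solver.

```python
from itertools import product


def best_orientation(n_contigs, links):
    """Pick a 0/1 orientation per contig maximizing satisfied link weight.

    links: list of (i, j, same_strand, weight) tuples (hypothetical layout).
    A link is satisfied when contigs i and j share an orientation exactly
    when same_strand is True.
    """
    best_score, best_assign = -1, None
    for assign in product((0, 1), repeat=n_contigs):
        # Fix contig 0 to orientation 0: flipping every contig at once
        # yields an equivalent scaffold, so half the space is redundant.
        if assign[0] != 0:
            continue
        score = sum(
            w for i, j, same, w in links
            if (assign[i] == assign[j]) == same
        )
        if score > best_score:
            best_score, best_assign = score, assign
    return best_score, best_assign


# Three contigs with mutually inconsistent links (made-up weights).
links = [
    (0, 1, True, 5),   # contigs 0 and 1 should share an orientation
    (1, 2, False, 4),  # contigs 1 and 2 should have opposite orientations
    (0, 2, True, 1),   # weak link that conflicts with the two above
]
score, orient = best_orientation(3, links)
print(score, orient)
```

Exhaustive search grows as 2^n in the number of contigs, which is why a real tool needs an optimization formulation together with decomposition strategies of the kind the abstract mentions.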