De novo repeat classification and fragment assembly

Authors:
Pavel A. Pevzner;Haixu Tang;Glenn Tesler
Affiliations:
University of California, San Diego, La Jolla, CA;University of California, San Diego, La Jolla, CA;University of California, San Diego, La Jolla, CA
Venue:
RECOMB '04 Proceedings of the eighth annual international conference on Research in computational molecular biology
Year:
2004

Abstract

Repetitive sequences make up a significant fraction of almost any genome, and an important, still-open question in bioinformatics is how to represent all repeats in a DNA sequence. We propose a radically new approach to repeat classification, motivated by the fundamental topological notion of a quotient space. The torus and the Klein bottle are examples of quotient spaces: each can be obtained from a square by gluing some of its points together. Our new repeat classification algorithm is based on the observation that the alignment-induced quotient space of a DNA sequence compactly represents all repeats in that sequence. This observation leads to a simple and efficient solution to the repeat classification problem, as well as to new approaches to fragment assembly and multiple alignment.
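The gluing idea in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes that "aligned" simply means "covered by an identical k-mer elsewhere in the sequence", and it uses a union-find structure to glue such positions into classes. The function names (`exact_repeat_pairs`, `glue_repeats`) and the parameter `k` are hypothetical, chosen only for this illustration. Each class of glued positions becomes a vertex, consecutive positions become edges, and an edge traversed more than once marks a repeated segment.

```python
from collections import defaultdict


def find(parent, x):
    """Return the representative of x's class (with path halving)."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, a, b):
    """Glue the classes of a and b; the smaller index becomes the root."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)


def exact_repeat_pairs(seq, k):
    """Assumed stand-in for a self-alignment: yield (i, j) for every
    k-mer at position j that already occurred at an earlier position i."""
    first_seen = {}
    for j in range(len(seq) - k + 1):
        kmer = seq[j:j + k]
        if kmer in first_seen:
            yield first_seen[kmer], j
        else:
            first_seen[kmer] = j


def glue_repeats(seq, k):
    """Glue aligned positions and return (classes, edges).

    classes[p] is the representative position of p after gluing;
    edges maps (class_of_p, class_of_p_plus_1) to its multiplicity.
    """
    parent = list(range(len(seq)))
    for i, j in exact_repeat_pairs(seq, k):
        for d in range(k):
            union(parent, i + d, j + d)
    classes = [find(parent, p) for p in range(len(seq))]
    edges = defaultdict(int)
    for p in range(len(seq) - 1):
        edges[(classes[p], classes[p + 1])] += 1
    return classes, dict(edges)


# "ACG" occurs at positions 1 and 5, so positions 1~5, 2~6, 3~7 are glued.
classes, edges = glue_repeats("TACGGACGC", 3)
print(classes)  # [0, 1, 2, 3, 4, 1, 2, 3, 8]
print({e: m for e, m in edges.items() if m > 1})  # {(1, 2): 2, (2, 3): 2}
```

In this toy input the two edges with multiplicity 2 spell out the repeated segment "ACG", while the remaining edges are traversed once. A real alignment with mismatches or gaps would need a different gluing relation than the exact k-mer matching assumed here.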