Repeats form a major class of sequence in genomes, with implications for functional genomics and practical problems. Their detection and analysis pose a number of challenges in genomic sequence analysis, especially if the genome is not completely sequenced. The most abundant and evolutionarily active repeats occur as families of long, similar sequences. We present a novel method for detecting and characterizing repeat families in cases where the target genome sequence is not completely known. To this end, we first establish the sequence graph, a compacted version of sparse de Bruijn graphs. By analyzing the structure of this graph and of its connected components after local modifications, we devise two algorithms for repeat family detection. The applicability of the methods is shown on both simulated and real genomic data sets.
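As a rough illustration of what compacting a de Bruijn graph can look like, the following minimal Python sketch builds the graph from the k-mers observed in a set of reads and merges every unbranched path into a single labeled edge. It is a generic textbook-style construction under stated assumptions, not the authors' definition of the sequence graph: the function names (`observed_kmers`, `compact_paths`), the choice of (k-1)-mers as nodes, and the omission of reverse complements and of isolated cycles are all assumptions made for the example.

```python
from collections import defaultdict


def observed_kmers(reads, k):
    """Return the set of distinct k-mers that occur in the reads."""
    return {r[i:i + k] for r in reads for i in range(len(r) - k + 1)}


def compact_paths(reads, k):
    """Merge unbranched paths of a de Bruijn graph into single edge labels.

    Nodes are (k-1)-mers and edges are the k-mers seen in the reads
    (hypothetical simplification: single strand, no coverage counts).
    """
    out_edges = defaultdict(list)  # node -> list of successor nodes
    in_degree = defaultdict(int)   # node -> number of incoming edges
    for kmer in sorted(observed_kmers(reads, k)):
        out_edges[kmer[:-1]].append(kmer[1:])
        in_degree[kmer[1:]] += 1
    nodes = set(out_edges) | set(in_degree)

    def is_inner(node):
        # An inner node has exactly one way in and one way out,
        # so it can be absorbed into the path passing through it.
        return in_degree[node] == 1 and len(out_edges[node]) == 1

    paths = []
    for start in sorted(nodes):
        if is_inner(start):
            continue  # paths only start at branching or terminal nodes
        for nxt in out_edges[start]:
            label = start + nxt[-1]
            while is_inner(nxt):
                nxt = out_edges[nxt][0]
                label += nxt[-1]
            paths.append(label)
    # Assumption: cycles made only of inner nodes are not reported here.
    return paths
```

For the two toy reads `ATGCGT` and `TGCGA` with `k = 3`, `compact_paths` returns `["ATGCG", "CGA", "CGT"]`: the shared prefix collapses into one edge and the graph branches where the reads diverge. Branching structure of this kind is what a repeat analysis on such a graph would inspect, although the detection algorithms themselves are not reproduced here.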