ISCIS'06 Proceedings of the 21st international conference on Computer and Information Sciences
International Journal of Bioinformatics Research and Applications
Over the last several years, the search for functional genomic elements by exploiting motif over-representation has become increasingly popular. However, about half of the human genome is repetitive, as is the case for most higher eukaryotes. In this study we show that, in addition to these known repeats, human sequences feature many short over-represented motifs, and that their frequency varies only slightly between random repeat-masked sequences and regions located immediately upstream of known genes. Most of our study was performed on the ENCODE sequences, which comprise about 1% of the human genome.
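To make the notion of motif over-representation concrete, the following is a minimal sketch and not the procedure used in the study. It counts overlapping k-mers in a sequence and compares each observed count with the count expected if bases were drawn independently at their observed frequencies. The function names `kmer_counts` and `over_representation`, the independent-base background model, and the assumption that repeats are hard-masked with `N` are all illustrative choices that do not come from the abstract.

```python
from collections import Counter

ALPHABET = set("ACGT")


def kmer_counts(seq, k):
    """Count overlapping k-mers, skipping windows that contain any
    non-ACGT character (assumption: repeats are hard-masked as 'N')."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= ALPHABET:
            counts[kmer] += 1
    return counts


def over_representation(seq, k):
    """Return observed/expected ratios for every k-mer seen in seq.

    The expected count assumes independent bases at their observed
    frequencies (a deliberately simple, hypothetical background model).
    """
    seq = seq.upper()
    counts = kmer_counts(seq, k)
    bases = Counter(c for c in seq if c in ALPHABET)
    total_bases = sum(bases.values())
    windows = sum(counts.values())  # number of valid k-mer windows
    scores = {}
    for kmer, observed in counts.items():
        p = 1.0
        for c in kmer:
            p *= bases[c] / total_bases
        scores[kmer] = observed / (p * windows)
    return scores
```

A ratio well above 1 flags a k-mer as over-represented relative to this simple background. Comparing the ratios obtained from repeat-masked random regions with those from upstream regions would correspond, loosely, to the comparison the abstract describes; a more realistic background (for example, a higher-order Markov model) would likely change the scores.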