Information theory has long been used in computational biology. In this paper we discuss and improve the Entropic Profile function introduced by Vinga and Almeida [23]. The Entropic Profile is a function of genomic location that captures the importance of a region with respect to the whole genome. We provide a linear-time, linear-space algorithm, called Fast Entropic Profile (Fast EP), in contrast to the original quadratic implementation. We also propose an alternative normalization that can likewise be computed efficiently. We show that Fast EP is suitable for large genomes and for the discovery of motifs of unbounded length.
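To make the idea of a position-wise profile concrete, the following is a minimal naive sketch and not the Fast EP algorithm itself. It assumes the profile has the form f(i) = (1 + (1/n) Σ_{k=1..L} 4^k φ^k c(i,k)) / Σ_{k=0..L} φ^k, where c(i,k) is the number of occurrences in the sequence of the length-k word ending at position i; this form, the function name `naive_entropic_profile`, and the parameters `L` and `phi` are assumptions for illustration and should be checked against [23]. The sketch counts words directly, so its cost grows with both the sequence length and `L`, which is the kind of overhead a linear-time construction is meant to avoid.

```python
from collections import Counter


def naive_entropic_profile(seq, L, phi):
    """Naive position-wise profile over a DNA string (illustrative only).

    Assumed form (not taken from the abstract):
        f(i) = (1 + (1/n) * sum_{k=1..L} 4^k * phi^k * c(i, k))
               / sum_{k=0..L} phi^k
    where c(i, k) counts occurrences of the length-k word ending at i.
    Near the start of the sequence, words longer than i + 1 are skipped.
    """
    n = len(seq)
    # Count every word of length 1..L (overlapping occurrences included).
    counts = Counter(
        seq[i:i + k] for k in range(1, L + 1) for i in range(n - k + 1)
    )
    denom = sum(phi ** k for k in range(L + 1))
    profile = []
    for i in range(n):
        total = 0.0
        for k in range(1, min(L, i + 1) + 1):
            word = seq[i - k + 1:i + 1]  # length-k word ending at position i
            total += (4 ** k) * (phi ** k) * counts[word]
        profile.append((1.0 + total / n) / denom)
    return profile


if __name__ == "__main__":
    # Hypothetical toy input; positions inside the repeated "ACGT"
    # should tend to score higher than the surrounding context.
    for pos, value in enumerate(naive_entropic_profile("ACGTACGTTTGA", 3, 2.0)):
        print(pos, round(value, 3))
```

Under this assumed form, a sequence in which every symbol is equally frequent scores 1 at each position when `L = 1`, and over-represented words push the value above 1.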