Viral gene compression: complexity and verification
CIAA'04 Proceedings of the 9th international conference on Implementation and Application of Automata
Viruses compress their genomes to save space. One of the main techniques is overlapping genes. We model this process as the shortest common superstring problem. We give an algorithm for computing optimal solutions that is slow in the number of strings but fast (linear) in their total length. We apply this algorithm to a number of viruses with relatively few genes. When the number of genes is larger, we compute approximate solutions using the greedy algorithm, which gives an upper bound on the length of the optimal solution. We also give a lower bound for the shortest common superstring problem. The results are then compared with what happens in nature. Remarkably, the compression achieved by viruses is very close to that achieved by modern computers.
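As an illustration of the greedy approach mentioned above, the following is a minimal sketch of the textbook greedy heuristic for the shortest common superstring: repeatedly merge the two strings with the largest overlap. It is not the authors' implementation. The function names (`overlap`, `remove_substrings`, `greedy_superstring`, `brute_force_superstring`) and the sample strings are hypothetical, chosen only for this example. The brute-force routine simply tries every ordering and is included only to check the greedy result on tiny inputs; it is not the optimal algorithm described in the abstract.

```python
from itertools import permutations


def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def remove_substrings(strings):
    """Drop duplicates and any string contained in another string."""
    unique = list(dict.fromkeys(strings))
    return [s for s in unique if not any(s != t and s in t for t in unique)]


def greedy_superstring(strings):
    """Greedy heuristic: merge the pair with the largest overlap until one
    string remains. Its length is an upper bound on the optimum."""
    pool = remove_substrings(strings)
    while len(pool) > 1:
        best = (-1, 0, 1)  # (overlap length, index of left, index of right)
        for i, a in enumerate(pool):
            for j, b in enumerate(pool):
                if i != j:
                    k = overlap(a, b)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        merged = pool[i] + pool[j][k:]
        pool = [s for idx, s in enumerate(pool) if idx not in (i, j)]
        pool.append(merged)
    return pool[0] if pool else ""


def brute_force_superstring(strings):
    """Exhaustive check over all orderings (assumption: only for tiny inputs)."""
    pool = remove_substrings(strings)
    best = None
    for order in permutations(pool):
        s = order[0] if order else ""
        for prev, nxt in zip(order, order[1:]):
            s += nxt[overlap(prev, nxt):]
        if best is None or len(s) < len(best):
            best = s
    return best if best is not None else ""


if __name__ == "__main__":
    genes = ["ATGC", "GCAT", "CATT"]  # made-up toy strings, not real genes
    print(greedy_superstring(genes), brute_force_superstring(genes))
```

On inputs small enough for the exhaustive check, comparing the two lengths gives a rough sense of how far the greedy upper bound is from the optimum.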