Discovering Frequent Structured Patterns from String Databases: An Application to Biological Sequences

Authors:
Luigi Palopoli;Giorgio Terracina
Venue:
DS '02 Proceedings of the 5th International Conference on Discovery Science
Year:
2002

Abstract

In recent years, the completion of human genome sequencing has raised a wide range of challenging new issues in raw data analysis. In particular, the discovery of information implicitly encoded in biological sequences is assuming a prominent role in identifying genetic diseases and in deciphering biological mechanisms. This information is usually represented by patterns that occur frequently in the sequences. Motivated by biological observations, one specific class of patterns is attracting particular interest: frequent structured patterns. In this respect, it is biologically meaningful to look at both "exact" and "approximate" repetitions of the patterns within the available sequences. This paper contributes to this setting by providing algorithms for discovering frequent structured patterns, in either "exact" or "approximate" form, in a collection of input biological sequences.