Polynomial Time Inference of Extended Regular Pattern Languages
Proceedings of RIMS Symposium on Software Science and Engineering
STACS '94 Proceedings of the 11th Annual Symposium on Theoretical Aspects of Computer Science
Measuring over-generalization in the minimal multiple generalizations of biosequences
DS'05 Proceedings of the 8th international conference on Discovery Science
Developments from enquiries into the learnability of the pattern languages from positive data
Theoretical Computer Science
International Journal of Data Mining and Bioinformatics
ICGI'06 Proceedings of the 8th international conference on Grammatical Inference: algorithms and applications
In this paper we examine the issues involved in finding consensus patterns from biosequence data of very small sample sizes by searching for a so-called minimal multiple generalization (mmg), that is, a set of syntactically minimal patterns that accounts for all the samples. The data we use are the sigma regulons with more conserved consensus patterns for the bacterium B. subtilis. By comparing the mmgs found over different search spaces, we find that it is possible to derive patterns close to the known consensus patterns simply by imposing some reasonable requirements on the kinds of patterns to obtain. We also propose some simple measures to evaluate the patterns in an mmg.
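The requirement that a set of patterns "accounts for all the samples" can be illustrated with a minimal sketch. It assumes, as a simplification not stated in the abstract, that patterns are regular patterns whose variables may match any (possibly empty) string, written here as the token `*`. The function names `pattern_to_regex` and `covers` and the sample strings are hypothetical toy illustrations, not data or code from the paper, and the sketch checks only coverage, not syntactic minimality.

```python
import re

def pattern_to_regex(pattern):
    """Compile a pattern (list of tokens) into a regex.

    Assumption: the token '*' is a variable that may be replaced by any
    string, including the empty one; every other token is a constant.
    """
    parts = [".*" if tok == "*" else re.escape(tok) for tok in pattern]
    return re.compile("".join(parts))

def covers(patterns, samples):
    """Return True if every sample is matched by at least one pattern."""
    regexes = [pattern_to_regex(p) for p in patterns]
    return all(any(r.fullmatch(s) for r in regexes) for s in samples)

# Toy sequences, made up for illustration only.
samples = ["AATTGACAGG", "TTGACACC", "GGTATAATA"]
patterns = [["*", "TTGACA", "*"], ["*", "TATAAT", "*"]]

print(covers(patterns, samples))       # both patterns together
print(covers(patterns[:1], samples))   # first pattern alone
```

With these toy inputs the two patterns together cover all three samples, while the first pattern alone does not, since the third sequence lacks the substring `TTGACA`. An actual mmg search would additionally require that no pattern in the set can be replaced by a strictly more specific one without losing coverage.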