Finding consensus patterns in very scarce biosequence samples from their minimal multiple generalizations

  • Authors:
  • Yen Kaow Ng; Takeshi Shinohara

  • Affiliations:
  • Graduate School of Computer Science and Systems, Kyushu Institute of Technology, Iizuka, Japan; Department of Artificial Intelligence, Kyushu Institute of Technology, Iizuka, Japan

  • Venue:
  • PAKDD'06 Proceedings of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
  • Year:
  • 2006

Abstract

In this paper we examine the issues involved in finding consensus patterns in biosequence data with very small sample sizes. We do so by searching for a so-called minimal multiple generalization (mmg), that is, a set of syntactically minimal patterns that accounts for all the samples. The data we use are the sigma regulons with more conserved consensus patterns for the bacterium B. subtilis. By comparing the mmgs found over different search spaces, we found that it is possible to derive patterns close to the known consensus patterns simply by imposing some reasonable requirements on the kinds of patterns to obtain. We also propose some simple measures for evaluating the patterns in an mmg.
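
To make the phrase "a set of patterns that accounts for all the samples" concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simple illustrative pattern class in which `*` stands for any nonempty string and every other character is a constant; the pattern class actually used in the paper may differ. The helper names `pattern_to_regex` and `covers`, as well as the toy sequences, are hypothetical and are not taken from the paper's data. The sketch only checks coverage; it does not test whether the pattern set is minimal.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a toy pattern into a regex.

    Assumption: '*' is a variable matching any nonempty string;
    all other characters are constants matched literally.
    """
    parts = [".+" if ch == "*" else re.escape(ch) for ch in pattern]
    return re.compile("".join(parts))


def covers(patterns, samples) -> bool:
    """Return True if every sample is matched in full by at least one pattern."""
    regexes = [pattern_to_regex(p) for p in patterns]
    return all(any(r.fullmatch(s) for r in regexes) for s in samples)


# Made-up toy sequences, used only to exercise the functions.
samples = ["TTGACAGG", "TTGACTCC", "AATATAAT"]

print(covers(["TTGAC*", "*TATAAT"], samples))  # two patterns together cover all samples
print(covers(["TTGAC*"], samples))             # one pattern alone misses the third sample
```

A coverage test like this is only one half of the mmg definition: an mmg additionally requires that no pattern in the set can be replaced by a more specific one without losing coverage, which depends on the search space chosen.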