ExpertDiscovery system application for the hierarchical analysis of eukaryotic transcription regulatory regions based on DNA codes of transcription

  • Authors:
  • I. V. Khomicheva;E. E. Vityaev;E. A. Ananko;T. I. Shipilov;V. G. Levitsky

  • Affiliations:
  • (Correspd. E-mail: khomicheva@bionet.nsc.ru) Institute of Cytology and Genetics SB RAS, Lavrentyev aven., 10, Novosibirsk, 630090, Russia and Sobolev Institute of Mathematics, Koptyug aven. 4, Nov ...;Sobolev Institute of Mathematics, Koptyug aven. 4, Novosibirsk, 630090, Russia;Institute of Cytology and Genetics SB RAS, Lavrentyev aven., 10, Novosibirsk, 630090, Russia;Novosibirsk State University, Pirogova, 2, Novosibirsk, 630090, Russia;Institute of Cytology and Genetics SB RAS, Lavrentyev aven., 10, Novosibirsk, 630090, Russia and Novosibirsk State University, Pirogova, 2, Novosibirsk, 630090, Russia

  • Venue:
  • Intelligent Data Analysis - New Methods in Bioinformatics Presented at the Fifth International Conference on Bioinformatics of Genome Regulation and Structure
  • Year:
  • 2008

Quantified Score

Hi-index 0.03

Abstract

We developed a Relational Data Mining approach that overcomes essential limitations of Data Mining and Knowledge Discovery techniques. In this paper, the approach is used to adapt the original `Discovery' system to the needs of computational biology. The objects under consideration, eukaryotic transcription regulatory regions, are characterized by a great variety of contextual, physicochemical, and conformational DNA features. Currently available tools for analyzing regulatory regions are each sensitive to specific DNA features and therefore perform poorly on complex, heterogeneous data. Developing a method that integrates the results of different recognition programs is a challenging task. We have developed the `ExpertDiscovery' system, which discovers a hierarchy of increasingly complex signals built from different elementary signals. It provides a powerful tool for constructing a model of a regulatory region that generalizes the results of different programs. The system can also be used as an independent analysis tool. We demonstrate that `ExpertDiscovery' outperforms the position weight matrix when the elementary signals supplied to the system are nucleotides at specific positions. The system is able to discover biologically significant models, ranging from simple to complex, of potential transcription factor binding sites in the regulatory regions of interferon-inducible genes.
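
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of a standard log-odds position weight matrix, in which each nucleotide at each position contributes independently to the score. It is not the `ExpertDiscovery' algorithm, which the abstract does not specify. The function names (`build_pwm`, `score`, `best_hit`), the pseudocount, and the uniform background frequency of 0.25 are all assumptions made for illustration.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from aligned, equal-length binding sites.

    Assumptions (not from the paper): a fixed pseudocount per base and a
    uniform background frequency for A, C, G and T.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(length):
        # Count each nucleotide in this column, starting from the pseudocount.
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        # Log-odds of the column frequency against the background.
        pwm.append(
            {base: math.log2((counts[base] / total) / background) for base in "ACGT"}
        )
    return pwm


def score(pwm, window):
    """Sum the per-position log-odds; positions are treated as independent."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    return max(
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


if __name__ == "__main__":
    # Hypothetical toy sites, not data from the paper.
    toy_sites = ["GAAACT", "GAAAGT", "GAAACC", "GAAAGT"]
    matrix = build_pwm(toy_sites)
    print(best_hit(matrix, "TTTTGAAACTTTTT"))
```

Because the score is a plain sum over positions, a matrix of this kind cannot express dependencies between positions. Models that combine elementary signals, as the abstract describes, are one way to address that limitation.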