Analysis of the predictability of time series obtained from genomic sequences by using several predictors

  • Authors:
  • Horia-Nicolai Teodorescu;Lucian-Iulian Fira

  • Affiliations:
  • (Correspd. hteodor@etc.tuiasi.ro) Technical University of Iasi, Romania and Institute for Theoretical Computer Science of the Romanian Academy, Romania;Institute for Theoretical Computer Science of the Romanian Academy, Romania

  • Venue:
  • Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology - Soft Computing and Applications
  • Year:
  • 2008

Abstract

In previous papers, we used one-step-ahead predictors to compute recognition scores for genomic sequences. The genomic sequences are coded as distances between successive bases. The recognition scores were then used as inputs to a hierarchical decision system. The relevance of these scores may be affected by the prediction quality, so prediction performance must be assessed within a framework based on the predictability of the analyzed time series. The aim of this paper is to determine which predictors are most suitable for genomic sequence identification. We analyze linear predictors (such as the linear combiner), neural predictors (RBF or MLP type), and neuro-fuzzy predictors (based on the Yamakawa model). Several methods are used to assess the predictability of the time series, including the Hurst exponent, the autocorrelation function, and the eta metric. All predictors were tested and compared for prediction quality on sequences from the HIV-1 genome. The mean square prediction error (MSPE), the direction test, and the Theil coefficient were used as prediction performance measures. The prediction results obtained with the predictors are compared and discussed.
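
The abstract does not give the exact coding scheme or the formulas behind its performance measures, so the following is only a minimal sketch under stated assumptions. It assumes that "distances between successive bases" means the gaps between consecutive occurrences of one chosen base, that the direction test is the fraction of steps whose predicted direction of change matches the actual one, and that the Theil coefficient is the U1 form bounded in [0, 1]. All function names are hypothetical, and the naive "repeat the last value" predictor only stands in for the linear, neural, and neuro-fuzzy predictors studied in the paper.

```python
import math


def base_distances(seq, base):
    """Gaps between consecutive occurrences of `base` in `seq` (assumed coding)."""
    positions = [i for i, b in enumerate(seq) if b == base]
    return [q - p for p, q in zip(positions, positions[1:])]


def mspe(actual, predicted):
    """Mean square prediction error over aligned actual/predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def _sign(x):
    return (x > 0) - (x < 0)


def direction_hit_rate(actual, predicted):
    """Fraction of steps where the predicted move from the previous actual
    value has the same sign as the actual move (assumed direction test)."""
    hits = sum(
        _sign(actual[t] - actual[t - 1]) == _sign(predicted[t] - actual[t - 1])
        for t in range(1, len(actual))
    )
    return hits / (len(actual) - 1)


def theil_u1(actual, predicted):
    """Theil U1 coefficient (assumed variant): 0 is a perfect fit, 1 the worst."""
    rmse = math.sqrt(mspe(actual, predicted))
    norm_a = math.sqrt(sum(a * a for a in actual) / len(actual))
    norm_p = math.sqrt(sum(p * p for p in predicted) / len(predicted))
    denom = norm_a + norm_p
    return rmse / denom if denom else 0.0


# Toy sequence, not taken from the HIV-1 genome.
series = base_distances("AACATGAAA", "A")
# Naive one-step-ahead predictor: the forecast for step t is the value at t-1.
actual, predicted = series[1:], series[:-1]
print(series, mspe(actual, predicted), theil_u1(actual, predicted))
```

A real comparison would swap the naive predictor for each model under study and report the three measures side by side on the same held-out part of the series.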