Exon prediction using empirical mode decomposition and Fourier transform of structural profiles of DNA sequences

  • Authors:
  • Wei-Feng Zhang; Hong Yan

  • Affiliations:
  • Department of Applied Mathematics, South China Agricultural University, 483 Wushan Road, Guangzhou 510642, China; Department of Electronic Engineering, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon, Hong Kong

  • Venue:
  • Pattern Recognition
  • Year:
  • 2012

Abstract

Spectrum analysis approaches, such as the Fourier transform, the wavelet transform and autoregressive models, have been applied successfully to the exon prediction problem because they are flexible and require no training data or prior knowledge. Detecting short exons, however, remains difficult: traditional methods often give unsatisfactory results because they cannot correctly identify the spectral patterns of short exons. In this article, we propose an improved exon prediction method based on empirical mode decomposition and the Fourier transform. The proposed approach represents DNA sequences numerically by their structural features, which helps to reveal significant patterns that are rarely observed with traditional methods. The resulting structural profile is used to detect probable exons by examining the peaks of the local spectrum at frequency 1/3 within a sliding window. The data in the window are first decomposed by empirical mode decomposition into a collection of intrinsic mode functions; the first intrinsic mode function is then used to compute the local spectrum by the fast Fourier transform. We compare our method with the traditional Fourier transform method based on binary representation and with the recently proposed paired spectral content method. Experiments on a randomly selected human genome dataset and on the GENSCAN benchmark dataset show that our method enhances the signal-to-noise ratio of the analyzed sequences and improves the prediction accuracy for short exons.
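
To make the sliding-window idea concrete, here is a minimal Python sketch of the traditional baseline named in the abstract (the Fourier transform with binary representation), not of the proposed method. It maps the sequence to four 0/1 indicator sequences, one per nucleotide, and sums their spectral power at frequency 1/3 inside each window. The function names, the default window length and the step size are illustrative assumptions. The structural profile and the empirical mode decomposition step are not reproduced here.

```python
import cmath
import math


def indicator_sequences(dna):
    """Binary representation: one 0/1 sequence per nucleotide (A, C, G, T)."""
    dna = dna.upper()
    return {base: [1.0 if ch == base else 0.0 for ch in dna] for base in "ACGT"}


def power_at_one_third(values):
    """Squared magnitude of the Fourier sum at frequency 1/3 cycles per base."""
    acc = sum(x * cmath.exp(-2j * math.pi * n / 3) for n, x in enumerate(values))
    return abs(acc) ** 2


def sliding_period3(dna, window=351, step=1):
    """Total 1/3-frequency power of the four indicator sequences per window.

    The default window length of 351 bases is an assumption chosen for
    illustration; it is not taken from the abstract.
    """
    seqs = indicator_sequences(dna)
    scores = []
    for start in range(0, len(dna) - window + 1, step):
        total = sum(
            power_at_one_third(seq[start:start + window]) for seq in seqs.values()
        )
        scores.append(total)
    return scores
```

With a short window, `sliding_period3("ATG" * 20, window=30)` gives a uniformly high score because the sequence is perfectly period-3, whereas a homopolymer such as `"A" * 60` scores essentially zero. In this kind of scheme, windows whose score peaks are the candidate coding regions. The method described in the abstract replaces the binary indicators with a structural profile and applies the Fourier transform to the first intrinsic mode function of each window.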