New Fast Decision Tree Classifier for Identifying Protein Coding Regions

  • Authors:
  • Hazem M. El-Bakry; Mohamed Hamada

  • Affiliations:
  • Faculty of Computer Science & Information Systems, Mansoura University, Egypt; University of Aizu, Aizu Wakamatsu, Japan

  • Venue:
  • ISICA '08 Proceedings of the 3rd International Symposium on Advances in Computation and Intelligence
  • Year:
  • 2008

Abstract

In this paper, a fast tool for finding protein coding regions is presented. The tool relies on cross correlation performed in the frequency domain and on a decision tree. In addition, a modified trust region method is used to find the closest (optimized) DNA nucleotide. Moreover, a sequential PRM-based protein folding algorithm is introduced for finding the point where these proteins add to the ladder. Furthermore, a standard parallel scan algorithm is used to provide parallel processing of the strides and their transitions. The proposed tool produces more accurate results than those previously obtained for a range of different sequence lengths. Experimental results confirm that the proposed classification tool scales to large volumes of data irrespective of the number of classes, tuples, and attributes, and that it achieves high classification accuracy. The main contribution of this paper is the fast decision tree algorithm, which relies on cross correlation in the frequency domain between the input data at each node and the input weights of neural networks. It is proved mathematically and demonstrated in practice that the number of computation steps required by the presented FNNs is less than that needed by conventional neural networks (CNNs). Simulation results using MATLAB confirm the theoretical computations.
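
The central idea of the abstract, cross correlation computed in the frequency domain, can be illustrated with a small sketch. The code below is not the authors' implementation (the paper reports MATLAB simulations); it is a minimal pure-Python illustration of the correlation theorem, in which a sliding dot product between an input sequence and a weight vector is obtained by multiplying one spectrum by the complex conjugate of the other. The function names `fft`, `cross_correlate_direct`, and `cross_correlate_fft`, as well as the sample data, are assumptions made for illustration only.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def cross_correlate_direct(signal, weights):
    """Spatial-domain reference: out[i] = sum_j signal[i + j] * weights[j]."""
    m = len(weights)
    return [
        sum(signal[i + j] * weights[j] for j in range(m))
        for i in range(len(signal) - m + 1)
    ]


def cross_correlate_fft(signal, weights):
    """Same result via the frequency domain (real-valued inputs assumed)."""
    n, m = len(signal), len(weights)
    # Zero-pad to a power of two that avoids circular wrap-around.
    size = 1
    while size < n + m - 1:
        size *= 2
    a = fft([complex(v) for v in signal] + [0j] * (size - n))
    b = fft([complex(v) for v in weights] + [0j] * (size - m))
    # Correlation theorem: multiply by the conjugate of the weights' spectrum.
    prod = [x * y.conjugate() for x, y in zip(a, b)]
    full = fft(prod, inverse=True)
    # Normalize the inverse transform and keep the fully overlapping lags.
    return [full[i].real / size for i in range(n - m + 1)]


if __name__ == "__main__":
    # Hypothetical data: a short numeric sequence and a 3-tap weight vector.
    sequence = [1.0, 2.0, 3.0, 4.0, 5.0]
    weights = [1.0, 0.0, -1.0]
    print(cross_correlate_direct(sequence, weights))
    print(cross_correlate_fft(sequence, weights))
```

For an input of length n and a weight vector of length m, the direct form costs on the order of n·m multiplications, while the FFT form costs on the order of N log N for the padded length N. This is the general kind of saving the abstract refers to, although the exact operation counts claimed for the FNNs are given in the paper itself and are not reproduced here.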