Analysis of DNA sequence transformations on grids

  • Authors:
  • Yadnyesh Joshi;Sathish Vadhiyar

  • Affiliations:
  • Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore - 560012, India;Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore - 560012, India

  • Venue:
  • Journal of Parallel and Distributed Computing
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Study of the evolution of species or organisms is essential for various biological applications. Evolution is typically studied at the molecular level by analyzing the mutations of DNA sequences of organisms. Techniques have been developed for building phylogenetic or evolutionary trees for a set of sequences. Though phylogenetic trees capture the overall evolutionary relationships among the sequences, they do not reveal fine-level details of the evolution. In this work, we attempt to resolve various fine-level sequence transformation details associated with a phylogenetic tree using cellular automata. In particular, our work tries to determine the cellular automata rules for neighbor-dependent mutations of segments of DNA sequences. We also determine the number of time steps needed for evolution of a progeny from an ancestor and the unknown segments of the intermediate sequences in the phylogenetic tree. Due to the existence of vast number of cellular automata rules, we have developed a grid system that performs parallel guided explorations of the rules on grid resources. We demonstrate our techniques by conducting experiments on a grid comprising machines in three countries and obtaining potentially useful statistics regarding evolutions in three HIV sequences. In particular, our work is able to verify the phenomenon of neighbor-dependent mutations and find that certain combinations of neighbor-dependent mutations, defined by a cellular automata rule, occur with greater than 90% probability. We also find the average number of time steps for mutations for some branches of phylogenetic tree over a large number of possible transformations with standard deviations less than 2.