Evaluating data mining algorithms using molecular dynamics trajectories

  • Authors:
  • Vasileios A. Tatsis;Christos Tjortjis;Panagiotis Tzirakis

  • Affiliations:
  • Department of Engineering Informatics and Telecommunications, Section of Applied Informatics, University of Western Macedonia, Vermiou & Ligeris, Kozani 50100, Greece;Department of Computer Science, University of Ioannina, Ioannina 45110, Greece;Department of Engineering Informatics and Telecommunications, Section of Applied Informatics, University of Western Macedonia, Vermiou & Ligeris, Kozani 50100, Greece

  • Venue:
  • International Journal of Data Mining and Bioinformatics
  • Year:
  • 2013

Abstract

Molecular dynamics simulations provide a sample of a molecule's conformational space. Experiments on the µs time scale, resulting in large amounts of data, are nowadays routine. Data mining techniques such as classification provide a way to analyse such data. In this work, we evaluate and compare several classification algorithms using three data sets which resulted from computer simulations of a potential enzyme mimetic biomolecule. We evaluated 65 classifiers available in the well-known data mining toolkit Weka, using classification error to assess algorithmic performance. Results suggest that: (i) 'meta' classifiers perform better than the other groups when applied to molecular dynamics data sets; (ii) Random Forest and Rotation Forest are the best classifiers for all three data sets; and (iii) classification via clustering yields the highest classification error. Our findings are consistent with bibliographic evidence, suggesting a 'roadmap' for dealing with such data.
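
As a rough illustration of ranking classifiers by classification error, the following minimal Python sketch compares two stand-in classifiers under k-fold cross-validation. Everything in it is an assumption made for illustration: the toy data (two hypothetical dihedral-angle features per trajectory frame), the labels, the fold scheme, and the two simple classifiers are not the Weka classifiers, data sets, or evaluation protocol used in the study.

```python
from collections import Counter


def classification_error(y_true, y_pred):
    """Fraction of instances whose predicted label differs from the true label."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def majority_classifier(train_X, train_y, test_X):
    """Baseline: always predict the most common training label."""
    label = Counter(train_y).most_common(1)[0][0]
    return [label for _ in test_X]


def nearest_neighbour_classifier(train_X, train_y, test_X):
    """1-nearest-neighbour using squared Euclidean distance."""
    preds = []
    for x in test_X:
        best = min(
            range(len(train_X)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
        )
        preds.append(train_y[best])
    return preds


def cross_validated_error(classifier, X, y, k=4):
    """Mean classification error over k interleaved folds."""
    errors = []
    for fold in range(k):
        test_idx = [i for i in range(len(X)) if i % k == fold]
        train_idx = [i for i in range(len(X)) if i % k != fold]
        preds = classifier(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in test_idx],
        )
        errors.append(classification_error([y[i] for i in test_idx], preds))
    return sum(errors) / k


# Hypothetical toy data: two dihedral angles (degrees) per frame,
# labelled with an assumed conformational state.
X = [(-60, -45), (-65, -40), (-58, -50), (-62, -42),
     (-120, 130), (-125, 135), (-118, 128), (-122, 132)]
y = ["helix"] * 4 + ["sheet"] * 4

classifiers = {
    "majority": majority_classifier,
    "1-NN": nearest_neighbour_classifier,
}
results = {name: cross_validated_error(clf, X, y) for name, clf in classifiers.items()}

# Rank classifiers from lowest to highest classification error.
ranking = sorted(results, key=results.get)
for name in ranking:
    print(f"{name}: {results[name]:.2f}")
```

The same pattern (one error estimate per classifier per data set, then a ranking) scales to larger comparisons; with a real toolkit, the hand-written classifiers would be replaced by library implementations.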