Automated trend analysis of proteomics data using an intelligent data mining architecture

  • Authors:
  • James Malone; Ken McGarry; Chris Bowerman

  • Affiliations:
  • Centre for Hybrid Intelligent Systems, School of Computing and Technology, University of Sunderland, St Peter's Way, Sunderland, SR6 0DD, UK; Centre for Hybrid Intelligent Systems, School of Computing and Technology, University of Sunderland, St Peter's Way, Sunderland, SR6 0DD, UK; Centre for Hybrid Intelligent Systems, School of Computing and Technology, University of Sunderland, St Peter's Way, Sunderland, SR6 0DD, UK

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2006

Quantified Score

Hi-index 12.05

Abstract

Proteomics is a field dedicated to the analysis and identification of proteins within an organism. Within proteomics, two-dimensional electrophoresis (2-DE) is currently unrivalled as a technique to separate and analyse proteins from tissue samples. The analysis of post-experimental data produced by this technique has been identified as an important step in the overall process. Some of the long-term aims of this analysis are to identify targets for drug discovery and proteins associated with specific organism states. The large quantities of high-dimensional data produced by such experimentation require expertise to analyse, which results in a processing bottleneck and limits the potential of this approach. We present an intelligent data mining architecture that incorporates both data-driven and goal-driven strategies and is able to accommodate the spatial and temporal elements of the dataset under analysis. The architecture is able to automatically classify interesting proteins with a low number of false positives and false negatives. Using a data mining technique to detect variance within the data before classification offers performance advantages of between 16% and 46% over other statistical variance techniques.