Automated trend analysis of proteomics data using an intelligent data mining architecture

Authors:
James Malone;Ken McGarry;Chris Bowerman
Affiliations:
Centre for Hybrid Intelligent Systems, School of Computing and Technology, University of Sunderland, St Peter's Way, Sunderland, SR6 0DD, UK;Centre for Hybrid Intelligent Systems, School of Computing and Technology, University of Sunderland, St Peter's Way, Sunderland, SR6 0DD, UK;Centre for Hybrid Intelligent Systems, School of Computing and Technology, University of Sunderland, St Peter's Way, Sunderland, SR6 0DD, UK
Venue:
Expert Systems with Applications: An International Journal
Year:
2006

Abstract

Proteomics is a field dedicated to the analysis and identification of proteins within an organism. Within proteomics, two-dimensional electrophoresis (2-DE) is currently unrivalled as a technique for separating and analysing proteins from tissue samples. Analysis of the post-experimental data produced by this technique has been identified as an important step in the overall process. Long-term aims of this analysis include identifying targets for drug discovery and proteins associated with specific organism states. The large quantities of high-dimensional data produced by such experimentation require expertise to analyse, which creates a processing bottleneck and limits the potential of the approach. We present an intelligent data mining architecture that incorporates both data-driven and goal-driven strategies and accommodates the spatial and temporal elements of the dataset under analysis. The architecture automatically classifies interesting proteins with a low number of false positives and false negatives. Using a data mining technique to detect variance within the data before classification offers performance advantages of between 16 and 46% over other statistical variance techniques.