Objective: Pathological changes in an organ or tissue may be reflected in proteomic patterns in serum, and unique serum proteomic patterns could potentially be used to discriminate cancer samples from non-cancer ones. Because proteomic profiles are complex, a higher-order analysis such as data mining is needed to uncover the differences between them. The objectives of this paper are (1) to briefly review the application of data mining techniques in proteomics for cancer detection and diagnosis; (2) to explore a novel analytic method with different feature selection methods; and (3) to compare the results obtained on different datasets with those reported by Petricoin et al. in terms of detection performance and selected proteomic patterns.

Methods and material: Three serum SELDI MS data sets were used to identify serum proteomic patterns that distinguish the serum of ovarian cancer cases from that of non-cancer controls. A support vector machine-based method was applied, with feature selection performed by statistical testing and by a genetic algorithm-based method, respectively. Leave-one-out cross-validation with receiver operating characteristic (ROC) curves was used to evaluate and compare cancer detection performance.

Results and conclusions: The results showed that (1) data mining techniques can be successfully applied to ovarian cancer detection with reasonably high performance; (2) classification using features selected by the genetic algorithm consistently outperformed classification using features selected by statistical testing in terms of accuracy and robustness; (3) the discriminatory features (proteomic patterns) can differ substantially from one selection method to another. In other words, pattern selection and its classification efficiency are highly classifier dependent. Therefore, when data mining techniques are used, the discrimination of cancer from normal samples does not depend solely upon the identity and origin of cancer-related proteins.
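The evaluation protocol described in the methods (feature selection by statistical testing, then leave-one-out cross-validation summarized by ROC analysis) can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several assumptions: the data are synthetic rather than SELDI MS spectra, the statistical test is assumed to be a Welch t-statistic, and a simple nearest-centroid scorer stands in for the support vector machine so that the example needs only the Python standard library. All function names are hypothetical. The point it shows is that features are re-selected inside every fold, so the held-out sample never influences which features are chosen.

```python
import random
from statistics import mean, variance


def welch_t(a, b):
    """Absolute Welch t-statistic between two samples of one feature."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0


def select_features(X, y, k):
    """Rank features by |t| on the given (training) data and keep the top k."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append((welch_t(pos, neg), j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def centroid_score(X, y, x, feats):
    """Stand-in scorer (NOT an SVM): squared distance to the control
    centroid minus squared distance to the cancer centroid."""
    def centroid(label):
        rows = [r for r, l in zip(X, y) if l == label]
        return [mean(r[j] for r in rows) for j in feats]

    c1, c0 = centroid(1), centroid(0)
    d1 = sum((x[j] - c) ** 2 for j, c in zip(feats, c1))
    d0 = sum((x[j] - c) ** 2 for j, c in zip(feats, c0))
    return d0 - d1  # higher means closer to the cancer centroid


def loocv_scores(X, y, k):
    """Leave-one-out: re-select features and refit inside every fold."""
    out = []
    for i in range(len(X)):
        X_tr = X[:i] + X[i + 1:]
        y_tr = y[:i] + y[i + 1:]
        feats = select_features(X_tr, y_tr, k)
        out.append(centroid_score(X_tr, y_tr, X[i], feats))
    return out


def roc_auc(y, scores):
    """Area under the ROC curve, computed as the probability that a
    positive sample outranks a negative one (ties count as 0.5)."""
    pos = [s for s, l in zip(scores, y) if l == 1]
    neg = [s for s, l in zip(scores, y) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic(n_per_class=20, n_features=50, n_informative=3, seed=0):
    """Synthetic stand-in data: Gaussian noise, with the first few
    features shifted upward in the positive class."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for j in range(n_informative):
                row[j] += 2.0 * label
            X.append(row)
            y.append(label)
    return X, y


X, y = make_synthetic()
scores = loocv_scores(X, y, k=3)
auc = roc_auc(y, scores)
print(f"LOOCV AUC: {auc:.3f}")
```

Swapping `select_features` for a different selector (for example, a genetic algorithm search) while keeping the rest of the loop fixed is one way to compare selection methods on equal footing, which is the kind of comparison the abstract reports.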