Motivation: There has been much interest in using patterns derived from surface-enhanced laser desorption and ionization (SELDI) protein mass spectra from serum to differentiate samples from patients with and without disease. Such patterns have been used without identification of the underlying proteins responsible. However, there are questions as to the stability of this procedure over multiple experiments.

Results: We compared SELDI proteomic spectra from serum from three experiments by the same group on separating ovarian cancer from normal tissue. These spectra are available on the web at http://clinicalproteomics.steem.com. In general, the results were not reproducible across experiments. Baseline correction prevents reproduction of the results for two of the experiments. In one experiment, there is evidence of a major shift in protocol mid-experiment which could bias the results. In another, structure in the noise regions of the spectra allows us to distinguish normal from cancer, suggesting that the normals and cancers were processed differently. Sets of features found to discriminate well in one experiment do not generalize to other experiments. Finally, the mass calibration in all three experiments appears suspect. Taken together, these and other concerns suggest that much of the structure uncovered in these experiments could be due to artifacts of sample processing, not to the underlying biology of cancer. We provide some guidelines for design and analysis in experiments like these to ensure more reproducible, biologically meaningful results.

Availability: The MATLAB and Perl code used in our analyses is available at http://bioinformatics.mdanderson.org