A multiple kernel support vector machine scheme for feature selection and rule extraction from gene expression data of cancer tissue

Authors:
Zhenyu Chen;Jianping Li;Liwei Wei
Affiliations:
Institute of Policy & Management, Chinese Academy of Sciences, Beijing 100080, China and Graduate University of Chinese Academy of Sciences, Beijing 100039, China;Institute of Policy & Management, Chinese Academy of Sciences, Beijing 100080, China;Institute of Policy & Management, Chinese Academy of Sciences, Beijing 100080, China and Graduate University of Chinese Academy of Sciences, Beijing 100039, China
Venue:
Artificial Intelligence in Medicine
Year:
2007

Abstract

Objective: Gene expression profiling with microarray techniques has recently been shown to be a promising tool for improving the diagnosis and treatment of cancer. Gene expression data contain a high level of noise, and the number of genes is overwhelming relative to the number of available samples. This poses a great challenge for machine learning and statistical techniques. The support vector machine (SVM) has been used successfully to classify gene expression data of cancer tissue. In the medical field, however, it is crucial to give the user a transparent decision process, and explaining the computed solutions and presenting the extracted knowledge remains a main obstacle for SVM.

Material and methods: A multiple kernel support vector machine (MK-SVM) scheme, consisting of feature selection, rule extraction and prediction modeling, is proposed to improve the explanation capacity of SVM. In this scheme, we show that the feature selection problem can be translated into an ordinary multiple-parameter learning problem. A shrinkage approach, 1-norm based linear programming, is proposed to obtain sparse parameters and the corresponding selected features. We also propose a novel rule extraction approach that uses the information provided by the separating hyperplane and the support vectors to improve the generalization capacity and comprehensibility of the rules and to reduce computational complexity.

Results and conclusion: Two public gene expression datasets, the leukemia dataset and the colon tumor dataset, are used to demonstrate the performance of this approach. Using a small number of selected genes, MK-SVM achieves encouraging classification accuracy: more than 90% on both datasets. Moreover, very simple rules with linguistic labels are extracted. The rule sets have high diagnostic power because of their good classification performance.
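
The 1-norm shrinkage idea described in the abstract can be illustrated with a generic linear soft-margin program. This is only a sketch of the general technique, not the MK-SVM program from the paper, whose exact formulation the abstract does not give. All symbols here are assumptions introduced for illustration: training samples $(x_i, y_i)$ with $x_i \in \mathbb{R}^d$ and $y_i \in \{-1, +1\}$, weights $w$, bias $b$, slack variables $\xi_i$, a trade-off constant $C > 0$, and the auxiliary variables $u$, $v$ used to linearize the absolute values.

```latex
% Generic 1-norm soft-margin classifier (illustrative, not the paper's program)
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \sum_{j=1}^{d} |w_j| \;+\; C \sum_{i=1}^{n} \xi_i \\
\text{s.t.}\quad & y_i \left( w^{\top} x_i + b \right) \;\ge\; 1 - \xi_i, \qquad i = 1, \dots, n, \\
& \xi_i \;\ge\; 0, \qquad i = 1, \dots, n.
\end{aligned}

% Equivalent linear program: substitute w_j = u_j - v_j with u_j, v_j >= 0
\begin{aligned}
\min_{u,\,v,\,b,\,\xi}\quad & \sum_{j=1}^{d} (u_j + v_j) \;+\; C \sum_{i=1}^{n} \xi_i \\
\text{s.t.}\quad & y_i \left( (u - v)^{\top} x_i + b \right) \;\ge\; 1 - \xi_i, \qquad i = 1, \dots, n, \\
& u_j \ge 0,\quad v_j \ge 0,\quad \xi_i \ge 0.
\end{aligned}

% Features kept by the shrinkage step
S \;=\; \{\, j : w_j \neq 0 \,\}
```

Because the objective penalizes $\sum_j |w_j|$ rather than $\sum_j w_j^2$, optimal solutions tend to set many $w_j$ exactly to zero, so the index set $S$ serves as the selected feature subset. In a multiple kernel setting the same reasoning would presumably apply to the kernel-combination parameters rather than to $w$ directly.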