Breast cancer diagnosis based on feature extraction using a hybrid of K-means and support vector machine algorithms

Authors:
Bichen Zheng;Sang Won Yoon;Sarah S. Lam
Venue:
Expert Systems with Applications: An International Journal
Year:
2014

Abstract

With the development of clinical technologies, different tumor features have been collected for breast cancer diagnosis. Filtering all the pertinent feature information to support clinical diagnosis is a challenging and time-consuming task. The objective of this research is to diagnose breast cancer based on the extracted tumor features. Feature extraction and selection are critical to the quality of classifiers built through data mining methods. To extract useful information and diagnose the tumor, a hybrid of K-means and support vector machine (K-SVM) algorithms is developed. The K-means algorithm is used to recognize the hidden patterns of the benign and malignant tumors separately. The membership of each tumor in these patterns is calculated and treated as a new feature in the training model. Then, a support vector machine (SVM) is used to obtain the new classifier that differentiates incoming tumors. Based on 10-fold cross validation, the proposed methodology improves the accuracy to 97.38% when tested on the Wisconsin Diagnostic Breast Cancer (WDBC) data set from the University of California, Irvine machine learning repository. Six abstract tumor features are extracted from the 32 original features for the training phase. The results not only illustrate the capability of the proposed approach for breast cancer diagnosis, but also show time savings during the training phase. Physicians can also benefit from the mined abstract tumor features by better understanding the properties of different types of tumors.
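
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made for illustration: scikit-learn supplies K-means, the SVM, and its bundled copy of WDBC (which exposes 30 numeric features, without the ID and diagnosis columns); three clusters per class are used so that six new features result; "membership" is approximated by the Euclidean distance to each centroid; and the RBF kernel and standardization step are arbitrary defaults. The function names `centroid_features` and `k_svm_cv_accuracy` are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def centroid_features(X, centroids):
    """Return the Euclidean distance from every sample to every centroid.

    Assumption: distance stands in for the paper's "membership" measure.
    Output shape is (n_samples, n_centroids).
    """
    diffs = X[:, None, :] - centroids[None, :, :]
    return np.linalg.norm(diffs, axis=2)


def k_svm_cv_accuracy(X, y, k_per_class=3, n_splits=10, seed=0):
    """Mean accuracy of a K-means + SVM hybrid under stratified k-fold CV."""
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in folds.split(X, y):
        # Fit the scaler on the training fold only, to avoid leakage.
        scaler = StandardScaler().fit(X[train_idx])
        X_tr = scaler.transform(X[train_idx])
        X_te = scaler.transform(X[test_idx])
        y_tr, y_te = y[train_idx], y[test_idx]

        # Cluster each class separately and stack the resulting centroids.
        centroids = np.vstack([
            KMeans(n_clusters=k_per_class, n_init=10, random_state=seed)
            .fit(X_tr[y_tr == label])
            .cluster_centers_
            for label in np.unique(y_tr)
        ])

        # Train the SVM on the compact centroid-distance features.
        clf = SVC(kernel="rbf").fit(centroid_features(X_tr, centroids), y_tr)
        scores.append(clf.score(centroid_features(X_te, centroids), y_te))
    return float(np.mean(scores))


if __name__ == "__main__":
    X, y = load_breast_cancer(return_X_y=True)
    print(f"10-fold CV accuracy: {k_svm_cv_accuracy(X, y):.4f}")
```

Because the SVM sees only six inputs per tumor instead of the full feature vector, training is cheaper, which is consistent with the time savings the abstract reports. The accuracy this sketch produces should not be expected to match the published 97.38%, since the membership definition, kernel, and parameters here are guesses.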