Objective: Medical data sets are usually small and have very high dimensionality. Too many attributes make the analysis less efficient without necessarily increasing accuracy, while too few samples reduce the stability of the model. The main objective of this study is therefore to extract the optimal subset of features to increase analytical performance when the data set is small.

Methods: This paper proposes a fuzzy-based non-linear transformation method that extends the classification-related information carried by the original attribute values of a small data set. Principal component analysis (PCA) is then applied to the transformed data set to extract the optimal subset of features. Finally, the transformed data with these optimal features are used as input to a learning tool, a support vector machine (SVM). Six medical data sets are employed to illustrate the approach: Pima Indians' diabetes, Wisconsin diagnostic breast cancer, Parkinson's disease, echocardiogram, BUPA liver disorders, and bladder cancer cases in Taiwan.

Results: The t-test is used to evaluate classification accuracy on each single data set, and the Friedman test is used to show that the proposed method is better than the other methods over the multiple data sets. The experimental results indicate that the proposed method has better classification performance than either PCA or kernel principal component analysis (KPCA) when the data set is small, and suggest that creating new purpose-related information improves analysis performance.

Conclusion: This paper has shown that feature extraction is important as a function of feature selection for efficient data analysis. When the data set is small, using the fuzzy-based transformation method presented in this work to increase the available information produces better results than the PCA and KPCA approaches.
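The abstract does not spell out the fuzzy-based transformation, so the following is only a minimal sketch of the general idea under stated assumptions: each attribute of each class is modelled by a triangular membership function spanning the class minimum, mean, and maximum, and the resulting membership degrees are appended to the original attribute values. The triangle definition and the helper names (`triangular_membership`, `fit_class_triangles`, `fuzzy_transform`) are hypothetical and are not taken from the paper.

```python
from statistics import mean


def triangular_membership(x, low, peak, high):
    """Degree in [0, 1] to which x belongs to the triangle (low, peak, high)."""
    if x == peak:
        return 1.0
    if x <= low or x >= high:
        return 0.0
    if x < peak:
        return (x - low) / (peak - low)
    return (high - x) / (high - peak)


def fit_class_triangles(X, y):
    """Assumption: one (min, mean, max) triangle per class and per attribute."""
    triangles = {}
    for label in sorted(set(y)):
        rows = [row for row, target in zip(X, y) if target == label]
        columns = list(zip(*rows))
        triangles[label] = [(min(col), mean(col), max(col)) for col in columns]
    return triangles


def fuzzy_transform(X, triangles):
    """Append class-wise membership degrees to the original attribute values."""
    transformed = []
    for row in X:
        new_row = list(row)
        for label in sorted(triangles):
            for value, (low, peak, high) in zip(row, triangles[label]):
                new_row.append(triangular_membership(value, low, peak, high))
        transformed.append(new_row)
    return transformed


# Toy data (made up for illustration): two attributes, two classes.
X_train = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0],
           [6.0, 20.0], [7.0, 22.0], [8.0, 24.0]]
y_train = [0, 0, 0, 1, 1, 1]

triangles = fit_class_triangles(X_train, y_train)
X_extended = fuzzy_transform(X_train, triangles)
# Each row now has 2 original values plus 2 attributes x 2 classes = 6 columns.
```

In a pipeline like the one the abstract describes, the extended matrix would then be passed to PCA and the retained components to an SVM. The triangles should be fitted on training folds only, since they are built from class labels.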