Feature selection plays an important role in classification for several reasons. First, it simplifies the model, which reduces computational cost; and when the model is put to practical use, fewer inputs are needed, which in practice means that fewer measurements have to be taken from new samples. Second, removing insignificant features from the data set makes the model more transparent and comprehensible, providing a better explanation of the suggested diagnosis, which is an important requirement in medical applications. Feature selection can also reduce noise and thereby enhance classification accuracy. In this article, a feature selection method based on fuzzy entropy measures is introduced and tested together with a similarity classifier. The model was tested with four medical data sets: dermatology, Pima-Indian diabetes, breast cancer, and Parkinson's. With all four data sets, we achieved quite good results using fewer features than in the original data sets, and with the Parkinson's and dermatology data sets classification accuracy was also enhanced significantly. The mean classification accuracy with the Parkinson's data set was 85.03% using only two of the original 22 features, and with the dermatology data set a mean accuracy of 98.28% was achieved using 29 of the 34 original features. The results can be considered quite good.
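To make the idea concrete, below is a minimal sketch of how a fuzzy-entropy-based feature selector combined with a similarity classifier might look. It is not the authors' implementation: it assumes features scaled to [0, 1], De Luca–Termini fuzzy entropy, class "ideal vectors" taken as per-class feature means, and a Łukasiewicz-style similarity; the function names and the parameter `p` are illustrative choices. Features whose similarity values to the class ideals have the highest fuzzy entropy (i.e., carry the least certain information) are removed.

```python
import numpy as np

def fuzzy_entropy(mu):
    # De Luca-Termini fuzzy entropy of membership values in (0, 1);
    # clipping avoids log(0) at the crisp endpoints.
    mu = np.clip(mu, 1e-12, 1 - 1e-12)
    return float(-np.sum(mu * np.log(mu) + (1 - mu) * np.log(1 - mu)))

def select_features(X, y, n_remove, p=1):
    """Rank features by the fuzzy entropy of their similarity to the
    class ideal vectors and drop the n_remove highest-entropy ones.
    Assumes X is scaled into [0, 1]."""
    classes = np.unique(y)
    # Ideal vector for each class: per-class feature means.
    ideals = np.array([X[y == c].mean(axis=0) for c in classes])
    entropies = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        sims = []
        for i, c in enumerate(classes):
            # Similarity between samples of class c and that class's
            # ideal value for feature j.
            s = (1 - np.abs(X[y == c, j] ** p - ideals[i, j] ** p)) ** (1 / p)
            sims.append(s)
        entropies[j] = fuzzy_entropy(np.concatenate(sims))
    keep = np.argsort(entropies)[: X.shape[1] - n_remove]
    return np.sort(keep)

def similarity_classify(X_train, y_train, X_test, p=1):
    """Assign each test sample to the class whose ideal vector it is
    most similar to (mean similarity over the features)."""
    classes = np.unique(y_train)
    ideals = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    # Broadcast to shape (n_test, n_classes, n_features).
    sims = (1 - np.abs(X_test[:, None, :] ** p
                       - ideals[None, :, :] ** p)) ** (1 / p)
    return classes[np.argmax(sims.mean(axis=2), axis=1)]
```

In use, one would repeatedly drop the highest-entropy feature, re-run `similarity_classify` under cross-validation, and keep the subset with the best accuracy; the 2-of-22 and 29-of-34 subsets reported above are results of this kind of search, not of the sketch itself.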