Supervised classification algorithms are commonly used in the design of computer-aided diagnosis systems. In this study, we present a Random Forests (RF) ensemble classifier trained with a resampling strategy to improve the diagnosis of cardiac arrhythmia. A random forest is an ensemble classifier that consists of many decision trees and outputs the class that is the mode of the classes predicted by the individual trees. In this way, an RF ensemble classifier achieves better classification performance than a single tree. In general, multiclass datasets with an unbalanced distribution of sample sizes are difficult to analyze in terms of class discrimination. The cardiac arrhythmia dataset is of this kind: it has multiple classes with small sample sizes, which makes it suitable for testing our resampling-based training strategy. The dataset contains 452 samples covering fourteen types of arrhythmia, and eleven of these classes have fewer than 15 samples. Our diagnosis strategy consists of two parts: (i) a correlation-based feature selection algorithm is used to select relevant features from the cardiac arrhythmia dataset; (ii) the RF machine learning algorithm is used to evaluate the selected features with and without simple random sampling, in order to assess the efficiency of the proposed training strategy. The resulting classifier accuracy is 90.0%, which is a high diagnostic performance for cardiac arrhythmia. Furthermore, three additional case studies (thyroid, cardiotocography and audiology) are used to benchmark the effectiveness of the proposed method. The experimental results demonstrate the efficiency of the random sampling strategy in training the RF ensemble classification algorithm.
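The two mechanisms named in the abstract, the mode-based vote of the forest and simple random sampling applied to small classes, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the sampling was carried out, so drawing a fixed number of samples per class with replacement is an assumption, and the function names `majority_vote` and `resample_per_class`, the `target_size` parameter, and the toy data are all hypothetical.

```python
import random
from collections import Counter


def majority_vote(tree_predictions):
    """Return the mode of the class labels predicted by the individual trees."""
    # most_common(1) yields the label with the highest count;
    # ties go to the label that was encountered first.
    return Counter(tree_predictions).most_common(1)[0][0]


def resample_per_class(samples, labels, target_size, seed=0):
    """Draw `target_size` samples from every class by simple random sampling.

    Assumption: sampling is done with replacement, so classes with fewer
    than `target_size` members are oversampled. The abstract does not
    specify this detail.
    """
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    new_samples, new_labels = [], []
    for y, members in by_class.items():
        # random.choices samples with replacement.
        new_samples.extend(rng.choices(members, k=target_size))
        new_labels.extend([y] * target_size)
    return new_samples, new_labels


if __name__ == "__main__":
    # Toy, made-up data: one frequent class and one rare class.
    X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.9]]
    y = ["normal"] * 5 + ["rare"]

    X_bal, y_bal = resample_per_class(X, y, target_size=5)
    print(Counter(y_bal))  # class counts after resampling

    # Hypothetical predictions from three trees for one sample.
    print(majority_vote(["normal", "rare", "rare"]))
```

In a full pipeline the balanced set would be passed to a random forest trainer after feature selection; the sketch stops at the sampling and voting steps because those are the only mechanisms the abstract describes.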