This paper is concerned with noninvasive monitoring of the human larynx using soft computing techniques applied to subjects' questionnaire data. Random forests (RF) are used to categorize the questionnaire data into a healthy class and several classes of disorders: cancerous, noncancerous, diffuse, nodular, paralysis, and an overall pathological class. The most important questionnaire statements are determined using RF variable importance evaluations. To explore the data represented by the variables used by RF, t-distributed stochastic neighbor embedding (t-SNE) and multidimensional scaling (MDS) are applied to the RF data proximity matrix. When the developed tools were tested on data collected from 109 subjects, a classification accuracy of 100% was obtained on unseen data in binary classification into the healthy and pathological classes. An accuracy of 80.7% was achieved when classifying the data into the healthy, cancerous, and noncancerous classes. The t-SNE and MDS mappings yield two-dimensional maps of the data and facilitate data exploration aimed at identifying subjects belonging to a "risk group". It is expected that the developed tools will be of great help in preventive health care in laryngology.
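The pipeline summarized above (RF classification, variable importance ranking, and two-dimensional maps of the RF proximity matrix) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. It assumes scikit-learn and NumPy, uses synthetic data in place of the questionnaire answers, takes 1 - proximity as the dissimilarity, and substitutes classical (Torgerson) MDS, which may differ from the MDS variant used in the paper. The helper names `rf_proximity` and `classical_mds`, the feature count, and all hyperparameters are hypothetical choices made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.manifold import TSNE


def rf_proximity(forest, X):
    """Proximity of two samples = fraction of trees where they share a leaf."""
    leaves = forest.apply(X)  # leaf index per sample and tree: (n_samples, n_trees)
    same_leaf = leaves[:, None, :] == leaves[None, :, :]
    return same_leaf.mean(axis=2)


def classical_mds(D, n_components=2):
    """Classical (Torgerson) MDS of a symmetric dissimilarity matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                   # double-centered squared dissimilarities
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip small negative eigenvalues that arise when D is not exactly Euclidean.
    return eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0.0))


# Synthetic stand-in for questionnaire data (assumption: NOT the study's data).
# 109 samples mirrors the number of subjects; 14 features and 3 classes are arbitrary.
X, y = make_classification(n_samples=109, n_features=14, n_informative=6,
                           n_classes=3, random_state=0)

forest = RandomForestClassifier(n_estimators=200, oob_score=True,
                                random_state=0).fit(X, y)

# Rank variables by the forest's impurity-based importance, most important first.
ranking = np.argsort(forest.feature_importances_)[::-1]

# Turn proximities into dissimilarities and map them to two dimensions.
proximity = rf_proximity(forest, X)
dissimilarity = 1.0 - proximity
mds_map = classical_mds(dissimilarity, n_components=2)
tsne_map = TSNE(n_components=2, metric="precomputed", init="random",
                perplexity=30, random_state=0).fit_transform(dissimilarity)

print("out-of-bag accuracy:", forest.oob_score_)
print("most important variables:", ranking[:5])
```

Plotting `mds_map` or `tsne_map` colored by class label gives the kind of two-dimensional map that could be inspected for subjects lying between the healthy and pathological groups. The out-of-bag accuracy printed here refers only to the synthetic data and says nothing about the accuracies reported in the abstract.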