Bioinformatics datasets are often used to compare classification algorithms for high-dimensional data. As genetic data become increasingly routine in medical settings, researchers and life scientists alike want to answer questions such as finding the gene signature of a disease, classifying data for diagnosis, or evaluating the severity of a disease. Many different types of algorithms have been applied to this domain, often with comparable, though slightly different, results, so it can be difficult to determine which one to use and how to make that choice. This paper therefore studies, on some of the most frequently benchmarked datasets in bioinformatics, the performance of K-nearest-neighbor and related case-based classification algorithms in order to make methodological recommendations for applying these algorithms to this domain. The study concludes that K-nearest-neighbor classifiers perform as well as, or among, the best classifiers when combined with feature selection methods.
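As a minimal illustration of combining feature selection with a K-nearest-neighbor classifier, the sketch below ranks features with a simple filter score and then classifies a query case using only the top-ranked features. It is not the paper's method: the score (absolute difference of class means divided by the sum of class standard deviations), the Euclidean distance, the toy data, and the names `filter_scores`, `select_top`, and `knn_predict` are all assumptions made for illustration, and the sketch handles two classes only.

```python
import math
from collections import Counter


def filter_scores(X, y):
    """Score each feature for a two-class problem (assumed filter criterion):
    |difference of class means| / (sum of class standard deviations)."""
    first, second = sorted(set(y))
    rows_a = [row for row, label in zip(X, y) if label == first]
    rows_b = [row for row, label in zip(X, y) if label == second]
    scores = []
    for j in range(len(X[0])):
        col_a = [row[j] for row in rows_a]
        col_b = [row[j] for row in rows_b]
        mean_a = sum(col_a) / len(col_a)
        mean_b = sum(col_b) / len(col_b)
        sd_a = math.sqrt(sum((v - mean_a) ** 2 for v in col_a) / len(col_a))
        sd_b = math.sqrt(sum((v - mean_b) ** 2 for v in col_b) / len(col_b))
        # Small constant avoids division by zero for constant features.
        scores.append(abs(mean_a - mean_b) / (sd_a + sd_b + 1e-12))
    return scores


def select_top(scores, n):
    """Return the indices of the n highest-scoring features."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:n]


def knn_predict(X_train, y_train, x, k, features):
    """Majority vote among the k nearest training cases, with Euclidean
    distance computed over the selected feature indices only."""
    query = [x[j] for j in features]
    neighbors = sorted(
        (math.dist([row[j] for j in features], query), label)
        for row, label in zip(X_train, y_train)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: feature 0 separates the classes, the rest are noise.
X_train = [
    [1.0, 5.0, 0.3, 9.0],
    [1.2, 1.0, 0.9, 2.0],
    [0.9, 8.0, 0.1, 5.0],
    [3.0, 4.0, 0.5, 8.0],
    [3.2, 9.0, 0.2, 1.0],
    [2.9, 2.0, 0.8, 6.0],
]
y_train = ["healthy", "healthy", "healthy", "disease", "disease", "disease"]
query = [3.1, 1.0, 0.9, 2.0]

selected = select_top(filter_scores(X_train, y_train), 1)
prediction = knn_predict(X_train, y_train, query, k=3, features=selected)
```

On this toy data, the query's nearest neighbor over all four features is a "healthy" case because the noise features dominate the distance, whereas restricting the distance to the selected feature yields "disease". In a real evaluation, feature selection would need to be performed inside each cross-validation fold so that test cases do not influence which features are kept.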