The k-nearest neighbour (k-NN) technique is a simple, interpretable, and intuitively appealing method for classification problems. However, choosing an appropriate distance function for k-NN can be challenging, and a poor choice can make the classifier highly vulnerable to noise in the data. In this paper, we propose a new method for determining a good distance function for k-NN. Our method is based on the area under the Receiver Operating Characteristic (ROC) curve, a well-known measure of the quality of binary classifiers. It computes weights for the distance function based on ROC properties within an appropriate neighbourhood of the instances whose distance is being computed. We experimentally compare our scheme with a number of other well-known k-NN distance metrics, as well as with a range of different classifiers. The experiments show that our method can substantially boost the classification performance of the k-NN algorithm. Furthermore, in a number of cases our technique delivers better accuracy than state-of-the-art non-k-NN classifiers, such as support vector machines.
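The general idea of ROC-derived distance weights can be illustrated with a minimal sketch. It is not the method proposed in the paper: the abstract describes weights computed within a neighbourhood of the instances being compared, whereas this simplified version assumes a single global weight per feature, taken as |2·AUC − 1| when that feature alone is used as a score for the positive class. All function names, the weighting formula, and the toy data are illustrative assumptions.

```python
from collections import Counter
from math import sqrt


def feature_auc(values, labels):
    """AUC obtained when one feature is used as a score for class 1."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if not pos or not neg:
        return 0.5  # AUC is undefined with one class; treat as uninformative
    # Fraction of (positive, negative) pairs ranked correctly; ties count 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_weights(X, y):
    """Assumed weighting: 0 for an uninformative feature, 1 for a perfect one."""
    n_features = len(X[0])
    return [abs(2.0 * feature_auc([row[j] for row in X], y) - 1.0)
            for j in range(n_features)]


def weighted_distance(a, b, weights):
    """Euclidean distance with one non-negative weight per feature."""
    return sqrt(sum(w * (ai - bi) ** 2 for w, ai, bi in zip(weights, a, b)))


def knn_predict(X, y, query, weights, k=3):
    """Majority vote among the k training instances closest to the query."""
    order = sorted(range(len(X)),
                   key=lambda i: weighted_distance(X[i], query, weights))
    return Counter(y[i] for i in order[:k]).most_common(1)[0][0]


# Toy data: feature 0 separates the classes, feature 1 carries no signal.
X_train = [[0.1, 2.0], [0.2, 3.0], [0.3, 4.0],
           [0.9, 1.0], [1.0, 3.0], [1.1, 5.0]]
y_train = [0, 0, 0, 1, 1, 1]

weights = auc_weights(X_train, y_train)        # [1.0, 0.0]
prediction = knn_predict(X_train, y_train, [0.95, 2.0], weights, k=3)  # 1
```

In this toy example the uninformative feature receives weight 0, so it no longer influences which neighbours are selected. A neighbourhood-based scheme of the kind the abstract describes would instead recompute such weights locally for each pair of instances.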