In supervised learning, a training set of labeled instances is used by a learning algorithm to generate a model (classifier) that is subsequently employed to decide the class label of new instances (generalization). Characteristics of the training set, such as its size and the presence of noisy instances, influence the learning algorithm and affect generalization performance. This paper introduces a new network-based representation of a training set, called a hit miss network (HMN), which provides a compact description of the nearest neighbor relation over pairs of instances from each pair of classes. We show that structural properties of HMNs correspond to properties of training points relevant to the one nearest neighbor (1-NN) decision rule, such as being a border or central point. This motivates the use of HMNs for improving the performance of a 1-NN classifier by removing instances from the training set (instance selection). We introduce three new HMN-based algorithms for instance selection: HMN-C, which removes instances without affecting the accuracy of 1-NN on the original training set; HMN-E, which performs more aggressive storage reduction; and HMN-EI, which applies HMN-E iteratively. Their performance is assessed on 22 data sets with different characteristics, such as input dimension, cardinality, class balance, number of classes, noise content, and presence of redundant variables. Experiments on these data sets show that the accuracy of the 1-NN classifier increases significantly when HMN-EI is applied. Comparison with state-of-the-art editing algorithms for instance selection on these data sets indicates that HMN-EI achieves the best generalization performance, with no significant difference in storage requirements. Overall, these results indicate that HMNs provide a powerful graph-based representation of a training set that can be successfully applied to noise and redundancy reduction in instance-based learning.
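To make the construction concrete, the following is a minimal sketch (not the authors' implementation) of building an HMN as the abstract describes it: a directed graph with one edge from every training instance to its nearest neighbor within each class, labeled "hit" when the neighbor shares the instance's label and "miss" otherwise. Euclidean distance, the function name build_hmn, and the edge representation are assumptions chosen for illustration.

```python
import numpy as np

def build_hmn(X, y):
    """Sketch of a hit miss network (assumed interface, not the paper's code):
    for every instance, add a directed edge to its nearest neighbor within
    each class ("hit" if the classes match, "miss" otherwise).
    Returns a list of (source, target, kind) triples."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    edges = []
    for i in range(len(X)):
        dists = np.linalg.norm(X - X[i], axis=1)  # Euclidean distance to all points
        for c in np.unique(y):
            members = np.flatnonzero(y == c)
            members = members[members != i]       # an instance is not its own neighbor
            if members.size == 0:                 # degenerate single-point class
                continue
            j = members[np.argmin(dists[members])]
            edges.append((i, j, "hit" if y[i] == c else "miss"))
    return edges

# Toy usage: instances collecting many "miss" edges (high miss in-degree) tend
# to lie near class boundaries, the kind of structural property the HMN-based
# selection algorithms exploit to distinguish border from central points.
X = [[0., 0.], [0., 1.], [1., 0.], [5., 5.], [5., 6.], [6., 5.]]
y = [0, 0, 0, 1, 1, 1]
miss_in = {}
for src, dst, kind in build_hmn(X, y):
    if kind == "miss":
        miss_in[dst] = miss_in.get(dst, 0) + 1
print(miss_in)  # {3: 3, 1: 2, 2: 1} for this toy set
```

Under these assumptions, each instance has exactly one outgoing edge per class, so the network stays compact (number of edges linear in the number of instances times the number of classes), which is consistent with the abstract's claim of a compact description.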