A rule-based scheme for filtering examples from majority class in an imbalanced training set

Authors:
Jamshid Dehmeshki;Mustafa Karaköy;Manlio Valdivieso Casique
Affiliations:
Medicsight Plc., London, England;Medicsight Plc., London, England;Medicsight Plc., London, England
Venue:
MLDM'03 Proceedings of the 3rd international conference on Machine learning and data mining in pattern recognition
Year:
2003

Abstract

Developing a Computer-Assisted Detection (CAD) system for automatic diagnosis of pulmonary nodules in thoracic CT is a highly challenging research area in the medical domain. It requires the successful application of sophisticated, state-of-the-art image processing and pattern recognition technologies. The object recognition and feature extraction phase of such a system generates a huge, imbalanced training set, as is the case in many learning problems in the medical domain. The performance of concept learning systems is traditionally assessed by the percentage of test examples classified correctly, termed accuracy. This measure becomes inappropriate for imbalanced training sets such as this one, where non-nodule (negative) examples outnumber nodule (positive) examples. This paper introduces the mechanism developed for filtering negative examples in the training set so as to remove 'obvious' ones, and discusses alternative evaluation criteria.
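
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract names neither the filtering rules nor the alternative criteria, so the rule used here (a hypothetical `diameter_mm` threshold) and the measures shown (sensitivity, specificity, and their geometric mean) are assumptions chosen only because they are commonly used with imbalanced data.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = positive (nodule)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of all examples classified correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def sensitivity(y_true, y_pred):
    """Fraction of positive examples that were detected."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(y_true, y_pred):
    """Fraction of negative examples that were correctly rejected."""
    _, tn, fp, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp) if (tn + fp) else 0.0


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (assumed criterion)."""
    return sqrt(sensitivity(y_true, y_pred) * specificity(y_true, y_pred))


def filter_obvious_negatives(examples, labels, rules):
    """Drop negatives matched by any rule; positives are always kept.

    `rules` is a list of predicates over a feature dict. The rules
    themselves are hypothetical; the source does not specify them.
    """
    return [
        (x, y)
        for x, y in zip(examples, labels)
        if y == 1 or not any(rule(x) for rule in rules)
    ]


if __name__ == "__main__":
    # A classifier that labels everything negative on a 10:990 split.
    y_true = [1] * 10 + [0] * 990
    y_pred = [0] * 1000
    print("accuracy   ", accuracy(y_true, y_pred))
    print("sensitivity", sensitivity(y_true, y_pred))
    print("g-mean     ", g_mean(y_true, y_pred))

    # Hypothetical rule: very small candidates are 'obvious' non-nodules.
    rules = [lambda x: x["diameter_mm"] < 1.0]
    examples = [{"diameter_mm": 0.5}, {"diameter_mm": 5.0}, {"diameter_mm": 0.5}]
    labels = [0, 0, 1]
    print(filter_obvious_negatives(examples, labels, rules))
```

In the demo, the all-negative classifier misses every nodule yet still scores high accuracy, which is why a measure that weighs both classes is more informative here. The filter only ever discards negative examples, so the positive class is untouched while the imbalance shrinks.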