Robust weighted kernel logistic regression in imbalanced and rare events data

Authors:
Maher Maalouf; Theodore B. Trafalis
Venue:
Computational Statistics & Data Analysis
Year:
2011

Abstract

Recent developments in computing and technology, along with the availability of large amounts of raw data, have contributed to the creation of many effective techniques and algorithms in the fields of pattern recognition and machine learning. The main objectives for developing these algorithms include identifying patterns in the available data, making predictions, or both. Great success has been achieved with many classification techniques in real-life applications. With regard to binary data classification in particular, analysis of data containing rare events or disproportionate class distributions poses a great challenge to industry and to the machine learning community. This study examines rare events (REs) with binary dependent variables containing many more non-events (zeros) than events (ones). Such variables are difficult to predict and to explain, as the literature has shown. This research combines rare events corrections to Logistic Regression (LR) with truncated Newton methods and applies these techniques to Kernel Logistic Regression (KLR). The resulting model, Rare Event Weighted Kernel Logistic Regression (RE-WKLR), is a combination of weighting, regularization, approximate numerical methods, kernelization, bias correction, and efficient implementation, all of which are critical to enabling RE-WKLR to be an effective and powerful method for predicting rare events. Comparing RE-WKLR to SVM and TR-KLR on non-linearly separable, small and large binary rare event datasets, we find that RE-WKLR is as fast as TR-KLR and much faster than SVM. In addition, according to the statistical significance test, RE-WKLR is more accurate than both SVM and TR-KLR.
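
The combination of weighting, regularization, and kernelization named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes class weights of the form tau / ybar for events and (1 - tau) / (1 - ybar) for non-events (a common rare-event weighting, where tau is an assumed population event rate and ybar is the sample event rate), an RBF kernel, and a plain Newton (IRLS) loop with a direct linear solve in place of a truncated Newton solver. The bias correction step is omitted. All function names, the toy data, and the parameter values (`tau`, `gamma`, `lam`) are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def rare_event_weights(y, tau):
    """Per-sample weights that rescale the sample event rate to an assumed rate tau."""
    ybar = y.mean()
    return np.where(y == 1, tau / ybar, (1.0 - tau) / (1.0 - ybar))


def sigmoid(z):
    # Clip the argument so that exp() cannot overflow.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -35.0, 35.0)))


def fit_weighted_klr(K, y, w, lam=0.1, n_iter=30):
    """Newton/IRLS for the weighted, regularized kernel logistic loss

        -sum_i w_i * log p(y_i | x_i) + (lam / 2) * alpha^T K alpha,

    where p = sigmoid(K @ alpha). A direct solve is used here for brevity
    (assumption: small n); it stands in for an approximate iterative solver.
    """
    n = len(y)
    alpha = np.zeros(n)
    for _ in range(n_iter):
        p = sigmoid(K @ alpha)
        v = w * p * (1.0 - p)                    # weighted IRLS variances
        lhs = v[:, None] * K + lam * np.eye(n)   # V K + lam I
        rhs = v * (K @ alpha) + w * (y - p)      # V K alpha + W (y - p)
        alpha = np.linalg.solve(lhs, rhs)
    return alpha


# Toy imbalanced data (hypothetical): 95 non-events near 0, 5 events near 3.
X = np.concatenate([np.linspace(-0.5, 0.5, 95), np.linspace(2.8, 3.2, 5)])[:, None]
y = np.concatenate([np.zeros(95), np.ones(5)])

w = rare_event_weights(y, tau=0.02)   # assumed population event rate of 2%
K = rbf_kernel(X, X)
alpha = fit_weighted_klr(K, y, w)
p_hat = sigmoid(K @ alpha)            # fitted event probabilities on the training points
```

Because the assumed population rate (2%) is lower than the sample rate (5%), the event weights in this sketch fall below 1 and the non-event weights rise slightly above 1, which pulls the fitted probabilities toward the assumed population scale.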