A Data Complexity Analysis on Imbalanced Datasets and an Alternative Imbalance Recovering Strategy

Authors:
Cheng G. Weng;Josiah Poon
Affiliations:
The University of Sydney, Australia;The University of Sydney, Australia
Venue:
WI '06 Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence
Year:
2006

Citing 0
Cited 3

CODE: a data complexity framework for imbalanced datasets
PAKDD'09 Proceedings of the 13th Pacific-Asia international conference on Knowledge discovery and data mining: new frontiers in applied data mining

Effects of data set features on the performances of classification algorithms
Expert Systems with Applications: An International Journal

Enhancing short text clustering with small external repositories
AusDM '11 Proceedings of the Ninth Australasian Data Mining Conference - Volume 121

Abstract

The imbalanced dataset problem arises in many domains, such as web page search and scam site detection. In this paper, we propose an alternative re-sampling approach for dealing with imbalanced datasets. We demonstrate this approach with a concrete implementation, which shows promising results when compared with other standard approaches to imbalanced datasets. We also analyze data complexity to help understand imbalanced datasets, and this analysis has likewise proven to be a promising approach.