Data Squashing for Speeding Up Boosting-Based Outlier Detection

Authors:
Shutaro Inatani;Einoshin Suzuki
Venue:
ISMIS '02 Proceedings of the 13th International Symposium on Foundations of Intelligent Systems
Year:
2002

Citing 13

BIRCH: an efficient data clustering method for very large databases
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data

From data mining to knowledge discovery: an overview
Advances in knowledge discovery and data mining

Squashing flat files flatter
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining

Discovery of fraud rules for telecommunications—challenges and solutions
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining

Machine learning and data mining
Communications of the ACM

Ubiquitous B-Tree
ACM Computing Surveys (CSUR)

Feature Selection for Knowledge Discovery and Data Mining
Feature Selection for Knowledge Discovery and Data Mining

Instance Selection and Construction for Data Mining
Instance Selection and Construction for Data Mining

Support Vector Machines for Knowledge Discovery
PKDD '99 Proceedings of the Third European Conference on Principles of Data Mining and Knowledge Discovery

Algorithms for Mining Distance-Based Outliers in Large Datasets
VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases

Finding Intensional Knowledge of Distance-Based Outliers
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases

Distance-based outliers: algorithms and applications
The VLDB Journal — The International Journal on Very Large Data Bases

A brief introduction to boosting
IJCAI'99 Proceedings of the 16th international joint conference on Artificial intelligence - Volume 2

Cited 1

A boosting approach to remove class label noise
International Journal of Hybrid Intelligent Systems - Hybrid Intelligent systems in Ensembles

Abstract

In this paper, we apply data squashing to speed up boosting-based outlier detection. Because one person's noise is another person's signal, outlier detection is gaining increasing attention in data mining. To reduce the computation time of AdaBoost-based outlier detection, we compress the given data set beforehand using a simplified version of BIRCH. We investigate the effectiveness of our approach, in terms of detection accuracy and computation time, through experiments with two real-world data sets from drug stores in Japan and an artificial data set of unlawful access to a computer network.
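
The general idea of squashing a data set with BIRCH-style summaries can be illustrated with a minimal sketch. It is not the simplified BIRCH procedure used in the paper, whose details the abstract does not give. It assumes a single pass over the data, Euclidean distance, and a fixed absorption radius; the names `ClusteringFeature`, `squash`, and `threshold` are hypothetical. Each summary stores the count, linear sum, and sum of squares of the points it has absorbed, which are the standard BIRCH clustering-feature statistics.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class ClusteringFeature:
    """BIRCH-style summary of a group of points: count, linear sum, sum of squares."""
    n: int
    linear_sum: list
    square_sum: float

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def absorb(self, point):
        # Update the summary statistics without storing the point itself.
        self.n += 1
        self.linear_sum = [s + x for s, x in zip(self.linear_sum, point)]
        self.square_sum += sum(x * x for x in point)


def squash(points, threshold):
    """Single-pass squashing (illustrative assumption, not the paper's method).

    A point joins the nearest summary if it lies within `threshold` of that
    summary's centroid; otherwise it starts a new summary.
    Returns a list of (centroid, weight) pairs.
    """
    features = []
    for point in points:
        nearest = min(features, key=lambda cf: dist(cf.centroid(), point), default=None)
        if nearest is not None and dist(nearest.centroid(), point) <= threshold:
            nearest.absorb(point)
        else:
            features.append(
                ClusteringFeature(1, list(point), sum(x * x for x in point))
            )
    return [(cf.centroid(), cf.n) for cf in features]


data = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (20.0, 20.0)]
squashed = squash(data, threshold=1.0)
for centroid, weight in squashed:
    print(centroid, weight)
```

In a sketch like this, the weights record how many original examples each centroid stands for, so a learner that accepts weighted examples, such as AdaBoost, could in principle be trained on the smaller squashed set instead of the full data.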