A Parameterless Method for Efficiently Discovering Clusters of Arbitrary Shape in Large Datasets

Authors:
Andrew Foss;Osmar R. Zaïane
Affiliations:
-;-
Venue:
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
Year:
2002

Citing 0
Cited 6

Learning States and Rules for Detecting Anomalies in Time Series
Applied Intelligence

SMArTIC: towards building an accurate, robust and scalable specification miner
Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering

Decision support in construction equipment management using a nonparametric outlier mining algorithm
Expert Systems with Applications: An International Journal

A nonparametric outlier detection for effectively discovering top-n outliers from engineering data
PAKDD'06 Proceedings of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining

A new approach for cluster detection for large datasets with high dimensionality
DaWaK'05 Proceedings of the 7th international conference on Data Warehousing and Knowledge Discovery

Enhancing density-based clustering: Parameter reduction and outlier detection
Information Systems

Abstract

Clustering is the problem of grouping data based on similarity and consists of maximizing the intra-group similarity while minimizing the inter-group similarity. The problem of clustering data sets is also known as unsupervised classification, since no class labels are given. However, all existing clustering algorithms require some parameters to steer the clustering process, such as the famous k for the number of expected clusters, which constitutes a supervision of a sort. We present in this paper a new, efficient, fast and scalable clustering algorithm that clusters over a range of resolutions and finds a potential optimum clustering without requiring any parameter input. Our experiments show that our algorithm outperforms most existing clustering algorithms in quality and speed for large data sets.