Isolating top-k dense regions with filtration of sparse background

Authors:
Ramkishore Bhattacharyya
Affiliations:
Microsoft India (R&D) Pvt. Ltd., Hyderabad 500 046, India
Venue:
Pattern Recognition Letters
Year:
2011

Abstract

The classical notion of clustering is to induce an equivalence-class partition on a set of points; each class, being a homogeneous group, is called a cluster. Because this is an equivalence-class partition, every point must belong to exactly one cluster. In many applications, however, the data are distributed such that only a subset of the points gathers into distinct clusters while the rest are scattered at random. The present paper introduces an algorithm to find an optimal subset of points with sufficient grouping tendency, ideally filtering out the random ones. It builds the neighborhood population around every point and picks the top-k dense regions, with possible reshuffling of points in post-processing. The algorithm is evaluated on real and simulated data. Comparative analysis against some other state-of-the-art algorithms on different quality indices establishes the effectiveness of the approach.
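
The abstract gives only the outline of the method, so the following is a minimal sketch of the general idea rather than the paper's algorithm. It assumes a fixed-radius Euclidean neighborhood, a greedy densest-first selection, and no reshuffling step. The function name `top_k_dense_regions` and the parameters `radius` and `k` are hypothetical and chosen for illustration only.

```python
import math


def top_k_dense_regions(points, radius, k):
    """Greedy sketch: return up to k disjoint dense neighborhoods plus background.

    Assumptions (not taken from the paper): Euclidean distance, a fixed
    radius, greedy selection, and no post-processing reshuffle.
    """
    n = len(points)
    # Neighborhood population: indices of all points within `radius` of point i.
    neighborhoods = [
        {j for j in range(n) if math.dist(points[i], points[j]) <= radius}
        for i in range(n)
    ]
    unassigned = set(range(n))
    regions = []
    for _ in range(k):
        # Choose the center covering the most unassigned points;
        # ties are broken in favor of the lowest index.
        best = max(
            unassigned,
            key=lambda i: (len(neighborhoods[i] & unassigned), -i),
            default=None,
        )
        if best is None:
            break
        members = neighborhoods[best] & unassigned
        if len(members) < 2:
            break  # only isolated points remain; treat them as background
        regions.append(sorted(members))
        unassigned -= members
    # Everything not claimed by a dense region is the sparse background.
    return regions, sorted(unassigned)


# Two tight groups plus two stray points.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (9.0, 0.0), (-4.0, 7.0),
]
regions, background = top_k_dense_regions(points, radius=0.5, k=2)
print(regions)     # [[3, 4, 5, 6], [0, 1, 2]]
print(background)  # [7, 8]
```

In this toy input the four-point group is returned first because its neighborhood is the most populated, and the two stray points are left unassigned rather than forced into a cluster, which is the behavior the abstract contrasts with a strict partition.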