Clustering high dimensional data: A graph-based relaxed optimization approach

Authors:
Chi-Hoon Lee (Computing Science Department, University of Alberta, Canada)
Osmar R. Zaïane (Computing Science Department, University of Alberta, Canada)
Ho-Hyun Park (School of Electrical and Electronics Engineering, Chung Ang University, South Korea)
Jiayuan Huang (School of Computer Science, University of Waterloo, Canada)
Russell Greiner (Computing Science Department, University of Alberta, Canada)
Venue:
Information Sciences: an International Journal
Year:
2008

Abstract

Clustering is one of the most studied data mining tasks, yet it remains a challenging problem despite the many clustering approaches that have been proposed. Graph-based approaches treat clustering as a global optimization problem, whereas many other methods are local. In this paper, we propose a novel graph-based algorithm, "GBR", that relaxes the objective of a well-defined graph-based method, improving accuracy while keeping the method simple. The primary motivation for relaxing the objective is to allow the reformulated objective to find well-distributed cluster indicators for complicated data instances. The relaxation yields an analytical solution, which avoids the approximate iterative methods adopted in many other graph-based approaches. Experiments on synthetic and real data sets show that our relaxation achieves excellent clustering results. Our key contributions are: (1) an analytical solution to the global clustering task, as opposed to approximate iterative approaches; (2) a very simple implementation using existing optimization packages; (3) an algorithm that requires relatively less computation time, as a function of the number of data instances to cluster, than other well-defined methods in the literature.
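
For readers unfamiliar with the idea of relaxing a graph-based clustering objective, the following is a minimal, generic sketch. It is not the GBR algorithm, whose formulation the abstract does not give. As an assumption for illustration only, it uses the standard spectral relaxation of the normalized-cut objective: the discrete cluster indicator is replaced by a real-valued vector, and the resulting problem is solved by an eigen-decomposition. The function name `spectral_bipartition`, the Gaussian affinity, and the parameter `sigma` are hypothetical choices, not taken from the paper.

```python
import numpy as np


def spectral_bipartition(points, sigma=1.0):
    """Split points into two clusters via a relaxed normalized-cut objective.

    Illustrative only: this is the textbook spectral relaxation, not GBR.
    """
    # Build a fully connected affinity graph with Gaussian edge weights.
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    weights = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)

    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    degrees = weights.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degrees)
    laplacian = (
        np.eye(len(points))
        - d_inv_sqrt[:, None] * weights * d_inv_sqrt[None, :]
    )

    # eigh returns eigenvalues in ascending order; the eigenvector of the
    # second-smallest eigenvalue is the relaxed (real-valued) indicator.
    _, eigvecs = np.linalg.eigh(laplacian)
    indicator = d_inv_sqrt * eigvecs[:, 1]

    # Round the relaxed indicator back to a discrete two-way partition.
    return (indicator > 0).astype(int)


# Two well-separated synthetic blobs (assumed toy data).
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.3, size=(20, 2))
blob_b = rng.normal(loc=3.0, scale=0.3, size=(20, 2))
points = np.vstack([blob_a, blob_b])
labels = spectral_bipartition(points)
```

Thresholding the relaxed indicator at zero is the simplest rounding rule; which label (0 or 1) each blob receives depends on the arbitrary sign of the eigenvector.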