Fast algorithms for projected clustering

Authors:
Charu C. Aggarwal; Joel L. Wolf; Philip S. Yu; Cecilia Procopiuc; Jong Soo Park
Affiliations:
IBM T. J. Watson Research Center, Yorktown Heights, NY; IBM T. J. Watson Research Center, Yorktown Heights, NY; IBM T. J. Watson Research Center, Yorktown Heights, NY; Duke University, Durham, NC; Sungshin Women's University, Seoul, Korea
Venue:
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Year:
1999

Abstract

The clustering problem is well known in the database literature for its numerous applications in problems such as customer segmentation, classification, and trend analysis. Unfortunately, all known algorithms tend to break down in high-dimensional spaces because of the inherent sparsity of the points. In such high-dimensional spaces, not all dimensions may be relevant to a given cluster. One way of handling this is to pick the closely correlated dimensions and find clusters in the corresponding subspace. Traditional feature selection algorithms attempt to achieve this. The weakness of this approach is that in typical high-dimensional data mining applications, different sets of points may cluster better for different subsets of dimensions. The number of dimensions in each such cluster-specific subspace may also vary. Hence, it may be impossible to find a single small subset of dimensions for all the clusters. We therefore discuss a generalization of the clustering problem, referred to as the projected clustering problem, in which the subsets of dimensions selected are specific to the clusters themselves. We develop an algorithmic framework for solving the projected clustering problem and test its performance on synthetic data.
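
The idea of cluster-specific dimensions can be illustrated with a small synthetic example. The sketch below is not the algorithm from the paper. It is a toy illustration in which cluster membership is already known, and the helper names (`make_projected_cluster`, `tightest_dimensions`), the six-dimensional space, and the spread values are all assumptions made for this example. Each synthetic cluster is tight on its own subset of dimensions and uniformly scattered on the rest, and the dimensions with the smallest spread are recovered per cluster.

```python
import random
import statistics


def make_projected_cluster(n, dim, relevant, center, rng,
                           spread=0.5, low=0.0, high=100.0):
    """Generate n points that are tight around `center` on the `relevant`
    dimensions and uniformly scattered on every other dimension.
    (Hypothetical helper for this illustration only.)"""
    points = []
    for _ in range(n):
        point = [rng.uniform(low, high) for _ in range(dim)]
        for d in relevant:
            point[d] = rng.gauss(center[d], spread)
        points.append(point)
    return points


def tightest_dimensions(points, k):
    """Return, in ascending order, the k dimensions along which the
    points have the smallest standard deviation.
    (Hypothetical helper; assumes the cluster's points are already known.)"""
    dim = len(points[0])
    spreads = [statistics.pstdev([p[d] for p in points]) for d in range(dim)]
    ranked = sorted(range(dim), key=lambda d: spreads[d])
    return sorted(ranked[:k])


rng = random.Random(0)          # fixed seed so the run is repeatable
center = [50.0] * 6             # assumed common center, for simplicity

# Two clusters in the same 6-dimensional space, each tight on a
# different subset of dimensions, and with a different subset size.
cluster_a = make_projected_cluster(200, 6, [0, 1], center, rng)
cluster_b = make_projected_cluster(200, 6, [2, 3, 4], center, rng)

dims_a = tightest_dimensions(cluster_a, 2)
dims_b = tightest_dimensions(cluster_b, 3)
print("cluster A dimensions:", dims_a)
print("cluster B dimensions:", dims_b)
```

In this toy setting no single small subset of dimensions suits both clusters, which is the situation the abstract describes. The actual projected clustering problem is harder, because the assignment of points to clusters and the per-cluster dimensions must be found together.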