Subspace clustering is an extension of traditional clustering that seeks to find clusters in different subspaces within a dataset. In high-dimensional data, many dimensions are often irrelevant and can mask existing clusters in noisy data. Feature selection removes irrelevant and redundant dimensions by analyzing the entire dataset. Subspace clustering algorithms instead localize the search for relevant dimensions, allowing them to find clusters that exist in multiple, possibly overlapping subspaces. There are two major branches of subspace clustering, distinguished by their search strategy. Top-down algorithms find an initial clustering in the full set of dimensions and evaluate the subspaces of each cluster, iteratively improving the results. Bottom-up approaches find dense regions in low-dimensional spaces and combine them to form clusters. This paper presents a survey of the various subspace clustering algorithms along with a hierarchy organizing the algorithms by their defining characteristics. We then compare the two main approaches to subspace clustering using empirical scalability and accuracy tests and discuss some potential applications where subspace clustering could be particularly useful.
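To make the bottom-up idea concrete, the following is a minimal sketch of a grid-based search: it finds dense one-dimensional units first, then tests only those two-dimensional units that can be built from them. It is an illustration under stated assumptions, not any specific algorithm from the survey. The function name `dense_units`, the equal-width grid, and the parameters `bins` and `min_count` are all hypothetical choices made for this example.

```python
from collections import Counter
from itertools import combinations


def dense_units(points, bins=4, min_count=3, lo=0.0, hi=1.0):
    """Find dense grid units in 1-D and 2-D subspaces (illustrative only).

    A 1-D unit is a (dimension, bin_index) pair; a 2-D unit is a pair of
    1-D units from different dimensions. All names and thresholds here
    are assumptions made for the sketch.
    """
    width = (hi - lo) / bins

    def bin_of(value):
        # Clamp so that value == hi falls into the last bin.
        return min(int((value - lo) / width), bins - 1)

    n_dims = len(points[0])
    binned = [tuple(bin_of(v) for v in p) for p in points]

    # Step 1: count points per bin in every single dimension.
    counts_1d = Counter((d, b[d]) for b in binned for d in range(n_dims))
    dense_1d = {unit for unit, c in counts_1d.items() if c >= min_count}

    # Step 2: a 2-D unit can only be dense if both of its 1-D projections
    # are dense, so candidates are built from dense 1-D units alone.
    dense_2d = {}
    for (d1, b1), (d2, b2) in combinations(sorted(dense_1d), 2):
        if d1 == d2:
            continue  # two bins of the same dimension do not form a subspace
        count = sum(1 for b in binned if b[d1] == b1 and b[d2] == b2)
        if count >= min_count:
            dense_2d[((d1, b1), (d2, b2))] = count
    return dense_1d, dense_2d


# Four points agree on dimensions 0 and 1 but are spread out on dimension 2.
points = [
    (0.10, 0.10, 0.05),
    (0.12, 0.15, 0.30),
    (0.15, 0.12, 0.55),
    (0.20, 0.20, 0.80),
    (0.60, 0.90, 0.95),
    (0.90, 0.40, 0.10),
]
dense_1d, dense_2d = dense_units(points)
print(dense_1d)
print(dense_2d)
```

In this toy data the cluster is visible only in the subspace spanned by dimensions 0 and 1; dimension 2 contributes no dense unit, so it never enters the candidate set. Pruning candidates this way is what keeps the bottom-up search from enumerating every subspace.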