The clustering problem is well known in the database literature for its many applications, including customer segmentation, classification, and trend analysis. Unfortunately, all known algorithms tend to break down in high-dimensional spaces because of the inherent sparsity of the points. In such spaces, not all dimensions may be relevant to a given cluster. One way of handling this is to pick the closely correlated dimensions and find clusters in the corresponding subspace, which is what traditional feature selection algorithms attempt to do. The weakness of this approach is that, in typical high-dimensional data mining applications, different sets of points may cluster better in different subsets of dimensions, and the number of dimensions in each such cluster-specific subspace may also vary. Hence, it may be impossible to find a single small subset of dimensions that suits all the clusters. We therefore discuss a generalization of the clustering problem, referred to as the projected clustering problem, in which the subsets of dimensions selected are specific to the clusters themselves. We develop an algorithmic framework for solving the projected clustering problem and test its performance on synthetic data.
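To make the idea of cluster-specific dimensions concrete, here is a minimal Python sketch. It is an illustration only and is not the algorithmic framework the abstract refers to: it assumes the cluster memberships are already known and simply picks, for each cluster, the dimensions with the smallest spread. The helper names (`make_cluster`, `relevant_dimensions`), the synthetic data layout, and all parameter values are hypothetical choices made for this example.

```python
import random
from statistics import pstdev


def make_cluster(n, dims, tight_dims, center, rng):
    """Generate n points that are tight around `center` only in `tight_dims`.

    All other coordinates are uniform noise in [0, 100]. This layout is an
    assumption made for the illustration, not data from the source.
    """
    points = []
    for _ in range(n):
        point = [
            rng.gauss(center, 1.0) if d in tight_dims else rng.uniform(0.0, 100.0)
            for d in range(dims)
        ]
        points.append(point)
    return points


def relevant_dimensions(points, k):
    """Return the k dimensions along which the points have the smallest spread."""
    dims = len(points[0])
    spread = [pstdev([p[d] for p in points]) for d in range(dims)]
    ranked = sorted(range(dims), key=lambda d: spread[d])
    return sorted(ranked[:k])


rng = random.Random(0)  # fixed seed so the run is repeatable

# Two clusters in a 6-dimensional space, each tight in a different subspace
# and with a different number of relevant dimensions.
cluster_a = make_cluster(200, 6, {0, 1}, 20.0, rng)
cluster_b = make_cluster(200, 6, {3, 4, 5}, 70.0, rng)

dims_a = relevant_dimensions(cluster_a, 2)
dims_b = relevant_dimensions(cluster_b, 3)

print("cluster A dimensions:", dims_a)
print("cluster B dimensions:", dims_b)
print("shared dimensions:", sorted(set(dims_a) & set(dims_b)))
```

In this toy setup the two clusters share no relevant dimension, so a single global feature subset cannot describe both of them well. A real projected clustering method would also have to discover the cluster memberships and the number of dimensions per cluster, which this sketch takes as given.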