An algorithm is introduced that distinguishes relevant data points from randomly distributed noise. The algorithm is related to subspace clustering based on axis-parallel projections, but it considers membership in any projected cluster of a given side length rather than in one particular cluster. An aggregate measure is introduced that is based on the total number of points that are close to the given point in all 2^d possible projections of a d-dimensional hypercube. No explicit summation over subspaces is required to evaluate this measure. Attribute values are normalized by rank order to avoid making assumptions about the distribution of random data. The effectiveness of the algorithm is demonstrated through comparison with conventional outlier detection on a real microarray data set as well as on time series subsequence data.
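The abstract does not give the formula for the aggregate measure, so the following is only a minimal sketch of one way such a measure could be evaluated without enumerating subspaces. It assumes that "close" means differing by at most `side` in every attribute of a projection, and it uses the observation that a neighbor close in m attributes is close in exactly 2^m axis-parallel projections (counting the empty one). The names `rank_normalize`, `projection_score`, `brute_force_score`, and `side` are hypothetical and do not come from the paper.

```python
import numpy as np


def rank_normalize(data):
    """Replace each attribute value by its rank order, scaled into [0, 1).

    Assumption: ties are broken arbitrarily by argsort.
    """
    n = data.shape[0]
    ranks = np.argsort(np.argsort(data, axis=0), axis=0)
    return ranks / n


def pairwise_close(data, side):
    """close[i, j, k] is True when points i and j differ by <= side in attribute k."""
    normalized = rank_normalize(np.asarray(data, dtype=float))
    return np.abs(normalized[:, None, :] - normalized[None, :, :]) <= side


def projection_score(data, side):
    """Closed form: each neighbor contributes 2**m, m = number of close attributes."""
    close = pairwise_close(data, side)
    d = close.shape[2]
    m = close.sum(axis=2)
    # Subtract 2**d to drop each point's match with itself.
    return (2 ** m).sum(axis=1) - 2 ** d


def brute_force_score(data, side):
    """Reference version that explicitly sums over all 2**d projections."""
    close = pairwise_close(data, side)
    n, _, d = close.shape
    total = np.zeros(n, dtype=np.int64)
    for mask in range(2 ** d):
        together = np.ones((n, n), dtype=bool)
        for k in range(d):
            if (mask >> k) & 1:
                together &= close[:, :, k]
        total += together.sum(axis=1) - 1  # exclude the point itself
    return total
```

Under these assumptions, points with low scores have few neighbors across projections and would be the candidates for noise. The brute-force version is included only to check that the closed form gives the same counts on small inputs; its cost grows as 2^d, while the closed form needs one pass over point pairs and attributes.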