BIRCH: an efficient data clustering method for very large databases
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Advances in knowledge discovery and data mining
Advances in knowledge discovery and data mining
CURE: an efficient clustering algorithm for large databases
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Automatic subspace clustering of high dimensional data for data mining applications
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Optimal multi-step k-nearest neighbor search
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
OPTICS: ordering points to identify the clustering structure
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Fast algorithms for projected clustering
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
LOF: identifying density-based local outliers
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Efficient algorithms for mining outliers from large data sets
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Outlier detection for high dimensional data
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
FindOut: finding outliers in very large datasets
Knowledge and Information Systems
Refining Initial Points for K-Means Clustering
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Algorithms for Mining Distance-Based Outliers in Large Datasets
VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
WaveCluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases
VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Efficient and Effective Clustering Methods for Spatial Data Mining
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
STING: A Statistical Information Grid Approach to Spatial Data Mining
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
ROCK: A Robust Clustering Algorithm for Categorical Attributes
ICDE '99 Proceedings of the 15th International Conference on Data Engineering
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Proceedings of the 46th Annual Southeast Regional Conference on XX
In this paper, we continue our research on data analysis, building on our previous work on cluster-outlier iterative detection in subspace. Based on the observation that, in noisy data sets, clusters and outliers cannot be processed efficiently when they are handled separately, we previously proposed a cluster-outlier iterative detection algorithm in the full data space [22]. Because real data sets normally have high dimensionality, and natural clusters and outliers do not exist in the full data space, we then proposed an algorithm (SubCOID) to detect clusters and outliers in subspace [21]. However, associating each cluster and each outlier with its own subset of dimensions is not a trivial task. In this paper, we present an improved SubCOID algorithm that applies a novel approach to choosing a unique subset of dimensions for each cluster and each outlier. The selection is based on the intra-relationship within clusters, the intra-relationship within outliers, and the inter-relationship between clusters and outliers. This process is performed iteratively until a termination condition is reached. The algorithm can be applied in many fields, such as pattern recognition, data clustering, and signal processing.