Automatic subspace clustering of high dimensional data for data mining applications
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Fast algorithms for projected clustering
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Finding generalized projected clusters in high dimensional spaces
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
A Monte Carlo algorithm for fast projective clustering
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Values
Data Mining and Knowledge Discovery
Local Dimensionality Reduction: A New Approach to Indexing High Dimensional Spaces
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Feature Weighting in k-Means Clustering
Machine Learning
Subspace clustering for high dimensional data: a review
ACM SIGKDD Explorations Newsletter - Special issue on learning from imbalanced datasets
HARP: A Practical Projected Clustering Algorithm
IEEE Transactions on Knowledge and Data Engineering
Automated Variable Weighting in k-Means Type Clustering
IEEE Transactions on Pattern Analysis and Machine Intelligence
On Discovery of Extremely Low-Dimensional Clusters Using Semi-Supervised Projected Clustering
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
An Entropy Weighting k-Means Algorithm for Subspace Clustering of High-Dimensional Sparse Data
IEEE Transactions on Knowledge and Data Engineering
Subspace clustering of text documents with feature weighting k-means algorithm
PAKDD'05 Proceedings of the 9th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
A 2-phased approach for detecting multiple loci associations with traits
International Journal of Data Mining and Bioinformatics
Hi-index | 0.00 |
SKM-SNP, SNP markers detection program, is proposed to identify a set of relevant SNPs for the association between a disease and multiple marker genotypes. We employ a subspace categorical clustering algorithm to compute a weight for each SNP in the group of patient samples and the group of normal samples, and use the weights to identify the subsets of relevant SNPs that categorize these two groups. The experiments on both Schizophrenia and Parkinson Disease data sets containing genome-wide SNPs are reported to demonstrate the program. Results indicate that our method can find some relevant SNPs that categorize the disease samples. The online SKM-SNP program is available at http://www.math.hkbu.edu.hk/~mng/SKM-SNP/SKM-SNP.html.