Traditional clustering is a descriptive task that seeks to identify homogeneous groups of objects based on the values of their attributes. While domain knowledge is always the best way to justify clustering, few clustering algorithms have ever taken domain knowledge into consideration. In this paper, domain knowledge is represented by a hierarchical ontology. We develop a framework that incorporates domain knowledge directly into the clustering process, yielding a set of clusters with strong ontology implications. During the clustering process, ontology information is used to efficiently prune the exponential search space of subspace clustering algorithms. At the same time, the algorithm automatically generates an interpretation of the clustering result by mapping the naturally hierarchically organized subspace clusters with significant categorical enrichment onto the ontology hierarchy. Our experiments on a set of gene expression data using gene ontology demonstrate that our ontology-driven pruning technique significantly improves clustering performance with minimal degradation of cluster quality. In addition, many hierarchical organizations of gene clusters corresponding to sub-hierarchies in gene ontology were successfully captured.
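The abstract does not say how "significant categorical enrichment" is measured. In gene ontology analyses a common choice is a one-sided hypergeometric test, so the following is only a minimal sketch under that assumption, not the authors' method. The function name, parameter names, and example numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, cluster_size, hits):
    """Upper-tail hypergeometric probability P(X >= hits).

    population   -- total number of genes considered (assumed background set)
    annotated    -- genes in the population annotated with the ontology term
    cluster_size -- number of genes in the cluster being tested
    hits         -- genes in the cluster annotated with the term
    """
    if not 0 <= annotated <= population:
        raise ValueError("annotated must lie between 0 and population")
    if not 0 <= cluster_size <= population:
        raise ValueError("cluster_size must lie between 0 and population")

    # The overlap can never exceed either the cluster or the annotated set.
    max_hits = min(annotated, cluster_size)
    total = comb(population, cluster_size)

    # Sum the probability mass for every overlap at least as large as `hits`.
    # math.comb returns 0 when k > n, which covers impossible overlaps.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, cluster_size - k)
        for k in range(max(hits, 0), max_hits + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 6000 genes, 40 carry the term,
    # and a cluster of 25 genes contains 5 of them.
    p = enrichment_p_value(6000, 40, 25, 5)
    print(f"enrichment p-value: {p:.3e}")
```

A small p-value suggests the cluster contains more genes from the category than chance would explain. When many clusters and ontology terms are tested together, a multiple-testing correction would normally be applied as well.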