Most structured data in real-life applications is stored in relational databases containing multiple semantically linked relations. Unlike clustering in a single table, clustering objects in a relational database usually involves a large number of features that convey very different semantic information, and using all of them indiscriminately is unlikely to produce meaningful results. Because the user knows her clustering goal, we propose a new approach called CrossClus, which performs multi-relational clustering under the user's guidance. Unlike semi-supervised clustering, which requires the user to provide a training set, CrossClus minimizes the user's effort by relying on a very simple form of guidance: the user only selects one feature, or a small set of features, pertinent to the clustering goal, and CrossClus searches for other pertinent features across multiple relations. Each candidate feature is evaluated by whether it clusters objects similarly to the user-specified features. We design efficient and accurate approaches for both feature selection and object clustering. Our comprehensive experiments demonstrate the effectiveness and scalability of CrossClus.
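The idea of scoring a candidate feature by how similarly it groups objects compared with a user-specified feature can be illustrated with a small sketch. This is not the CrossClus implementation: it assumes categorical features, treats two objects as similar under a feature only when they share the same value, and compares features by the cosine similarity of their object-pair similarity vectors. The function names (`pair_similarities`, `feature_similarity`, `rank_candidates`) and the toy data are hypothetical.

```python
from math import sqrt


def pair_similarities(values):
    """Similarity of every object pair (i < j) under one categorical feature.

    Illustrative assumption: a pair scores 1.0 if both objects share the
    same feature value and 0.0 otherwise.
    """
    n = len(values)
    return [1.0 if values[i] == values[j] else 0.0
            for i in range(n) for j in range(i + 1, n)]


def feature_similarity(f, g):
    """Cosine similarity between the pair-similarity vectors of two features."""
    vf, vg = pair_similarities(f), pair_similarities(g)
    dot = sum(a * b for a, b in zip(vf, vg))
    norm = sqrt(sum(a * a for a in vf)) * sqrt(sum(b * b for b in vg))
    return dot / norm if norm else 0.0


def rank_candidates(user_feature, candidates):
    """Order candidate features by agreement with the user-specified feature."""
    scored = [(name, feature_similarity(user_feature, vals))
              for name, vals in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: six objects, one user-selected feature,
# and two candidate features found in other relations.
research_area = ["db", "db", "db", "ai", "ai", "ai"]
candidates = {
    "advisor_group": ["g1", "g1", "g1", "g2", "g2", "g2"],
    "home_city": ["x", "y", "x", "y", "x", "y"],
}

ranking = rank_candidates(research_area, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setting, a candidate that partitions the objects the same way as the user's feature ranks above one that cuts across that partition, which is the sense in which a feature is "pertinent" here. Set-valued or numerical features would need a different pairwise similarity.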