CrossClus: user-guided multi-relational clustering

Authors:
Xiaoxin Yin; Jiawei Han; Philip S. Yu
Affiliations:
Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, USA; Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, USA; IBM T.J. Watson Research Center, Yorktown Heights, USA
Venue:
Data Mining and Knowledge Discovery
Year:
2007

Abstract

Most structured data in real-life applications are stored in relational databases containing multiple semantically linked relations. Unlike clustering in a single table, clustering objects in a relational database usually involves a large number of features conveying very different semantic information, and using all features indiscriminately is unlikely to generate meaningful results. Because the user knows the goal of the clustering, we propose a new approach called CrossClus, which performs multi-relational clustering under the user's guidance. Unlike semi-supervised clustering, which requires the user to provide a training set, we minimize the user's effort by using a very simple form of user guidance. The user is only required to select one feature or a small set of features pertinent to the clustering goal, and CrossClus searches for other pertinent features in multiple relations. Each feature is evaluated by whether it clusters objects in a way similar to the user-specified features. We design efficient and accurate approaches for both feature selection and object clustering. Our comprehensive experiments demonstrate the effectiveness and scalability of CrossClus.
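The idea of judging a feature by whether it groups objects like the user-specified feature can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not define the similarity measure, so the code below assumes one simple choice, the cosine similarity between the "same value or not" indicators over all object pairs. The function names, the toy student data, and the 0.5 threshold are hypothetical and serve only as illustration.

```python
from itertools import combinations
from math import sqrt


def pair_similarity_vector(values):
    """For every unordered pair of objects, 1.0 if they share the value, else 0.0."""
    return [1.0 if a == b else 0.0 for a, b in combinations(values, 2)]


def feature_agreement(guide, candidate):
    """Cosine similarity between the pairwise groupings induced by two features.

    Assumption: this stands in for "clusters objects in a similar way";
    the abstract does not specify the actual measure.
    """
    u = pair_similarity_vector(guide)
    v = pair_similarity_vector(candidate)
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def select_pertinent(guide, candidates, threshold=0.5):
    """Return candidate feature names whose agreement with the guide meets the threshold."""
    scores = {name: feature_agreement(guide, vals) for name, vals in candidates.items()}
    kept = [name for name, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda name: -scores[name])


# Hypothetical toy data: six students, one value per student for each feature.
guide_feature = ["db", "db", "db", "ai", "ai", "ai"]  # user-selected: research area
candidate_features = {
    "advisor_group": ["g1", "g1", "g1", "g2", "g2", "g2"],  # groups students like the guide
    "home_state": ["IL", "NY", "IL", "NY", "IL", "NY"],     # mostly unrelated to the guide
}

selected = select_pertinent(guide_feature, candidate_features)
print(selected)
```

In this toy setting, a feature that partitions the students exactly as the research area does scores highest, while a feature that cuts across research areas scores low and is dropped. In a real multi-relational database the candidate features would come from other relations reached through joins, which this sketch does not model.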