Attribute data and relationship data are two principal types of data, representing the intrinsic and extrinsic properties of entities. While attribute data have been the main source of data for cluster analysis, relationship data such as social networks or metabolic networks are becoming increasingly available. The two data types often carry complementary information, as in market segmentation and community identification, which calls for a joint cluster analysis of both to achieve better results. In this article, we introduce the novel Connected k-Center (CkC) problem, a clustering model that takes into account attribute data as well as relationship data. We analyze the complexity of the problem and prove that it is NP-hard. We therefore study its approximability and present a constant-factor approximation algorithm. For the special case of the CkC problem in which the relationship data form a tree, we propose a dynamic programming method that gives an optimal solution in polynomial time. We further present NetScan, a heuristic algorithm that is efficient and effective for large real databases. Our extensive experimental evaluation on real datasets demonstrates the meaningfulness and accuracy of the NetScan results.
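To make the joint use of both data types concrete, the sketch below scores a candidate clustering. The abstract does not spell out the formal definition, so this assumes, going by the name Connected k-Center, that every cluster must induce a connected subgraph of the relationship graph and that the objective is the largest attribute-space distance from a node to its cluster center. The function names and the toy data are hypothetical, and this is only a checker for a given solution, not the approximation algorithm or NetScan.

```python
from collections import defaultdict, deque
from math import dist  # Euclidean distance, Python 3.8+


def cluster_is_connected(members, adjacency):
    """Return True if `members` induce a connected subgraph (BFS restricted to members)."""
    members = set(members)
    if not members:
        return True
    start = next(iter(members))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour in members and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen == members


def ckc_radius(points, edges, assignment):
    """Return the largest node-to-center distance, or None if a cluster is invalid.

    points:     dict node -> attribute vector (tuple of floats)
    edges:      iterable of undirected (u, v) pairs (the relationship data)
    assignment: dict node -> center node of its cluster
    Assumption: a cluster is valid only if it contains its center and is connected.
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    clusters = defaultdict(list)
    for node, center in assignment.items():
        clusters[center].append(node)

    for center, members in clusters.items():
        if center not in members or not cluster_is_connected(members, adjacency):
            return None
    return max(dist(points[n], points[c]) for n, c in assignment.items())


# Toy instance: four nodes on a line in attribute space, linked as a path a-b-c-d.
points = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (5.0, 0.0), "d": (6.0, 0.0)}
edges = [("a", "b"), ("b", "c"), ("c", "d")]

connected = {"a": "a", "b": "a", "c": "d", "d": "d"}      # clusters {a, b} and {c, d}
disconnected = {"a": "a", "c": "a", "b": "d", "d": "d"}   # clusters {a, c} and {b, d}

print(ckc_radius(points, edges, connected))
print(ckc_radius(points, edges, disconnected))
```

In the toy instance, the second assignment is rejected purely because of the relationship data: nodes a and c are not linked within their cluster, regardless of how close they are in attribute space.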