Normalized Cuts and Image Segmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence
Clustering with Instance-level Constraints
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
A unified framework for model-based clustering
The Journal of Machine Learning Research
Relationship-Based Clustering and Visualization for High-Dimensional Data Mining
INFORMS Journal on Computing
A probabilistic framework for semi-supervised clustering
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Kernel k-means: spectral clustering and normalized cuts
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Generative model-based document clustering: a comparative study
Knowledge and Information Systems
Semi-supervised graph clustering: a kernel approach
ICML '05 Proceedings of the 22nd international conference on Machine learning
Spectral Clustering in Social Networks
Advances in Web Mining and Web Usage Analysis
Proceedings of the 3rd Workshop on Social Network Mining and Analysis
An efficient block model for clustering sparse graphs
Proceedings of the Eighth Workshop on Mining and Learning with Graphs
Mining networks with shared items
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
A spectral approach to clustering numerical vectors as nodes in a network
Pattern Recognition
Relation strength-aware clustering of heterogeneous information networks with incomplete attributes
Proceedings of the VLDB Endowment
Finding itemset-sharing patterns in a large itemset-associated graph
PAKDD'10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part II
Community detection in incomplete information networks
Proceedings of the 21st international conference on World Wide Web
Mining coherent subgraphs in multi-layer graphs with edge labels
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
RMiCS: a robust approach for mining coherent subgraphs in edge-labeled multi-layer graphs
Proceedings of the 25th International Conference on Scientific and Statistical Database Management
Social influence based clustering of heterogeneous information networks
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Finding contexts of social influence in online social networks
Proceedings of the 7th Workshop on Social Network Mining and Analysis
We address the problem of clustering numerical vectors that are also nodes in a network. The problem setting is essentially equivalent to constrained clustering (Wagstaff and Cardie) and semi-supervised clustering (Basu et al.), but our focus is on the optimal combination of two heterogeneous data sources. One application is web pages, which can be represented as numerical vectors of their contents (e.g., term frequencies) and which are hyperlinked to each other, forming a network. Another typical application is genes, whose behavior can be measured numerically and for which a gene network can be obtained from another data source.

We first define a new graph clustering measure, which we call normalized network modularity, by balancing the cluster sizes in the original modularity. We then propose a new clustering method that integrates the cost of clustering numerical vectors with the cost of maximizing the normalized network modularity into a single spectral relaxation problem. Our learning algorithm is based on spectral clustering, which turns the problem into an eigenvalue problem, and uses k-means for the final cluster assignments. A significant advantage of our method is that the weight parameter balancing the two costs can be optimized from the given data by choosing the value that gives the minimum total cost. We evaluated the performance of the proposed method on a variety of datasets, including synthetic data and real-world data from molecular biology. Experimental results showed that our method is effective for clustering with both numerical vectors and a network.
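To make the two costs in the abstract concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It evaluates a fixed partition rather than solving the spectral relaxation. Several things here are assumptions made for illustration: the size-balanced modularity divides each cluster's contribution by its number of nodes, the vector cost is the usual within-cluster sum of squares, and the two are mixed linearly by a hypothetical `weight` parameter. The function names and the toy graph (two triangles joined by one edge) are likewise invented for the example.

```python
def modularity(adj, labels, normalize=False):
    """Modularity of a partition of an undirected, unweighted graph.

    adj: symmetric 0/1 adjacency matrix (list of lists).
    labels: one cluster label per node.
    normalize: if True, divide each cluster's contribution by its size.
        This is an ASSUMED form of the size balancing, not the paper's formula.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    two_m = sum(degree)  # twice the number of edges
    total = 0.0
    for c in set(labels):
        members = [i for i in range(n) if labels[i] == c]
        # Sum of (A_ij - d_i * d_j / 2m) over ordered pairs inside the cluster.
        within = sum(adj[i][j] - degree[i] * degree[j] / two_m
                     for i in members for j in members)
        if normalize:
            within /= len(members)
        total += within / two_m
    return total


def kmeans_cost(vectors, labels):
    """Within-cluster sum of squared distances to each cluster mean."""
    total = 0.0
    for c in set(labels):
        pts = [v for v, lab in zip(vectors, labels) if lab == c]
        dim = len(pts[0])
        mean = [sum(p[d] for p in pts) / len(pts) for d in range(dim)]
        total += sum((p[d] - mean[d]) ** 2 for p in pts for d in range(dim))
    return total


def combined_cost(vectors, adj, labels, weight):
    """ASSUMED linear mix: low vector cost and high modularity are both good."""
    return (weight * kmeans_cost(vectors, labels)
            - (1.0 - weight) * modularity(adj, labels, normalize=True))


# Toy data: two triangles (nodes 0-2 and 3-5) joined by the edge 2-3.
EDGES = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
ADJ = [[0] * 6 for _ in range(6)]
for i, j in EDGES:
    ADJ[i][j] = ADJ[j][i] = 1

# One-dimensional feature vectors that agree with the graph structure.
VECTORS = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
GOOD = [0, 0, 0, 1, 1, 1]  # one cluster per triangle
BAD = [0, 1, 0, 1, 0, 1]   # clusters that cut across both triangles

if __name__ == "__main__":
    for name, labels in (("good", GOOD), ("bad", BAD)):
        print(name, combined_cost(VECTORS, ADJ, labels, weight=0.5))
```

On this toy input the partition that follows the triangles has the lower combined cost. In the setting the abstract describes, a comparison of total costs of this kind would be made across candidate weight values rather than across hand-written partitions.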