Relation strength-aware clustering of heterogeneous information networks with incomplete attributes

Authors:
Yizhou Sun; Charu C. Aggarwal; Jiawei Han
Affiliations:
University of Illinois at Urbana-Champaign, Urbana, IL; IBM T. J. Watson Research Center, Yorktown Heights, NY; University of Illinois at Urbana-Champaign, Urbana, IL
Venue:
Proceedings of the VLDB Endowment
Year:
2012

Abstract

With the rapid development of online social media, online shopping sites, and cyber-physical systems, heterogeneous information networks have become increasingly popular and content-rich. Such networks often contain multiple types of objects and links, as well as different kinds of attributes. Clustering these objects can provide useful insights in many applications. However, clustering such networks is challenging because (a) the attribute values of objects are often incomplete, so an object may carry only some of the attributes, or none at all, that are needed to assign it to a cluster; and (b) links of different types carry different semantic meanings, and it is difficult to determine their relative importance for a given clustering purpose. In this paper, we address these challenges by proposing a model-based clustering algorithm. We design a probabilistic model that clusters objects of different types into a common hidden space, using a user-specified set of attributes as well as the links from different relations. The strengths of different link types are learned automatically and are determined by the given clustering purpose. We design an iterative algorithm for solving the clustering problem, in which the strengths of different link types and the quality of the clustering results mutually enhance each other. Our experimental results on real and synthetic data sets demonstrate the effectiveness and efficiency of the algorithm.
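
The mutual reinforcement between link-type strengths and clustering quality described in the abstract can be illustrated with a small toy sketch. This is not the paper's probabilistic model, which the abstract does not specify. It is a simplified alternating heuristic under stated assumptions: a single numeric attribute with a fixed-variance Gaussian per cluster, soft memberships updated by a softmax over attribute fit plus strength-weighted neighbor memberships, and a strength per relation set from how often its links join objects with similar memberships. The function name, the parameters, and the relation names "good" and "noisy" are all hypothetical.

```python
import math


def cluster_with_relation_strengths(n, k, attrs, links, iters=20, var=1.0):
    """Toy alternating scheme (an assumption, not the paper's model).

    n      number of objects (ids 0..n-1)
    k      number of clusters
    attrs  dict {object id: observed numeric value}; absent ids have no attribute
    links  dict {relation name: list of undirected (i, j) pairs}
    """
    # Undirected neighbor lists, kept separately for each relation.
    nbrs = {r: [[] for _ in range(n)] for r in links}
    for r, pairs in links.items():
        for i, j in pairs:
            nbrs[r][i].append(j)
            nbrs[r][j].append(i)

    lo, hi = min(attrs.values()), max(attrs.values())
    # Cluster centers start evenly spaced over the observed attribute range.
    mu = [lo + (hi - lo) * c / max(k - 1, 1) for c in range(k)]
    theta = [[1.0 / k] * k for _ in range(n)]  # soft cluster memberships
    gamma = {r: 1.0 for r in links}            # per-relation strengths

    for _ in range(iters):
        # Step 1: memberships from attribute fit (if observed) and weighted neighbors.
        new_theta = []
        for i in range(n):
            score = [0.0] * k
            for c in range(k):
                if i in attrs:
                    score[c] -= (attrs[i] - mu[c]) ** 2 / (2.0 * var)
                for r in links:
                    score[c] += gamma[r] * sum(theta[j][c] for j in nbrs[r][i])
            top = max(score)
            w = [math.exp(s - top) for s in score]  # numerically stable softmax
            total = sum(w)
            new_theta.append([x / total for x in w])
        theta = new_theta

        # Step 2: re-estimate cluster centers from objects that have an attribute.
        for c in range(k):
            weight = sum(theta[i][c] for i in attrs)
            if weight > 0:
                mu[c] = sum(theta[i][c] * v for i, v in attrs.items()) / weight

        # Step 3: a relation is strong if its links join similar memberships.
        for r, pairs in links.items():
            if not pairs:
                gamma[r] = 0.0
                continue
            agree = sum(
                sum(a * b for a, b in zip(theta[i], theta[j])) for i, j in pairs
            ) / len(pairs)
            # Chance-level agreement (1/k) maps to strength 0.
            gamma[r] = max(0.0, agree - 1.0 / k) * k / max(k - 1, 1)
    return theta, gamma


# Objects 2 and 5 have no attribute and must rely on their links.
attrs = {0: 0.0, 1: 0.5, 3: 10.0, 4: 9.5}
links = {
    "good": [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)],  # within-group links
    "noisy": [(0, 3), (1, 4), (2, 5)],                         # cross-group links
}
theta, gamma = cluster_with_relation_strengths(6, 2, attrs, links)
```

In this toy run the relation whose links stay within a group ends up with a higher strength than the cross-group relation, and the two attribute-less objects are pulled toward the cluster of their "good" neighbors. The strength update here is a heuristic chosen for brevity; a faithful implementation would follow the likelihood-based learning in the paper itself.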