A game theoretic framework for heterogenous information network clustering

Authors: Faris Alqadah; Raj Bhatnagar
Affiliations: Johns Hopkins University, Baltimore, MD, USA; University of Cincinnati, Cincinnati, OH, USA
Venue: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Year: 2011

Abstract

Heterogeneous information networks are pervasive in applications ranging from bioinformatics to e-commerce. As a result, unsupervised learning and clustering methods for such networks have recently gained significant attention. Nodes in a heterogeneous information network are regarded as objects derived from distinct domains such as 'authors' and 'papers'. In many cases, feature sets characterizing the objects are not available; hence, clustering of the objects depends solely on the links and relationships among them. Although several previous studies have addressed information network clustering, shortcomings remain. First, the definition of what constitutes an information network cluster varies drastically from study to study. Second, previous algorithms have generally focused on non-overlapping clusters, and many are also limited to specific network topologies. In this paper we introduce a game-theoretic framework (GHIN) for defining and mining clusters in heterogeneous information networks. The clustering problem is modeled as a game in which each domain represents a player and clusters are defined as the Nash equilibrium points of the game. Adopting the abstraction of Nash equilibrium points as clusters allows for flexible definition of reward functions that characterize clusters without any modification to the underlying algorithm. We prove that well-established definitions of clusters in 2-domain information networks, such as formal concepts, maximal bi-cliques, and noisy binary tiles, can always be represented as Nash equilibrium points. Moreover, experimental results employing a variety of reward functions and several real-world information networks illustrate that the GHIN framework produces more accurate and informative clusters than the recently proposed NetClus and state-of-the-art MDC algorithms.
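To make the idea of clusters as Nash equilibrium points concrete, here is a minimal sketch for a 2-domain network. It is an illustration under stated assumptions, not the GHIN algorithm from the paper. The assumed reward is: a player's chosen subset scores its size if every chosen object is linked to every object the other player chose, and is infeasible otherwise. Under that assumption each player's best response is unique, and a pair of mutual best responses is a formal concept (equivalently, a maximal bi-clique). The names `best_response`, `find_equilibrium`, and the toy `EDGES` data are hypothetical.

```python
def best_response(other_choice, own_domain, linked):
    """Largest subset of own_domain whose members are linked to every
    object in other_choice (the unique maximizer of the assumed reward)."""
    return frozenset(
        o for o in own_domain if all(linked(o, p) for p in other_choice)
    )


def find_equilibrium(edges, seed_paper):
    """Alternate best responses between the two players until neither
    changes its choice; the fixed point is a pure Nash equilibrium of
    the assumed game."""
    authors = {a for a, _ in edges}
    papers = {p for _, p in edges}
    x, y = frozenset(), frozenset([seed_paper])
    while True:
        # Player 1 (authors) responds to the current paper set.
        new_x = best_response(y, authors, lambda a, p: (a, p) in edges)
        # Player 2 (papers) responds to the updated author set.
        new_y = best_response(new_x, papers, lambda p, a: (a, p) in edges)
        if new_x == x and new_y == y:
            return x, y
        x, y = new_x, new_y


# Toy author-paper network (illustrative data only).
EDGES = {
    ("ann", "p1"), ("ann", "p2"),
    ("bob", "p1"), ("bob", "p2"),
    ("cat", "p2"), ("cat", "p3"),
}

if __name__ == "__main__":
    for seed in ("p1", "p3"):
        cluster_authors, cluster_papers = find_equilibrium(EDGES, seed)
        print(seed, sorted(cluster_authors), sorted(cluster_papers))
```

Different seeds can lead to different equilibria, and the same object can appear in more than one of them (here `p2` belongs to both clusters), which is one way overlapping clusters can arise in this toy setting. Swapping in a different reward, for example one that tolerates a few missing links, would change only `best_response`.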