Information marginalization on subgraphs

Authors:
Jiayuan Huang;Tingshao Zhu;Russell Greiner;Dengyong Zhou;Dale Schuurmans
Affiliations:
University of Waterloo, Waterloo, Canada;University of Alberta, Edmonton, Canada;University of Alberta, Edmonton, Canada;NEC Laboratories America, Inc.;University of Alberta, Edmonton, Canada
Venue:
PKDD'06 Proceedings of the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases
Year:
2006



Abstract

Real-world data often involves objects that exhibit multiple relationships; for example, 'papers' and 'authors' exhibit both paper-author interactions and paper-paper citation relationships. A typical learning problem requires one to make inferences about a subclass of objects (e.g., 'papers'), while using the remaining objects and relations to provide relevant information. We present a simple, unified mechanism for incorporating information from multiple object types and relations when learning on a targeted subset. In this scheme, all sources of relevant information are marginalized onto the target subclass via random walks. We show that marginalized random walks can be used as a general technique for combining multiple sources of information in relational data. With this approach, we formulate new algorithms for transduction and ranking in relational data, and quantify the performance of the new schemes on real-world data, achieving good results on many problems.
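
The idea of marginalizing a relation onto the target subclass can be illustrated with a small sketch. This is not the authors' algorithm, only one plausible reading of the abstract: a two-step random walk paper -> author -> paper yields a paper-to-paper transition matrix, which is then mixed with a citation walk. The toy matrices, the function names (`marginalize_bipartite`, `combine`), and the equal mixing weights are all assumptions made for illustration.

```python
def row_normalize(matrix):
    """Turn a non-negative weight matrix into a row-stochastic transition matrix."""
    result = []
    for row in matrix:
        total = sum(row)
        # Rows with no outgoing weight stay all-zero (a dangling node).
        result.append([w / total if total else 0.0 for w in row])
    return result


def matmul(a, b):
    """Plain-Python matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def marginalize_bipartite(paper_author):
    """Two-step walk paper -> author -> paper (hypothetical helper).

    The result is a paper-to-paper transition matrix in which the
    author nodes have been summed out.
    """
    p_to_a = row_normalize(paper_author)
    a_to_p = row_normalize(transpose(paper_author))
    return matmul(p_to_a, a_to_p)


def combine(walks, weights):
    """Convex combination of paper-to-paper walks (weights are an assumption)."""
    n = len(walks[0])
    return [[sum(w * walk[i][j] for w, walk in zip(weights, walks))
             for j in range(n)] for i in range(n)]


# Toy data (made up): 3 papers, 2 authors; entry [i][j] = 1 if author j wrote paper i.
paper_author = [[1, 0],
                [1, 1],
                [0, 1]]
# Toy citation graph (made up): paper 0 cites 1, paper 1 cites 2, paper 2 cites 0.
citations = [[0, 1, 0],
             [0, 0, 1],
             [1, 0, 0]]

via_authors = marginalize_bipartite(paper_author)
via_citations = row_normalize(citations)
combined = combine([via_authors, via_citations], [0.5, 0.5])
```

Because each input is row-stochastic and the weights sum to one, `combined` is again a transition matrix over papers only, so any random-walk-based method for a single graph could in principle be applied to it.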