Learning from dyadic and relational data is a fundamental problem for IR and KDD applications in the web and social media domains. The basic behaviors and characteristics of users and documents are typically described by a collection of dyads, i.e., pairs of entities. Discriminative features extracted from such data are essential for exploratory and discriminatory analyses. Relational properties of the entities reflect their pair-wise similarities and collective community structure, which are also valuable for discriminative learning. A challenging aspect of learning from relational data in many domains is that the generative process of relational links appears noisy and is not well described by a stochastic model. In this paper, we present a principled approach for learning discriminative features from heterogeneous sources of dyadic and relational data. We propose an information-theoretic framework called Latent Feature Encoding (LFE), which projects the entities and the links to a latent feature space in the analogy of -encoding. The projection is formalized as a maximization of the mutual information preserved in the latent features, regularized by the compression rate of the encoding. The regularization is emphasized over more probable links to account for the noisiness of the observations. An empirical evaluation of the proposed method on text and social media datasets is presented. Performance on supervised and unsupervised learning tasks is compared with that of conventional latent feature extraction methods.
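The trade-off described above, maximizing the mutual information preserved by the latent features while penalizing the compression rate of the encoding, has the shape of an information-bottleneck-style objective. The sketch below illustrates that general trade-off on a toy dyadic co-occurrence matrix; it is not the LFE method itself (the abstract's link-weighted regularization is not reproduced), and the names `mutual_info`, `ib_objective`, and the weight `beta` are illustrative assumptions.

```python
import numpy as np

def mutual_info(pxy):
    """I(X;Y) in nats for a joint distribution given as a 2-D array."""
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    mask = pxy > 0
    return float((pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])).sum())

def ib_objective(pxy, encoder, beta=0.5):
    """
    Bottleneck-style trade-off: I(T;Y) - beta * I(T;X),
    where encoder[t, x] = q(t | x) maps entities X to latent codes T.
    I(T;Y) is the information the latent features preserve about the
    dyad partner; I(T;X) plays the role of the compression rate.
    """
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1)
    ptx = encoder * px[None, :]   # joint p(t, x) = q(t|x) p(x)
    pty = encoder @ pxy           # joint p(t, y) = sum_x q(t|x) p(x, y)
    return mutual_info(pty) - beta * mutual_info(ptx)

# Toy dyads: four entities, two of which link to each of two targets.
pxy = np.array([[4.0, 0.0], [4.0, 0.0], [0.0, 4.0], [0.0, 4.0]])
# An encoder that merges entities with identical link patterns
# preserves I(T;Y); a uniform encoder discards it.
good = np.array([[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]])
bad = np.full((2, 4), 0.5)
print(ib_objective(pxy, good) > ib_objective(pxy, bad))  # True
```

Under this view, increasing `beta` favors heavier compression of the entity identities, which is the lever the abstract says LFE applies more strongly to the more probable (and presumably noisier) links.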