Cross-media topic mining on Wikipedia

Authors:
Xikui Wang;Yang Liu;Donghui Wang;Fei Wu
Affiliations:
College of Computer Science, Zhejiang University, Hangzhou, China;College of Computer Science, Zhejiang University, Hangzhou, China;College of Computer Science, Zhejiang University, Hangzhou, China;College of Computer Science, Zhejiang University, Hangzhou, China
Venue:
Proceedings of the 21st ACM international conference on Multimedia
Year:
2013

Abstract

As a collaborative wiki-based encyclopedia, Wikipedia provides a huge number of articles in various categories. In addition to text, Wikipedia articles contain many images, which make them more intuitive for readers to understand. To better organize these visual and textual data, one promising area of research is to jointly model the topics embedded across multi-modal (i.e., cross-media) data from Wikipedia. In this work, we propose to learn projection matrices that map data from heterogeneous feature spaces into a unified latent topic space. Unlike previous approaches, our model imposes l1 regularizers on the projection matrices, so that only a small number of relevant visual/textual words are associated with each topic, which makes the model more interpretable and robust. Furthermore, the correlations among Wikipedia data in different modalities are explicitly considered in our model. The effectiveness of the proposed topic extraction algorithm is verified by several experiments conducted on real Wikipedia datasets.
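
The idea of an l1-regularized projection into a shared topic space can be illustrated with a minimal sketch. This is not the algorithm from the paper, whose objective and optimizer are not given in the abstract. It assumes a fixed, synthetic topic matrix `Z`, a plain least-squares loss, and proximal gradient descent (ISTA); the names `soft_threshold`, `sparse_projection`, `X_text`, `X_img`, and all dimensions are hypothetical.

```python
import numpy as np

def soft_threshold(A, t):
    """Proximal operator of t * ||A||_1: shrink every entry toward zero by t."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def sparse_projection(X, Z, lam=0.1, n_iter=200):
    """ISTA for min_P 0.5 * ||X P - Z||_F^2 + lam * ||P||_1.

    X: (n, d) features of one modality; Z: (n, k) shared topic codes.
    Returns a (d, k) projection matrix with many exact zeros.
    """
    # Step size 1/L, where L = largest eigenvalue of X^T X (squared spectral norm).
    L = np.linalg.norm(X, 2) ** 2
    P = np.zeros((X.shape[1], Z.shape[1]))
    for _ in range(n_iter):
        grad = X.T @ (X @ P - Z)                   # gradient of the smooth term
        P = soft_threshold(P - grad / L, lam / L)  # gradient step, then shrinkage
    return P

# Synthetic stand-ins for paired article text and image features (assumed data).
rng = np.random.default_rng(0)
n, d_text, d_img, k = 200, 50, 30, 5
Z = rng.standard_normal((n, k))  # hypothetical shared topic codes, held fixed
X_text = Z @ rng.standard_normal((k, d_text)) + 0.1 * rng.standard_normal((n, d_text))
X_img = Z @ rng.standard_normal((k, d_img)) + 0.1 * rng.standard_normal((n, d_img))

# One sparse projection per modality, both targeting the same latent space.
P_text = sparse_projection(X_text, Z, lam=5.0)
P_img = sparse_projection(X_img, Z, lam=5.0)

print("nonzero fraction (text): ", np.count_nonzero(P_text) / P_text.size)
print("nonzero fraction (image):", np.count_nonzero(P_img) / P_img.size)
```

Each column of `P_text` or `P_img` plays the role of one topic, and its nonzero rows are the textual or visual words tied to that topic. Raising `lam` zeroes out more rows per topic, which is the interpretability effect the abstract attributes to the l1 penalty. In this sketch the two modalities are coupled only through the shared `Z`; a full model would also learn `Z` and the cross-modal correlations.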