Picture tags and world knowledge: learning tag relations from visual semantic sources

Authors:
Lexing Xie;Xuming He
Affiliations:
Australian National University and NICTA, Canberra, Australia;NICTA and Australian National University, Canberra, Australia
Venue:
Proceedings of the 21st ACM international conference on Multimedia
Year:
2013

Abstract

This paper studies the use of everyday words to describe images. The common saying has it that 'a picture is worth a thousand words'; here we ask which thousand? The proliferation of tagged social multimedia data presents a challenge to understanding collective tag use at large scale: one can ask whether patterns from photo tags help understand tag-tag relations, and how they can be leveraged to improve visual search and recognition. We propose a new method to jointly analyze three distinct visual knowledge resources: Flickr, ImageNet/WordNet, and ConceptNet. This allows us both to quantify the visual relevance of tags and to learn their relationships. We propose a novel network estimation algorithm, Inverse Concept Rank, to infer incomplete tag relationships. We then design an algorithm for image annotation that takes into account both image and tag features. We analyze over 5 million photos with over 20,000 visual tags. The statistics from this collection lead to good results for image tagging, relationship estimation, and generalizing to unseen tags. This is a first step in analyzing picture tags and everyday semantic knowledge. Other potential applications include generating natural language descriptions of pictures, as well as validating and supplementing knowledge databases.