Prediction of favourite photos using social, visual, and textual signals

Authors:
Roelof van Zwol; Adam Rae; Lluis Garcia Pueyo
Affiliations:
Yahoo! Research, Santa Clara, CA, USA; Open University, Milton Keynes, United Kingdom; Yahoo! Research, Barcelona, Spain
Venue:
Proceedings of the international conference on Multimedia
Year:
2010

Abstract

This paper focuses on the prediction of users' favourite photos in Flickr. We propose a multi-modal, machine-learned approach that combines social, visual, and textual signals into a single prediction system. Although each individual user has different motivations for calling a photo a favourite, we show that the textual, visual, and social modalities effectively capture the needs of most active Flickr users. We use gradient-boosted decision trees (GBDT) with a mod least squares loss function for the classification of a user's favourite photos, and evaluate the performance of our classifier with respect to the individual modalities and various combinations thereof. By combining the social and visual modalities, the GBDT creates a highly effective classifier. The addition of textual features allows us to significantly increase recall, with a slight trade-off in precision.
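
The evaluation setup described in the abstract (boosted trees trained on individual modalities and on their combinations, compared by precision and recall) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes synthetic random features, a made-up labelling rule, depth-one trees (stumps), and a plain least-squares loss in place of the mod least squares loss named above. All names (`MODALITIES`, `make_dataset`, `select`, `fit_stump`, `fit_gbdt`, `precision_recall`) are hypothetical.

```python
import random

# Hypothetical layout: two feature columns per modality (an assumption).
MODALITIES = {"social": [0, 1], "visual": [2, 3], "textual": [4, 5]}

def make_dataset(rng, n):
    """Synthetic photos; the labelling rule is invented for illustration."""
    X, y = [], []
    for _ in range(n):
        row = [rng.random() for _ in range(6)]
        score = row[0] + row[2] + 0.5 * row[4]
        X.append(row)
        y.append(1.0 if score > 1.25 else -1.0)  # +1 = favourite
    return X, y

def select(X, modalities):
    """Keep only the feature columns of the chosen modalities."""
    cols = [c for m in modalities for c in MODALITIES[m]]
    return [[row[c] for c in cols] for row in X]

def fit_stump(X, residuals):
    """Best single-feature threshold split under squared error."""
    best = None  # (error, feature, threshold, left_value, right_value)
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[f] <= thr]
            right = [r for row, r in zip(X, residuals) if row[f] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, f, thr, lm, rm)
    return best[1:]

def fit_gbdt(X, y, rounds=15, learning_rate=0.3):
    """Gradient boosting with stumps and a plain least-squares loss."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        # For squared loss, the negative gradient is the residual.
        residuals = [t - p for t, p in zip(y, preds)]
        f, thr, lv, rv = fit_stump(X, residuals)
        stumps.append((f, thr, lv, rv))
        preds = [p + learning_rate * (lv if row[f] <= thr else rv)
                 for row, p in zip(X, preds)]
    return base, learning_rate, stumps

def predict_score(model, row):
    base, lr, stumps = model
    return base + sum(lr * (lv if row[f] <= thr else rv)
                      for f, thr, lv, rv in stumps)

def precision_recall(model, X, y):
    """A photo is predicted as a favourite when its score is positive."""
    tp = fp = fn = 0
    for row, t in zip(X, y):
        predicted = predict_score(model, row) > 0
        if predicted and t > 0:
            tp += 1
        elif predicted:
            fp += 1
        elif t > 0:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

rng = random.Random(42)
X_train, y_train = make_dataset(rng, 100)
X_test, y_test = make_dataset(rng, 50)
for combo in (["social"], ["social", "visual"],
              ["social", "visual", "textual"]):
    model = fit_gbdt(select(X_train, combo), y_train)
    precision, recall = precision_recall(model, select(X_test, combo), y_test)
    print("+".join(combo), round(precision, 2), round(recall, 2))
```

Because the data and labelling rule are synthetic, the printed figures say nothing about the paper's results. The sketch only shows how a per-modality comparison of this kind might be organised.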