Multimodal geo-tagging in social media websites using hierarchical spatial segmentation

Authors:
Pascal Kelm;Sebastian Schmiedeke;Thomas Sikora
Affiliations:
Technische Universität Berlin, Germany;Technische Universität Berlin, Germany;Technische Universität Berlin, Germany
Venue:
Proceedings of the 5th ACM SIGSPATIAL International Workshop on Location-Based Social Networks
Year:
2012

Abstract

Sharing photographs and videos is now very popular in social networks. Many social media websites, such as Flickr, Facebook and YouTube, allow users to manually label their uploaded videos with geo-information using an interface for dragging them onto a map. However, manually labelling a large set of social media items is still tedious and error-prone. For this reason we present a hierarchical, multi-modal approach for estimating GPS information. Our approach uses external resources such as gazetteers to extract toponyms from the metadata, and visual and textual features to identify similar content. First, national border detection recognizes the country and its extent, which speeds up the estimation and eliminates geographical ambiguity. Next, we use a database of more than 3.2 million Flickr images, grouping them into geographical regions to build a hierarchical model. A fusion of visual and textual methods at different granularities is used to classify each video's location into possible regions. Each Flickr test video is tagged with the geo-information of the most similar training image within the regions that the probabilistic model has previously selected for that video. Compared with existing GPS estimation and image retrieval approaches at the Placing Task 2011, we show the effectiveness and high accuracy of our approach relative to state-of-the-art solutions.
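The coarse-to-fine idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a regular latitude/longitude grid as the spatial segmentation, uses only textual tags with an add-one smoothed tag likelihood as the probabilistic region model, and uses Jaccard tag overlap as the similarity measure. Visual features, gazetteer lookups and national border detection are omitted. All function names, the `sizes` parameter and the data layout `(lat, lon, tags)` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def cell_of(lat, lon, size_deg):
    """Map a coordinate to a grid cell index for a given cell size in degrees."""
    return (math.floor((lat + 90.0) / size_deg),
            math.floor((lon + 180.0) / size_deg))


def build_regions(train, size_deg):
    """Group training items (lat, lon, tags) by the grid cell they fall into."""
    regions = defaultdict(list)
    for item in train:
        regions[cell_of(item[0], item[1], size_deg)].append(item)
    return regions


def region_score(tags, items, vocab_size):
    """Log-likelihood of the query tags under a region's add-one smoothed tag counts."""
    counts = Counter(t for _, _, item_tags in items for t in item_tags)
    total = sum(counts.values())
    return sum(math.log((counts[t] + 1) / (total + vocab_size)) for t in tags)


def jaccard(a, b):
    """Tag-set overlap, used here as a stand-in for content similarity."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def estimate_location(tags, train, sizes=(10.0, 1.0)):
    """Pick the most likely region at each granularity, coarse to fine,
    then return the coordinates of the most similar training item in it."""
    vocab_size = len({t for _, _, item_tags in train for t in item_tags})
    candidates = train
    for size in sizes:
        regions = build_regions(candidates, size)
        best = max(regions, key=lambda c: region_score(tags, regions[c], vocab_size))
        candidates = regions[best]
    lat, lon, _ = max(candidates, key=lambda item: jaccard(tags, item[2]))
    return lat, lon
```

Restricting the final nearest-neighbour search to the selected region keeps the similarity comparison small, which is the practical appeal of a hierarchical segmentation over a flat search across all training items.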