Exploiting Foursquare and Cellular Data to Infer User Activity in Urban Environments

Authors:
Anastasios Noulas;Cecilia Mascolo;Enrique Frias-Martinez
Affiliations:
-;-;-
Venue:
MDM '13 Proceedings of the 2013 IEEE 14th International Conference on Mobile Data Management - Volume 01
Year:
2013

Citing 0
Cited 1

On the importance of temporal dynamics in modeling urban activity

Proceedings of the 2nd ACM SIGKDD International Workshop on Urban Computing

Quantified Score

Hi-index	0.00

Abstract

Inferring the types of activity in neighborhoods of urban centers may be helpful in a number of contexts, including urban planning, content delivery and activity recommendations for mobile web users, and may even lead to a deeper understanding of the geographical evolution of social life in the city. During the past few years, the analysis of mobile phone usage patterns, or of social media with longitudinal attributes, has aided the automatic characterization of the dynamics of the urban environment. In this work, we combine a dataset sourced from a telecommunication provider in Spain with a database of millions of geo-tagged venues from Foursquare, and we formulate the problem of urban activity inference in a supervised learning framework. In particular, we exploit user communication patterns observed at the base station level in order to predict the activity of Foursquare users who check in at nearby venues. First, we mine a set of machine learning features that allow us to encode the input telecommunication signal of a tower. Subsequently, we evaluate a diverse set of supervised learning algorithms using labels extracted from Foursquare place categories, and we consider two application scenarios. In the first, we assess how hard it is to predict a specific urban activity of an area, showing that Nightlife and Entertainment spots are the easiest to infer, whereas College and Shopping areas feature the lowest accuracy rates. In the second, given a candidate set of activity types in a geographic area, we aim to select the most prominent one. We demonstrate that the difficulty of the problem increases with the number of classes incorporated in the prediction task, yet the classifiers perform considerably better than a random guess even as the set of candidate classes grows.
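
As a rough illustration of the supervised setup the abstract describes, the sketch below encodes each tower as a 24-bin hourly traffic profile, trains a classifier on labeled towers, and compares its accuracy with a random guess over the candidate classes. Everything in it is an assumption made for illustration: the abstract does not name the features or algorithms used, so the hourly-profile feature, the nearest-centroid classifier, the peak hours, and the synthetic data are hypothetical stand-ins, not the paper's method or results.

```python
import random

# Hypothetical candidate classes; in the paper, labels come from Foursquare categories.
CLASSES = ["Nightlife", "Shopping", "College"]

# Assumed peak hour of tower traffic per class (toy values, not from the paper).
PEAK_HOUR = {"Nightlife": 23, "Shopping": 17, "College": 10}


def synth_profile(label, rng):
    """Toy 24-bin hourly traffic profile for one tower, normalized to sum to 1."""
    peak = PEAK_HOUR[label]
    raw = []
    for hour in range(24):
        # Circular distance in hours between this bin and the assumed peak.
        d = min(abs(hour - peak), 24 - abs(hour - peak))
        raw.append(max(0.0, 6 - d) + rng.uniform(0.0, 2.0))
    total = sum(raw)
    return [v / total for v in raw]


def centroid(rows):
    """Element-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit(samples):
    """Nearest-centroid model: one mean profile per class."""
    by_class = {}
    for feats, label in samples:
        by_class.setdefault(label, []).append(feats)
    return {label: centroid(rows) for label, rows in by_class.items()}


def predict(model, feats):
    """Return the class whose centroid is closest to the given profile."""
    return min(model, key=lambda label: sq_dist(model[label], feats))


def evaluate(n_train=30, n_test=30, seed=0):
    """Train and test on synthetic towers; return (accuracy, random-guess baseline)."""
    rng = random.Random(seed)
    train = [(synth_profile(c, rng), c) for c in CLASSES for _ in range(n_train)]
    test = [(synth_profile(c, rng), c) for c in CLASSES for _ in range(n_test)]
    model = fit(train)
    hits = sum(predict(model, f) == c for f, c in test)
    return hits / len(test), 1.0 / len(CLASSES)


if __name__ == "__main__":
    accuracy, baseline = evaluate()
    print(f"accuracy={accuracy:.2f} random_baseline={baseline:.2f}")
```

The random-guess baseline is 1/k for k candidate classes, so adding classes lowers the baseline as the task gets harder. That is the comparison the abstract refers to when it reports performance against a random guess.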