User demographics prediction based on mobile data

  • Authors:
  • Erheng Zhong; Ben Tan; Kaixiang Mo; Qiang Yang

  • Venue:
  • Pervasive and Mobile Computing
  • Year:
  • 2013

Abstract

Demographics prediction is an important component of user profile modeling. Accurate prediction of users' demographics can benefit many applications, ranging from web search and personalization to behavioral targeting. In this paper, we focus on predicting users' demographics, including "gender", "job type", "marital status", "age" and "number of family members", from mobile data such as usage logs, physical activities and environmental contexts. The core idea is to build a supervised learning framework in which each user is represented as a feature vector and the users' demographics are the prediction targets. The most important step is constructing features from raw data, after which supervised learning models can be applied. We propose a feature construction framework, CFC (contextual feature construction), in which each feature is defined as the conditional probability of one user activity under a given context. In addition to employing standard supervised learning models, we propose a regularized multi-task learning framework to model the different demographics prediction tasks collectively. We also propose a cost-sensitive classification framework for the regression tasks, in order to benefit from existing dimension reduction methods. Finally, because training instances are limited, we employ ensemble methods to avoid overfitting. The experimental results show that, under leave-one-out evaluation, the framework achieves classification accuracies on "gender", "job" and "marital status" as high as 96%, 83% and 86%, respectively, and Root Mean Square Error (RMSE) on "age" and "number of family members" as low as 0.69 and 0.66, respectively.
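
The abstract defines each CFC feature as the conditional probability of a user activity under a given context. The sketch below illustrates that definition only; it is not the authors' implementation. The function name `contextual_features`, the context and activity labels, and the choice to assign 0.0 to contexts that never occur are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def contextual_features(events, activities, contexts):
    """Estimate P(activity | context) for every (activity, context) pair.

    events: iterable of (context, activity) tuples logged for one user.
    A context that never occurs yields 0.0 for all of its features
    (an assumption of this sketch, not something the abstract specifies).
    """
    events = list(events)
    context_counts = Counter(context for context, _ in events)
    pair_counts = Counter(events)

    features = {}
    for context, activity in product(contexts, activities):
        total = context_counts[context]
        count = pair_counts[(context, activity)]
        features[(activity, context)] = count / total if total else 0.0
    return features


# Hypothetical usage log for one user: (context, activity) pairs.
log = [
    ("weekday_morning", "call"),
    ("weekday_morning", "sms"),
    ("weekday_morning", "call"),
    ("weekend_evening", "game"),
]

vector = contextual_features(
    log,
    activities=["call", "sms", "game"],
    contexts=["weekday_morning", "weekend_evening"],
)
```

Here `vector[("call", "weekday_morning")]` is 2/3, because two of the three events logged in that context are calls. Flattening the dictionary values in a fixed order gives the per-user feature vector that a supervised model could then consume.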