Mining future spatiotemporal events and their sentiment from online news articles for location-aware recommendation system

  • Authors:
  • Shen-Shyang Ho;Mike Lieberman;Pu Wang;Hanan Samet

  • Affiliations:
  • Nanyang Technological University, Singapore;University of Maryland, College Park, MD;Google, Inc., Mountain View, CA;University of Maryland, College Park, MD

  • Venue:
  • Proceedings of the First ACM SIGSPATIAL International Workshop on Mobile Geographic Information Systems
  • Year:
  • 2012

Abstract

The task of mining future-related information from online web resources such as news articles and blogs has been receiving more attention because of its potential to support individuals' decision making in a world where massive amounts of new data are generated daily. Instead of building a data-driven model to predict the future, one extracts from these data events that have a high probability of occurring at a future time and a specific geographic location. Such spatiotemporal future events can be used by a recommender system on a location-aware device to provide localized future event suggestions. In this paper, we describe a systematic approach for mining future spatiotemporal events from the web, in particular from news articles. In our application context, a valid event is defined both spatially and temporally. The mining procedure consists of two main steps: recognition and matching. In the recognition step, we identify and resolve toponyms (geographic locations) and future temporal patterns. In the matching step, we perform spatiotemporal disambiguation, de-duplication, and pairing. To provide more useful future event guidance, we attach to each event a sentiment linguistic variable (positive, negative, or neutral), so that the extracted event information can be used for recommendations of the form "avoid Event A", "avoid geographic location L at time T", or "attend Event B", depending on the event sentiment. Each identified future event consists of its geographic location, temporal pattern, sentiment variable, news title, key phrase, and news article URL. Experimental results on 3652 news articles from 21 online news sources, collected over a 2-week period in the Greater Washington area, are used to illustrate some of the critical steps in our mining procedure.
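
The event record and the sentiment-based recommendations described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `FutureEvent`, `Sentiment`, `recommend`, and `deduplicate` are hypothetical, the temporal pattern is simplified to a single calendar date, the de-duplication key is an assumption made only for illustration, and the sample values (including the example.com URLs) are invented.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Sentiment(Enum):
    """The three sentiment values named in the abstract."""
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class FutureEvent:
    """Hypothetical record holding the six fields listed in the abstract."""
    location: str        # resolved toponym
    when: date           # future temporal pattern, simplified to one date (assumption)
    sentiment: Sentiment
    title: str           # news title
    key_phrase: str
    url: str             # news article URL


def recommend(event: FutureEvent) -> str:
    """Map an event's sentiment to a recommendation string (illustrative wording)."""
    if event.sentiment is Sentiment.NEGATIVE:
        return f"avoid {event.location} on {event.when.isoformat()}"
    if event.sentiment is Sentiment.POSITIVE:
        return f"attend: {event.key_phrase}"
    return f"note: {event.key_phrase}"


def deduplicate(events: list[FutureEvent]) -> list[FutureEvent]:
    """Keep the first event per (location, date, key phrase).

    Assumption: the abstract does not say how duplicates are detected;
    this exact-match key is only a stand-in.
    """
    seen = set()
    unique = []
    for event in events:
        key = (event.location.lower(), event.when, event.key_phrase.lower())
        if key not in seen:
            seen.add(key)
            unique.append(event)
    return unique


# Invented sample data for illustration only.
closure = FutureEvent("College Park, MD", date(2012, 11, 6), Sentiment.NEGATIVE,
                      "Road work planned", "road closure", "http://example.com/a")
repeat = FutureEvent("college park, md", date(2012, 11, 6), Sentiment.NEGATIVE,
                     "Expect delays", "Road Closure", "http://example.com/b")
festival = FutureEvent("Washington, DC", date(2012, 11, 10), Sentiment.POSITIVE,
                       "Festival returns", "street festival", "http://example.com/c")

for event in deduplicate([closure, repeat, festival]):
    print(recommend(event))
```

In this sketch the two road-closure reports collapse into one event because they share a location, date, and key phrase after case folding; a real system would need the spatiotemporal disambiguation that the abstract places in the matching step.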