The SR-tree: an index structure for high-dimensional nearest neighbor queries
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Using GPS to learn significant locations and predict movement across multiple users
Personal and Ubiquitous Computing
Learning transportation mode from raw gps data for geographic applications on the web
Proceedings of the 17th international conference on World Wide Web
Mining interesting locations and travel sequences from GPS trajectories
Proceedings of the 18th international conference on World wide web
Future Generation Computer Systems
Spatial Queries Evaluation with MapReduce
GCC '09 Proceedings of the 2009 Eighth International Conference on Grid and Cooperative Computing
Data warehousing and analytics infrastructure at facebook
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Hadoop in Action
Proceedings of the 2013 ACM conference on Pervasive and ubiquitous computing adjunct publication
Hi-index | 0.00 |
Mobile technology, especially mobile phone, is very popular nowadays. Increasing number of mobile users and availability of GPS-embedded mobile phones generate large amount of GPS trajectories that can be used in various research areas such as people mobility and transportation planning. However, how to handle such a large-scale dataset is a significant issue particularly in spatial analysis domain. In this paper, we aimed to explore a suitable way for extracting geo-location of GPS coordinate that achieve large-scale support, fast processing, and easily scalable both in storage and calculation speed. Geo-locations are cities, zones, or any interesting points. Our dataset is GPS trajectories of 1.5 million individual mobile phone users in Japan accumulated for one year. The total number was approximately 9.2 billion records. Therefore, we conducted performance comparisons of various methods for processing spatial data, particularly for a huge dataset. In this work, we first processed data on PostgreSQL with PostGIS that is a traditional way for spatial data processing. Second, we used java application with spatial library called Java Topology suite (JTS). Third, we tried on Hadoop Cloud Computing Platform focusing on using Hive on top of Hadoop to allow SQL-like support. However, Hadoop/Hive did not support spatial query at the moment. Hence, we proposed a solution to enable spatial support on Hive. As the results, Hadoop/hive with spatial support performed best result in large-scale processing among evaluated methods and in addition, we recommended techniques in Hadoop/Hive for processing different types of spatial data.