Comparing the performance of group detection algorithm in serial and parallel processing environments

Authors:
Channing Brown;Iftekhar Ahmed;Dora Cai;Marshall Scott Poole;Andrew Pilny;Yannick Atouba
Affiliations:
University of Illinois at Urbana-Champaign, MC, Urbana, IL;University of Illinois at Urbana-Champaign, MC, Urbana, IL;University of Illinois at Urbana-Champaign, MC, Urbana, IL;University of Illinois at Urbana-Champaign, W. Oregon, Urbana, IL;University of Illinois at Urbana-Champaign, W. Oregon, Urbana, IL;University of Illinois at Urbana-Champaign, W. Oregon, Urbana, IL
Venue:
Proceedings of the 1st Conference of the Extreme Science and Engineering Discovery Environment: Bridging from the eXtreme to the campus and beyond
Year:
2012

Abstract

Developing an algorithm that identifies groups from a collection of individuals without grouping data has received significant attention because of the need to better understand groups and teams in online environments. This study used space, time, task, and players' virtual behavioral indicators from a game database to develop an algorithm that detects groups over time. The group detection algorithm was initially developed for a serial processing environment and later modified to allow parallel processing on Gordon. For a collection of data representing 192 days of game play (approximately 140 gigabytes of log data), the major steps of the analysis required 266 minutes on a single processor. The same computation required 25 minutes on Gordon with 16 processors. The provision of massive compute nodes and the rich shared-memory environment on Gordon improved the performance of our analysis by a factor of about 11 (266/25 ≈ 10.6). Besides demonstrating the potential to save time and effort, this study highlights some lessons learned in transforming a serial detection algorithm for parallel environments.
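
The timings reported in the abstract can be turned into standard scaling figures with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and it assumes that the 266-minute and 25-minute figures cover the same analysis steps on the same data. It computes the speedup, the parallel efficiency on 16 processors, and the Karp-Flatt estimate of the serial fraction that those two numbers would imply.

```python
# Illustrative arithmetic on the timings quoted in the abstract.
# Assumption: both timings measure the same work on the same data.
# All function names here are hypothetical helpers, not the authors' code.

def speedup(serial_minutes: float, parallel_minutes: float) -> float:
    """Ratio of serial run time to parallel run time."""
    return serial_minutes / parallel_minutes


def efficiency(speedup_value: float, processors: int) -> float:
    """Fraction of ideal linear scaling achieved on `processors` processors."""
    return speedup_value / processors


def karp_flatt(speedup_value: float, processors: int) -> float:
    """Experimentally determined serial fraction (Karp-Flatt metric)."""
    inv_p = 1.0 / processors
    return (1.0 / speedup_value - inv_p) / (1.0 - inv_p)


SERIAL_MINUTES = 266.0    # single processor, from the abstract
PARALLEL_MINUTES = 25.0   # Gordon, from the abstract
PROCESSORS = 16           # from the abstract

s = speedup(SERIAL_MINUTES, PARALLEL_MINUTES)
e = efficiency(s, PROCESSORS)
f = karp_flatt(s, PROCESSORS)

print(f"speedup: {s:.2f}x")
print(f"parallel efficiency: {e:.1%}")
print(f"implied serial fraction (Karp-Flatt): {f:.3f}")
```

On these figures the speedup is about 10.6, which the abstract rounds to 11, and the efficiency is roughly two thirds of ideal linear scaling on 16 processors. The serial-fraction estimate is only as good as the assumption that the two runs are directly comparable.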