Massive Social Network Analysis: Mining Twitter for Social Good

Authors:
David Ediger;Karl Jiang;Jason Riedy;David A. Bader;Courtney Corley
Venue:
ICPP '10 Proceedings of the 2010 39th International Conference on Parallel Processing
Year:
2010

Cited by 7:

Extracting semantic knowledge from twitter
ePart'11 Proceedings of the Third IFIP WG 8.5 international conference on Electronic participation

Thought leaders during crises in massive social networks
Statistical Analysis and Data Mining

Introducing ScaleGraph: an X10 library for billion scale graph analytics
Proceedings of the 2012 ACM SIGPLAN X10 Workshop

Betweenness centrality: algorithms and implementations
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming

A statistical framework for streaming graph analysis
Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

Envisioning a future for a spatial-health CyberGIS marketplace
Proceedings of the Second ACM SIGSPATIAL International Workshop on the Use of GIS in Public Health

A new benchmark dataset with production methodology for short text semantic similarity algorithms
ACM Transactions on Speech and Language Processing (TSLP)

Abstract

Social networks produce an enormous quantity of data. Facebook has over 400 million active users sharing over 5 billion pieces of information each month. Analyzing this vast quantity of unstructured data presents challenges for software and hardware. We present GraphCT, a Graph Characterization Toolkit for massive graphs representing social network data. On a 128-processor Cray XMT, GraphCT estimates the betweenness centrality of an artificially generated (R-MAT) graph with 537 million vertices and 8.6 billion edges in 55 minutes, and of a real-world graph (Kwak et al.) with 61.6 million vertices and 1.47 billion edges in 105 minutes. We use GraphCT to analyze public data from Twitter, a microblogging network. Twitter's message connections appear primarily tree-structured, as in a news dissemination system. Within the public data, however, are clusters of conversations. Using GraphCT, we can rank actors within these conversations and help analysts focus attention on a much smaller data subset.
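
The abstract refers to estimating betweenness centrality and using it to rank actors. As a rough illustration of what such an estimate can look like, the sketch below runs Brandes-style shortest-path accumulation from a random sample of source vertices and scales the result up to the full vertex count. This is not GraphCT's code, and the abstract does not say which estimator GraphCT uses; the function name estimate_betweenness, the dictionary-of-lists graph layout, and the star-shaped toy graph are all assumptions made for this example. Scores count ordered vertex pairs, so they are twice the usual undirected values.

```python
import random
from collections import deque


def estimate_betweenness(adj, num_sources, seed=0):
    """Estimate betweenness centrality from a random sample of BFS sources.

    adj maps each vertex to a list of its neighbours (unweighted graph).
    This is an illustrative sketch, not GraphCT's implementation.
    """
    rng = random.Random(seed)
    vertices = list(adj)
    k = min(num_sources, len(vertices))
    score = {v: 0.0 for v in vertices}
    if k == 0:
        return score

    for s in rng.sample(vertices, k):
        # Forward phase: BFS from s, counting shortest paths (sigma)
        # and recording shortest-path predecessors.
        dist = {s: 0}
        sigma = {v: 0 for v in vertices}
        sigma[s] = 1
        preds = {v: [] for v in vertices}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        # Backward phase: accumulate dependencies from the farthest
        # vertices back toward the source.
        delta = {v: 0.0 for v in vertices}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]

    # Scale the sampled sum up to an estimate over all sources.
    scale = len(vertices) / k
    return {v: score[v] * scale for v in vertices}


if __name__ == "__main__":
    # Hypothetical toy graph: one hub connected to four leaves.
    star = {
        "hub": ["a", "b", "c", "d"],
        "a": ["hub"], "b": ["hub"], "c": ["hub"], "d": ["hub"],
    }
    estimate = estimate_betweenness(star, num_sources=2)
    for actor in sorted(estimate, key=estimate.get, reverse=True):
        print(actor, estimate[actor])
```

Sampling trades accuracy for time: each sampled source costs one breadth-first search, so the work grows with the number of sources rather than with the number of vertices squared. Setting num_sources to the vertex count recovers the exact (ordered-pair) scores.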