Logical model of relationship for online social networks and performance optimizing of queries: WISE 2012 challenge - T1: performance track scalability winner

Authors:
Edans F. O. De Sandes;Li Weigang;Alba Cristina M. A. de Melo
Affiliations:
University of Brasilia, Brasilia, Brazil;University of Brasilia, Brasilia, Brazil;University of Brasilia, Brasilia, Brazil
Venue:
WISE'12 Proceedings of the 13th international conference on Web Information Systems Engineering
Year:
2012

Abstract

Sina Weibo is currently the microblogging web service with the highest number of registered users in China. As in any large social network, the relationship representation is so large that executing queries over the network is a very challenging problem. The WISE 2012 conference proposed a challenge based on Sina Weibo with two tracks: performance testing and repost prediction. This paper focuses on the first track, whose goal is to implement 19 queries with the highest throughput and the lowest latency, using a scalable parallel paradigm. The input database contains 265 million relations among more than 60 million users, and more than 400 million sent messages. This paper formalizes the logical model of the relationship in order to present the queries in a precise and simple manner. Some optimization techniques are also proposed, such as the aggregate-rank-delete procedures, which can be applied to some of the queries to improve performance. The proposed model and optimizations were implemented in a highly scalable parallel system, and the experimental results show that our solution can obtain high throughput and low latency for most of the queries.