Sina Weibo is currently the microblogging web service with the highest number of registered users in China. As in any large social network, the representation of user relationships is so large that executing queries over the network is very challenging. The WISE 2012 conference proposed a challenge based on Sina Weibo with two tracks: performance testing and repost prediction. This paper focuses on the first track, whose goal is to implement 19 queries with the highest throughput and the lowest latency using a scalable parallel paradigm. The input database contains 265 million relations among more than 60 million users, and more than 400 million sent messages. This paper formalizes the logical model of the relationships in order to present the queries in a precise and simple manner. It also proposes optimization techniques, such as the aggregate-rank-delete procedures, which can be applied to some of the queries to improve performance. The proposed model and optimizations were implemented in a highly scalable parallel system, and the experimental results show that the solution obtains high throughput and low latency for most of the queries.