SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Pig latin: a not-so-foreign language for data processing
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Hive: a warehousing solution over a map-reduce framework
Proceedings of the VLDB Endowment
Benchmarking cloud serving systems with YCSB
Proceedings of the 1st ACM symposium on Cloud computing
Feeding frenzy: selectively materializing users' event feeds
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing
NSDI'12 Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation
The little engine(s) that could: scaling online social networks
IEEE/ACM Transactions on Networking (TON)
Proceedings of the 13th international conference on Web Information Systems Engineering
WISE'12 Proceedings of the 13th international conference on Web Information Systems Engineering
Acolyte: an in-memory social network query system
WISE'12 Proceedings of the 13th international conference on Web Information Systems Engineering
Accelerating queries over microblog dataset via grouping and indexing techniques
WISE'12 Proceedings of the 13th international conference on Web Information Systems Engineering
A fast and high throughput SQL query system for big data
WISE'12 Proceedings of the 13th international conference on Web Information Systems Engineering
Social media analytics has many applications, including collective behavior sensing and monitoring, online advertising, and opinion mining. Although a number of technologies and systems have been proposed for analyzing social media data, their overall performance and relative advantages have not been compared under similar settings. This paper proposes BSMA, a benchmark for Benchmarking Social Media Analytics. It differs from similar efforts in three ways: 1) It uses a real-life dataset containing the activities of more than 1.6 million users over 2 years and the followship relationships of 1.2 billion users. The data distributions in this dataset differ from those produced by data generators. 2) It uses 19 queries in three categories: social network queries, hotspot queries, and timeline queries. Each category challenges a different part of the system under test. 3) It measures throughput, latency, and scalability to test performance. A YCSB-based toolkit for reporting these measurements has been developed. A previous version of BSMA was used in the WISE 2012 Challenge. Four teams implemented all or part of the 19 queries. Their results are analyzed in this paper. The progress of BSMA and future work on it are also discussed.