Large-scale graph analysis is becoming important with the rise of world-wide social network services. In earlier work on SociaLite, we proposed extensions to Datalog for implementing graph analysis programs efficiently and succinctly on sequential machines. This paper describes novel extensions and optimizations of SociaLite for parallel and distributed execution to support large-scale graph analysis. With distributed SociaLite, programmers simply annotate how data are to be distributed; the necessary communication is then inferred automatically to generate parallel code for clusters of multi-core machines. Distributed SociaLite optimizes the evaluation of recursive monotone aggregate functions using a delta-stepping technique. In addition, SociaLite supports approximate computation, allowing programmers to trade accuracy for reduced time and space. We evaluated SociaLite with six core graph algorithms used in many social network analyses. Our experiment with 64 Amazon EC2 8-core instances shows that SociaLite programs performed within a factor of two of ideal weak scaling. Compared to optimized Giraph, an open-source alternative to Pregel, SociaLite programs are 4 to 12 times faster across the benchmark algorithms and 22 times more succinct on average. As a declarative query language, SociaLite, with the help of a compiler that generates efficient parallel and approximate code, can be used easily to create many social apps that operate on large-scale distributed graphs.
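To make the delta-stepping idea concrete, the following is a minimal sequential sketch in Python of generic delta-stepping applied to single-source shortest paths, a typical recursive computation with a monotone `min` aggregate. It is an illustration under stated assumptions, not SociaLite's implementation: the function name `delta_stepping`, the adjacency-list layout, and the choice of `delta` are hypothetical, and edge weights are assumed to be non-negative.

```python
import math
from collections import defaultdict


def delta_stepping(graph, source, delta):
    """Shortest distances from source (hypothetical helper, not SociaLite code).

    graph maps a vertex to a list of (neighbor, weight) pairs; weights are
    assumed non-negative. Bucket i holds vertices whose tentative distance
    lies in [i * delta, (i + 1) * delta).
    """
    dist = {}
    buckets = defaultdict(set)

    def relax(v, d):
        # Monotone min update: only a strictly smaller distance is accepted.
        old = dist.get(v, math.inf)
        if d < old:
            if old != math.inf:
                buckets[int(old // delta)].discard(v)
            dist[v] = d
            buckets[int(d // delta)].add(v)

    relax(source, 0)
    while True:
        nonempty = [k for k, b in buckets.items() if b]
        if not nonempty:
            break
        i = min(nonempty)
        settled = set()
        # Light edges (weight <= delta) can reinsert vertices into bucket i,
        # so the bucket is drained repeatedly until it stays empty.
        while buckets[i]:
            frontier = buckets[i]
            buckets[i] = set()
            settled |= frontier
            requests = [
                (v, dist[u] + w)
                for u in frontier
                for v, w in graph.get(u, [])
                if w <= delta
            ]
            for v, d in requests:
                relax(v, d)
        # Heavy edges (weight > delta) are relaxed once per settled vertex.
        for u in settled:
            for v, w in graph.get(u, []):
                if w > delta:
                    relax(v, dist[u] + w)
    return dist


example_graph = {
    "a": [("b", 1), ("c", 5)],
    "b": [("c", 1), ("d", 7)],
    "c": [("d", 1)],
    "d": [],
}
distances = delta_stepping(example_graph, "a", delta=2)
```

In this sketch, vertices are processed bucket by bucket in order of tentative distance, so updates with small values are propagated before updates with large values. Within one bucket the relaxations are independent of one another, which is the property a parallel implementation would exploit.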