Future Generation Computer Systems
Ever more large data collections are being gathered worldwide in various IT systems. Many of them are networked in nature and need to be processed and analysed as graph structures. Because of their size, they very often require a parallel paradigm for efficient computation. This paper compares three parallel techniques: MapReduce, its map-side join extension, and Bulk Synchronous Parallel (BSP). Each is implemented for two different graph problems: calculation of single source shortest paths (SSSP) and collective classification of graph nodes by means of relational influence propagation (RIP). The methods and algorithms are applied to several network datasets that differ in size and structural profile and originate from three domains: telecommunication, multimedia and microblog. The results show that, for iterative graph processing, the BSP implementation consistently and significantly outperforms MapReduce, by up to a factor of 10, especially for algorithms with many iterations and sparse communication. The MapReduce extension based on map-side join is usually more efficient than the original MapReduce, although it does not match BSP. Nevertheless, MapReduce remains a good alternative for enormous networks whose data structures do not fit in local memory.
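To make the BSP approach to SSSP more concrete, the following is a minimal single-process sketch in Python. It is not the implementation evaluated in the paper: the function name `bsp_sssp`, the adjacency-list input format, and the tiny example graph are assumptions made purely for illustration, and the distributed barrier is simulated by swapping message buffers between supersteps. Edge weights are assumed to be non-negative.

```python
import math
from collections import defaultdict


def bsp_sssp(edges, source):
    """Simulate BSP-style single source shortest paths in one process.

    edges:  dict mapping a vertex to a list of (neighbour, weight) pairs
            (hypothetical input format, weights assumed non-negative).
    source: the start vertex.
    Returns (distances, number_of_supersteps).
    """
    # Collect every vertex, including those that only appear as targets.
    vertices = set(edges)
    for neighbours in edges.values():
        for target, _ in neighbours:
            vertices.add(target)

    dist = {v: math.inf for v in vertices}

    # Messages delivered at the start of the next superstep.
    inbox = {source: [0.0]}
    supersteps = 0

    while inbox:
        outbox = defaultdict(list)
        # Local computation: only vertices that received messages are active.
        for vertex, messages in inbox.items():
            best = min(messages)
            if best < dist[vertex]:
                dist[vertex] = best
                # Communication: propose new distances to neighbours.
                for target, weight in edges.get(vertex, []):
                    outbox[target].append(best + weight)
        # Barrier synchronisation: messages become visible only now.
        inbox = outbox
        supersteps += 1

    return dist, supersteps


if __name__ == "__main__":
    graph = {"a": [("b", 1), ("c", 4)], "b": [("c", 2)], "c": []}
    distances, steps = bsp_sssp(graph, "a")
    print(distances, steps)
```

In this sketch the graph stays in memory across supersteps and only the messages move, which is one plausible reading of why BSP suits algorithms with many iterations and sparse communication. A plain MapReduce formulation would typically run one job per iteration and pass the graph structure through each job.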