Introduction to algorithms
MapReduce: simplified data processing on large clusters
Communications of the ACM - 50th anniversary issue: 1958 - 2008
Faster join-projects and sparse matrix multiplications
Proceedings of the 12th International Conference on Database Theory
PEGASUS: A Peta-Scale Graph Mining System Implementation and Observations
ICDM '09 Proceedings of the 2009 Ninth IEEE International Conference on Data Mining
SPARQL basic graph pattern processing with iterative MapReduce
Proceedings of the 2010 Workshop on Massive Data Analytics on the Cloud
A comparison of join algorithms for log processing in MaPreduce
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
HAMA: An Efficient Matrix Computation with the MapReduce Framework
CLOUDCOM '10 Proceedings of the 2010 IEEE Second International Conference on Cloud Computing Technology and Science
SystemML: Declarative machine learning on MapReduce
ICDE '11 Proceedings of the 2011 IEEE 27th International Conference on Data Engineering
Optimizing Multiway Joins in a Map-Reduce Environment
IEEE Transactions on Knowledge and Data Engineering
An efficient MapReduce algorithm for counting triangles in a very large graph
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Exploiting inter-operation parallelism for matrix chain multiplication using MapReduce
The Journal of Supercomputing
Hi-index | 0.00 |
In this paper, we translate the multiplication of several matrices into a multi-way join operation among several relations. Matrix multiplication is widely used for many graph algorithms, such as those that calculate the transitive closure. These algorithms benefit from the multi-way join operation because this operation reduces the number of binary multiplications. Our implementation is based on the MapReduce framework, allowing us to provide scalable computation for large matrices. Although several papers have investigated matrix multiplication using MapReduce, this paper takes a different perspective. First, we expand the problem from binary multiplication to n-ary multiplication. For this reason, we apply the concept of parallelism, not only to an individual operation but also to the entire equation. Second, we represent a matrix as a relation consisting of (row, col, val) records and translate a multiplication into a join operation in database systems. This facilitates the efficient storage of sparse matrices, which are very common in real-world graph data, and the easy manipulation of matrices. Although this work is still in progress, we conducted a number of experiments to verify the idea. We also discuss current limitations and future works.