Cloud computing, with its promise of virtually infinite resources, seems well suited to solving resource-hungry scientific computing problems. To study this, we established a scientific computing cloud (SciCloud) project and environment on our internal clusters. The main goal of the project is to study the scope for establishing private clouds at universities. With these clouds, students and researchers can make efficient use of the existing resources of university computer networks to solve computationally intensive scientific, mathematical, and academic problems. However, to run scientific computing applications on cloud infrastructure, the applications must be reduced to frameworks that can successfully exploit cloud resources, such as the MapReduce framework. This paper summarizes the challenges associated with reducing iterative algorithms to the MapReduce model. Algorithms used in scientific computing are divided into classes according to how they can be adapted to the MapReduce model; examples from each class are reduced to the MapReduce model, and their performance is measured and analyzed. The study focuses mainly on the Hadoop MapReduce framework but also compares it with Twister, an alternative MapReduce framework designed specifically for iterative algorithms. The analysis shows that Hadoop MapReduce has significant trouble with iterative problems but is well suited to embarrassingly parallel problems, and that Twister handles iterative problems much more efficiently. This work shows how to adapt algorithms from each class to the MapReduce model and what affects the efficiency and scalability of algorithms in each class, and, by mapping the advantages and disadvantages of the two frameworks, allows us to judge which framework is more efficient for each class.
This study is of significant importance for scientific computing, which often uses complex iterative methods to solve critical problems; adapting such methods to cloud computing frameworks is not a trivial task.
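The contrast between embarrassingly parallel and iterative problems can be illustrated with a minimal sketch. It is not the paper's implementation and uses neither Hadoop nor Twister: `map_reduce` is a hypothetical in-memory stand-in for one MapReduce job, and the two workloads (a Monte Carlo estimate of pi and a one-dimensional k-means) are examples chosen here for illustration only. The point is structural: the first problem fits in a single job, while the second needs a driver loop that launches a new job per iteration, which is where per-job overhead would accumulate in a real framework.

```python
import random
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Hypothetical in-memory stand-in for ONE MapReduce job:
    map every record, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


# --- Embarrassingly parallel: the whole computation is a single job. ---
def pi_mapper(seed, samples=20_000):
    """Each map task draws its own random points in the unit square."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(samples)
               if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    yield "pi", (hits, samples)


def pi_reducer(key, values):
    """Combine the per-task counts into one estimate."""
    hits = sum(h for h, _ in values)
    total = sum(n for _, n in values)
    return 4.0 * hits / total


def estimate_pi(num_tasks=8):
    return map_reduce(range(num_tasks), pi_mapper, pi_reducer)["pi"]


# --- Iterative: a driver loop must launch one job per iteration. ---
def kmeans_1d(points, centers, max_iters=20, tol=1e-9):
    """1-D k-means (illustrative choice); returns (centers, jobs launched)."""
    jobs = 0
    for _ in range(max_iters):
        def mapper(x, current=centers):
            # Assign the point to the index of its nearest center.
            nearest = min(range(len(current)),
                          key=lambda i: abs(x - current[i]))
            yield nearest, x

        def reducer(key, values):
            # The new center is the mean of the assigned points.
            return sum(values) / len(values)

        result = map_reduce(points, mapper, reducer)
        jobs += 1  # in a real framework each job pays startup and I/O costs
        new_centers = [result.get(i, c) for i, c in enumerate(centers)]
        shift = max(abs(a - b) for a, b in zip(new_centers, centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, jobs


if __name__ == "__main__":
    print("pi estimate (1 job):", estimate_pi())
    data = [1.0, 1.2, 0.8, 9.0, 9.2, 8.8]
    final_centers, jobs_used = kmeans_1d(data, [0.0, 10.0])
    print("k-means centers:", final_centers, "jobs launched:", jobs_used)
```

In this sketch the pi estimate always costs one job regardless of the number of map tasks, whereas the k-means driver launches as many jobs as it needs iterations to converge. In a real deployment the input data would also have to be made available to every one of those jobs.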