Data-intensive applications process large volumes of data in parallel. MapReduce is both a programming model for data-intensive applications over massive data sets and an execution framework for large-scale data processing on clusters of commodity servers. Although fault tolerance, a simple programming structure, and high scalability are considered strong points of MapReduce, its configuration parameters must be fine-tuned to the specific deployment, which complicates both configuration and performance tuning. This paper explains how to tune the Hadoop configuration parameters that directly affect the performance of the MapReduce job workflow under various conditions, with the goal of achieving maximum performance. The empirical data we collected indicate that three main methodologies affect the execution time of MapReduce running on cluster systems. We therefore present a model that consists of three main modules: (1) extending a data redistribution technique to find the high-performance nodes, (2) tuning the number of map/reduce slots to reduce execution time, and (3) developing a new hybrid routing schedule for the shuffle phase to define the scheduler task while reducing the memory management level. © 2013 Wiley Periodicals, Inc.
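
As an illustration of the kind of parameter involved in module (2), the sketch below derives per-node map and reduce slot counts and renders them as a `mapred-site.xml` fragment. The two property names are the Hadoop 1.x (MRv1) TaskTracker slot settings. Everything else is an assumption made for illustration and is not taken from the paper: the helper names `slots_for_node` and `mapred_site_fragment`, the one-slot-per-core rule, and the `reduce_fraction` split are all hypothetical.

```python
from typing import Tuple


def slots_for_node(cores: int, reduce_fraction: float = 0.5) -> Tuple[int, int]:
    """Split a node's cores into (map_slots, reduce_slots).

    Hypothetical heuristic (not the paper's method): one slot per core,
    with `reduce_fraction` of the cores given to reduce tasks.
    """
    if cores < 2:
        raise ValueError("need at least two cores: one map slot, one reduce slot")
    if not 0.0 < reduce_fraction < 1.0:
        raise ValueError("reduce_fraction must be strictly between 0 and 1")
    reduce_slots = max(1, int(cores * reduce_fraction))
    map_slots = max(1, cores - reduce_slots)
    return map_slots, reduce_slots


def mapred_site_fragment(map_slots: int, reduce_slots: int) -> str:
    """Render the two MRv1 TaskTracker slot properties as XML text."""
    props = {
        "mapred.tasktracker.map.tasks.maximum": map_slots,
        "mapred.tasktracker.reduce.tasks.maximum": reduce_slots,
    }
    lines = ["<configuration>"]
    for name, value in props.items():
        lines += [
            "  <property>",
            f"    <name>{name}</name>",
            f"    <value>{value}</value>",
            "  </property>",
        ]
    lines.append("</configuration>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Example: an 8-core node with a quarter of its cores reserved for reducers.
    m, r = slots_for_node(8, reduce_fraction=0.25)
    print(mapred_site_fragment(m, r))
```

In a heterogeneous cluster, a sketch like this would be run once per node class, so that faster nodes advertise more slots than slower ones. Suitable values depend on the workload and would need to be measured rather than assumed.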