MapReduce is increasingly used for distributed data processing and has proven beneficial across a range of workloads. It can also be used for distributed traffic analysis, although network traces have characteristics that differ from the data typically processed with MapReduce. Motivated by the use of MapReduce for profiling application traffic, by the lack of evaluations of MapReduce for network traffic analysis, and by the peculiarities of this kind of data, this paper evaluates the performance of MapReduce for packet-level analysis and deep packet inspection (DPI), analysing its scalability, speed-up, and the behavior of the MapReduce phases. The experiments provide evidence of which phases dominate this kind of job and show the impact of input size, block size, and number of nodes on MapReduce completion time and scalability.