Most scientific data analyses involve processing voluminous data collected from various instruments. Efficient parallel and concurrent algorithms and frameworks are key to meeting the scalability and performance requirements of such analyses. The recently introduced MapReduce technique has gained considerable attention from the scientific community for its applicability to large-scale parallel data analyses. Although there are many evaluations of MapReduce on large textual data collections, only a few address scientific data analyses. The goals of this paper are twofold. First, we present our experience in applying the MapReduce technique to two scientific data analyses: (i) High Energy Physics data analyses and (ii) K-means clustering. Second, we present CGL-MapReduce, a streaming-based MapReduce implementation, and compare its performance with Hadoop.
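To make the K-means case concrete, the following is a minimal single-process sketch of how one K-means iteration can be phrased as a map step and a reduce step. It is an illustration only: it is not the CGL-MapReduce or Hadoop API, and the function names (`map_phase`, `reduce_phase`, `kmeans`), the tolerance, and the iteration cap are all assumptions made for this example.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, (point, 1)) for every point."""
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: squared_distance(p, centroids[i]))
        yield nearest, (p, 1)


def reduce_phase(pairs, centroids):
    """Reduce: sum the points assigned to each centroid and average them."""
    sums = {}
    for idx, (p, n) in pairs:
        if idx in sums:
            acc, count = sums[idx]
            sums[idx] = ([a + b for a, b in zip(acc, p)], count + n)
        else:
            sums[idx] = (list(p), n)
    updated = []
    for i, c in enumerate(centroids):
        if i in sums:
            acc, count = sums[i]
            updated.append(tuple(a / count for a in acc))
        else:
            # Assumption: a centroid with no assigned points is left unchanged.
            updated.append(tuple(c))
    return updated


def kmeans(points, centroids, max_iters=20, tol=1e-9):
    """Repeat map + reduce until the centroids stop moving (hypothetical driver)."""
    for _ in range(max_iters):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        shift = max(squared_distance(u, c) for u, c in zip(updated, centroids))
        centroids = updated
        if shift <= tol:
            break
    return centroids
```

Because every iteration repeats the same map and reduce steps over the same input points, an iterative analysis of this kind is sensitive to per-iteration overhead in the runtime, which is one reason it is a useful workload for comparing MapReduce implementations.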