The MapReduce programming model has simplified the implementation of many data-parallel applications. The simplicity of the programming model and the quality of service provided by many MapReduce implementations have attracted considerable enthusiasm in the distributed computing community. From years of experience applying MapReduce to various scientific applications, we identified a set of extensions to the programming model and improvements to its architecture that expand the applicability of MapReduce to more classes of applications. In this paper, we present the programming model and architecture of Twister, an enhanced MapReduce runtime that supports iterative MapReduce computations efficiently. We also compare the performance of Twister with that of similar runtimes, such as Hadoop and DryadLINQ, for large-scale data-parallel applications.