MapReduce is emerging as an important programming model for large-scale data-parallel applications such as web indexing, data mining, and scientific simulation. Hadoop is an open-source implementation of MapReduce enjoying wide adoption and is often used for short jobs where low response time is critical. Hadoop's performance is closely tied to its task scheduler, which implicitly assumes that cluster nodes are homogeneous and tasks make progress linearly, and uses these assumptions to decide when to speculatively re-execute tasks that appear to be stragglers. In practice, the homogeneity assumptions do not always hold. An especially compelling setting where this occurs is a virtualized data center, such as Amazon's Elastic Compute Cloud (EC2). We show that Hadoop's scheduler can cause severe performance degradation in heterogeneous environments. We design a new scheduling algorithm, Longest Approximate Time to End (LATE), that is highly robust to heterogeneity. LATE can improve Hadoop response times by a factor of 2 in clusters of 200 virtual machines on EC2.
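The name Longest Approximate Time to End suggests ranking running tasks by an estimate of their remaining time and speculating on the one expected to finish last. The following is a minimal sketch of that idea, not the scheduler's actual implementation. It assumes each task reports a progress fraction and an elapsed time, and that the remaining time can be approximated as (1 - progress) / (progress / elapsed); the `Task` fields and function names are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Task:
    """Hypothetical view of a running task (field names are assumptions)."""
    name: str
    progress: float  # fraction of work completed, in [0, 1]
    elapsed: float   # seconds since the task started


def estimated_time_left(task: Task) -> float:
    """Estimate remaining seconds, assuming the observed rate continues."""
    if task.progress <= 0.0 or task.elapsed <= 0.0:
        # No progress observed yet, so no finite estimate is possible.
        return float("inf")
    rate = task.progress / task.elapsed  # progress per second so far
    return (1.0 - task.progress) / rate


def pick_speculation_candidate(tasks: Sequence[Task]) -> Optional[Task]:
    """Return the unfinished task with the longest estimated time to end."""
    running = [t for t in tasks if t.progress < 1.0]
    if not running:
        return None
    return max(running, key=estimated_time_left)


if __name__ == "__main__":
    tasks = [
        Task("a", progress=0.9, elapsed=90.0),
        Task("b", progress=0.5, elapsed=100.0),
        Task("c", progress=0.2, elapsed=20.0),
    ]
    candidate = pick_speculation_candidate(tasks)
    print(candidate.name if candidate else "nothing to speculate on")
```

In this example, task c has the lowest progress but started recently and is advancing as fast as task a, so its estimated time left is about 80 seconds. Task b is advancing at half that rate and has about 100 seconds left, so the sketch selects b. A rule that looked only at the progress fraction would have picked c instead, which illustrates why an estimate based on remaining time can behave differently when nodes run at different speeds.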