PVM: a framework for parallel distributed computing
Concurrency: Practice and Experience
Encapsulation of parallelism in the Volcano query processing system
SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
Parallel database systems: the future of high performance database systems
Communications of the ACM
The CODE 2.0 graphical parallel programming language
ICS '92 Proceedings of the 6th international conference on Supercomputing
Paralex: an environment for parallel programming in distributed systems
ICS '92 Proceedings of the 6th international conference on Supercomputing
Loading databases using dataflow parallelism
ACM SIGMOD Record
Cilk: an efficient multithreaded runtime system
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
An overview of DB2 parallel edition
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Programming parallel algorithms
Communications of the ACM
Cluster-based scalable network services
Proceedings of the sixteenth ACM symposium on Operating systems principles
ACM Transactions on Computer Systems (TOCS)
ACM Transactions on Computer Systems (TOCS)
P-RIO: A Modular Parallel-Programming Environment
IEEE Concurrency
A Case for NOW (Networks of Workstations)
IEEE Micro
The Gamma Database Machine Project
IEEE Transactions on Knowledge and Data Engineering
Using Cohort-Scheduling to Enhance Server Performance
ATEC '02 Proceedings of the General Track of the annual conference on USENIX Annual Technical Conference
ICCD '02 Proceedings of the 2002 IEEE International Conference on Computer Design: VLSI in Computers and Processors (ICCD'02)
Cg: a system for programming graphics hardware in a C-like language
ACM SIGGRAPH 2003 Papers
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Parallel and Distributed Haskells
Journal of Functional Programming
Highly available, fault-tolerant, parallel dataflows
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Distributed computing in practice: the Condor experience: Research Articles
Concurrency and Computation: Practice & Experience - Grid Performance
Fault-tolerance in the Borealis distributed stream processing system
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Accelerator: using data parallelism to program GPUs for general-purpose uses
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Interpreting the data: Parallel analysis with Sawzall
Scientific Programming - Dynamic Grids and Worldwide Computing
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation - Volume 6
Autonomic operations in cooperative stream processing systems
HotAC II Hot Topics in Autonomic Computing on Hot Topics in Autonomic Computing
Streamware: programming general-purpose multicore processors using streams
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Developing a concurrent service orchestration engine in ccr
Proceedings of the 1st international workshop on Multicore software engineering
Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
Pig latin: a not-so-foreign language for data processing
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
DataLab: transactional data-parallel computing on an active storage cloud
HPDC '08 Proceedings of the 17th international symposium on High performance distributed computing
D3S: debugging deployed distributed systems
NSDI'08 Proceedings of the 5th USENIX Symposium on Networked Systems Design and Implementation
Dcell: a scalable and fault-tolerant network structure for data centers
Proceedings of the ACM SIGCOMM 2008 conference on Data communication
Automatic optimization of parallel dataflow programs
ATC'08 USENIX 2008 Annual Technical Conference on Annual Technical Conference
Toward loosely coupled programming on petascale systems
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
Clustera: an integrated computation and data management system
Proceedings of the VLDB Endowment
Scheduling shared scans of large data files
Proceedings of the VLDB Endowment
SCOPE: easy and efficient parallel processing of massive data sets
Proceedings of the VLDB Endowment
Ad-hoc data processing in the cloud
Proceedings of the VLDB Endowment
Large-scale collaborative analysis and extraction of web data
Proceedings of the VLDB Endowment
Data-Continuous SQL Process Model
OTM '08 Proceedings of the OTM 2008 Confederated International Conferences, CoopIS, DOA, GADA, IS, and ODBASE 2008. Part I on On the Move to Meaningful Internet Systems:
Gordon: using flash memory to build fast, power-efficient clusters for data-intensive applications
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Improving the responsiveness of internet services with automatic cache placement
Proceedings of the 4th ACM European conference on Computer systems
SLIPstream: scalable low-latency interactive perception on streaming data
Proceedings of the 18th international workshop on Network and operating systems support for digital audio and video
LiteRace: effective sampling for lightweight data-race detection
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
Harnessing parallelism in multicore clusters with the all-pairs and wavefront abstractions
Proceedings of the 18th ACM international symposium on High performance distributed computing
Abstract storage: moving file format-specific abstractions into petabyte-scale storage systems
Proceedings of the second international workshop on Data-aware distributed computing
Tashi: location-aware cluster management
ACDC '09 Proceedings of the 1st workshop on Automated control for datacenters and clouds
MapReduce optimization using regulated dynamic prioritization
Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems
Open-source grid technologies for web-scale computing
ACM SIGACT News
BotGraph: large scale spamming botnet detection
NSDI'09 Proceedings of the 6th USENIX symposium on Networked systems design and implementation
Privacy integrated queries: an extensible platform for privacy-preserving data analysis
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
A comparison of approaches to large-scale data analysis
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Generating example data for dataflow programs
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Distributed data-parallel computing using a high-level programming language
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Engineering the cloud from software modules
CLOUD '09 Proceedings of the 2009 ICSE Workshop on Software Engineering Challenges of Cloud Computing
On single-pass indexing with MapReduce
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Brief announcement: PUSH, a DISC shell
Proceedings of the 28th ACM symposium on Principles of distributed computing
BCube: a high performance, server-centric network architecture for modular data centers
Proceedings of the ACM SIGCOMM 2009 conference on Data communication
Why should we integrate services, servers, and networking in a data center?
Proceedings of the 1st ACM workshop on Research on enterprise networking
Query interactions in database workloads
Proceedings of the Second International Workshop on Testing Database Systems
Inferring Dataflow Properties of User Defined Table Processors
SAS '09 Proceedings of the 16th International Symposium on Static Analysis
A Data Parallel Algorithm for XML DOM Parsing
XSym '09 Proceedings of the 6th International XML Database Symposium on Database and XML Technologies
MapReduce Programming Model for .NET-Based Cloud Computing
Euro-Par '09 Proceedings of the 15th International Euro-Par Conference on Parallel Processing
Searching for Concurrent Design Patterns in Video Games
Euro-Par '09 Proceedings of the 15th International Euro-Par Conference on Parallel Processing
An In-Database Streaming Solution to Multi-camera Fusion
Globe '09 Proceedings of the 2nd International Conference on Data Management in Grid and Peer-to-Peer Systems
Efficiently support MapReduce-like computation models inside parallel DBMS
IDEAS '09 Proceedings of the 2009 International Database Engineering & Applications Symposium
MapReduce and parallel DBMSs: friends or foes?
Communications of the ACM - Amir Pnueli: Ahead of His Time
The multikernel: a new OS architecture for scalable multicore systems
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Distributed aggregation for data-parallel computing: interfaces and implementations
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Quincy: fair scheduling for distributed computing clusters
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
The nature of data center traffic: measurements & analysis
Proceedings of the 9th ACM SIGCOMM conference on Internet measurement conference
Composing and executing parallel data-flow graphs with shell pipes
Proceedings of the 4th Workshop on Workflows in Support of Large-Scale Science
Exploring many task computing in scientific workflows
Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers
Cloud technologies for bioinformatics applications
Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers
Nephele: efficient parallel data processing in the cloud
Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers
Query processing of massive trajectory data based on mapreduce
Proceedings of the first international workshop on Cloud data management
The design of distributed real-time video analytic system
Proceedings of the first international workshop on Cloud data management
Proceedings of the 10th ACM/IFIP/USENIX International Conference on Middleware
MDCube: a high performance network structure for modular data center interconnection
Proceedings of the 5th international conference on Emerging networking experiments and technologies
Proceedings of the 2nd International Conference on Interaction Sciences: Information Technology, Culture and Human
Proceedings of the VLDB Endowment
How best to build web-scale data managers?
Proceedings of the VLDB Endowment
Distributed online aggregations
Proceedings of the VLDB Endowment
Biomedical Case Studies in Data Intensive Computing
CloudCom '09 Proceedings of the 1st International Conference on Cloud Computing
Thermal analysis of multiprocessor SoC applications by simulation and verification
ACM Transactions on Design Automation of Electronic Systems (TODAES)
SBotMiner: large scale search bot detection
Proceedings of the third ACM international conference on Web search and data mining
Exploiting multi-level parallelism for low-latency activity recognition in streaming video
MMSys '10 Proceedings of the first annual ACM SIGMM conference on Multimedia systems
Controlling your TV with gestures
Proceedings of the international conference on Multimedia information retrieval
RunTest: assuring integrity of dataflow processing in cloud computing infrastructures
ASIACCS '10 Proceedings of the 5th ACM Symposium on Information, Computer and Communications Security
Boom analytics: exploring data-centric, declarative programming for the cloud
Proceedings of the 5th European conference on Computer systems
HadoopToSQL: a mapReduce query optimizer
Proceedings of the 5th European conference on Computer systems
Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling
Proceedings of the 5th European conference on Computer systems
Towards scalable architectures for clickstream data warehousing
DNIS'07 Proceedings of the 5th international conference on Databases in networked information systems
Beyond online aggregation: parallel and incremental data mining with online Map-Reduce
Proceedings of the 2010 Workshop on Massive Data Analytics on the Cloud
FlumeJava: easy, efficient data-parallel pipelines
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Stateful bulk processing for incremental analytics
Proceedings of the 1st ACM symposium on Cloud computing
Comet: batched stream processing for data intensive distributed computing
Proceedings of the 1st ACM symposium on Cloud computing
Skew-resistant parallel processing of feature-extracting scientific user-defined functions
Proceedings of the 1st ACM symposium on Cloud computing
Fluxo: a system for internet service programming by non-expert developers
Proceedings of the 1st ACM symposium on Cloud computing
Nephele/PACTs: a programming model and execution framework for web-scale analytical processing
Proceedings of the 1st ACM symposium on Cloud computing
Making cloud intermediate data fault-tolerant
Proceedings of the 1st ACM symposium on Cloud computing
Pregel: a system for large-scale graph processing
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
ParaTimer: a progress indicator for MapReduce DAGs
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Integrating hadoop and parallel DBMs
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Large graph processing in the cloud
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Predictable time-sharing for DryadLINQ cluster
Proceedings of the 7th international conference on Autonomic computing
Privacy integrated queries: an extensible platform for privacy-preserving data analysis
Communications of the ACM
Middleware'09 Proceedings of the ACM/IFIP/USENIX 10th international conference on Middleware
APHID: An architecture for private, high-performance integrated data mining
Future Generation Computer Systems
Parallel programming framework for large batch transaction processing on scale-out systems
Proceedings of the 3rd Annual Haifa Experimental Systems Conference
Toward a cost-effective cloud storage service
ICACT'10 Proceedings of the 12th international conference on Advanced communication technology
Middleware support for many-task computing
Cluster Computing
File-Access Characteristics of Data-Intensive Workflow Applications
CCGRID '10 Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
A Map-Reduce System with an Alternate API for Multi-core Environments
CCGRID '10 Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
Symbiotic routing in future data centers
Proceedings of the ACM SIGCOMM 2010 conference
Proceedings of the second ACM SIGCOMM workshop on Networking, systems, and applications on mobile handhelds
MRAP: a novel MapReduce-based framework to support HPC analytics applications with access patterns
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
AzureBlast: a case study of developing science applications on the cloud
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Cloud computing paradigms for pleasingly parallel biomedical applications
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Weaver: integrating distributed computing abstractions into scientific workflows using Python
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Twister: a runtime for iterative MapReduce
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Tiled-MapReduce: optimizing resource usages of data-parallel applications on multicore with tiling
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Refactoring human roles solves systems problems
HotCloud'09 Proceedings of the 2009 conference on Hot topics in cloud computing
In search of an API for scalable file systems: under the table or above it?
HotCloud'09 Proceedings of the 2009 conference on Hot topics in cloud computing
A common substrate for cluster computing
HotCloud'09 Proceedings of the 2009 conference on Hot topics in cloud computing
DryadInc: reusing work in large-scale computations
HotCloud'09 Proceedings of the 2009 conference on Hot topics in cloud computing
HotDep'08 Proceedings of the Fourth conference on Hot topics in system dependability
HotOS'09 Proceedings of the 12th conference on Hot topics in operating systems
On availability of intermediate data in cloud computations
HotOS'09 Proceedings of the 12th conference on Hot topics in operating systems
FLUXO: a simple service compiler
HotOS'09 Proceedings of the 12th conference on Hot topics in operating systems
Hedera: dynamic flow scheduling for data center networks
NSDI'10 Proceedings of the 7th USENIX conference on Networked systems design and implementation
NSDI'10 Proceedings of the 7th USENIX conference on Networked systems design and implementation
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Finding and reproducing Heisenbugs in concurrent programs
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Stout: an adaptive interface to scalable cloud storage
USENIXATC'10 Proceedings of the 2010 USENIX conference on USENIX annual technical conference
Hunting for problems with Artemis
WASL'08 Proceedings of the First USENIX conference on Analysis of system logs
Spark: cluster computing with working sets
HotCloud'10 Proceedings of the 2nd USENIX conference on Hot topics in cloud computing
Scripting the cloud with skywriting
HotCloud'10 Proceedings of the 2nd USENIX conference on Hot topics in cloud computing
Scalable clustering algorithm for N-body simulations in a shared-nothing cluster
SSDBM'10 Proceedings of the 22nd international conference on Scientific and statistical database management
Distributed stream processing with DUP
NPC'10 Proceedings of the 2010 IFIP international conference on Network and parallel computing
Proceedings of the FSE/SDP workshop on Future of software engineering research
XML structural similarity search using mapreduce
WAIM'10 Proceedings of the 11th international conference on Web-age information management
High throughput data-compression for cloud storage
Globe'10 Proceedings of the Third international conference on Data management in grid and peer-to-peer systems
Optimizing the pre-processing of scientific visualization techniques using QEF
Proceedings of the 8th International Workshop on Middleware for Grids, Clouds and e-Science
A middleware for parallel processing of large graphs
Proceedings of the 8th International Workshop on Middleware for Grids, Clouds and e-Science
BlobSeer: Next-generation data management for large scale infrastructures
Journal of Parallel and Distributed Computing
HaLoop: efficient iterative data processing on large clusters
Proceedings of the VLDB Endowment
Hadoop++: making a yellow elephant run like a cheetah (without it even noticing)
Proceedings of the VLDB Endowment
DataGarage: warehousing massive performance data on commodity servers
Proceedings of the VLDB Endowment
Massively parallel data analysis with PACTs on Nephele
Proceedings of the VLDB Endowment
Scalable information extraction for web queries
International Journal of Computational Science and Engineering
Knuckles: bringing the database to the data
International Journal of Computational Science and Engineering
Nectar: automatic management of data and computation in datacenters
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Large-scale incremental processing using distributed transactions and notifications
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Reining in the outliers in map-reduce clusters using Mantri
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Piccolo: building fast, distributed programs with partitioned tables
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
SEATTLE: A Scalable Ethernet Architecture for Large Enterprises
ACM Transactions on Computer Systems (TOCS)
Searching the searchers with searchaudit
USENIX Security'10 Proceedings of the 19th USENIX conference on Security
Batch query processing for web search engines
Proceedings of the fourth ACM international conference on Web search and data mining
Scientific Programming - Exploring Languages for Expressing Medium to Massive On-Chip Parallelism
A domain-specific approach to heterogeneous parallelism
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
Proceedings of the 9th Annual Workshop on Network and Systems Support for Games
CPLDP: an efficient large dataset processing system built on cloud platform
ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications - Volume Part II
Map-reduce extensions and recursive queries
Proceedings of the 14th International Conference on Extending Database Technology
Energy-delay based provisioning for large datacenters: an energy-efficient and cost optimal approach
Proceedings of the 2nd ACM/SPEC International Conference on Performance engineering
Scalable and cost-effective interconnection of data-center servers using dual server ports
IEEE/ACM Transactions on Networking (TON)
Semi-supervised truth discovery
Proceedings of the 20th international conference on World wide web
FACTO: a fact lookup engine based on web tables
Proceedings of the 20th international conference on World wide web
Application-Tailored I/O with Streamline
ACM Transactions on Computer Systems (TOCS)
Scarlett: coping with skewed content popularity in mapreduce clusters
Proceedings of the sixth conference on Computer systems
Optimizing intermediate data management in MapReduce computations
Proceedings of the First International Workshop on Cloud Computing Platforms
ASTERIX: towards a scalable, semistructured data platform for evolving-world models
Distributed and Parallel Databases
TritonSort: a balanced large-scale sorting system
Proceedings of the 8th USENIX conference on Networked systems design and implementation
CIEL: a universal execution engine for distributed data-flow computing
Proceedings of the 8th USENIX conference on Networked systems design and implementation
Mesos: a platform for fine-grained resource sharing in the data center
Proceedings of the 8th USENIX conference on Networked systems design and implementation
Dominant resource fairness: fair allocation of multiple resource types
Proceedings of the 8th USENIX conference on Networked systems design and implementation
An overview of business intelligence technology
Communications of the ACM
Towards improved load balancing for data intensive distributed computing
Proceedings of the 2011 ACM Symposium on Applied Computing
Brasil: basic resource aggregation system infrastructure layer
Proceedings of the 1st International Workshop on Runtime and Operating Systems for Supercomputers
A latency and fault-tolerance optimizer for online parallel query plans
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Schedule optimization for data processing flows on the cloud
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Processing theta-joins using MapReduce
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Fast personalized PageRank on MapReduce
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
A batch of PNUTS: experiences connecting cloud batch and serving systems
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Automated partitioning design in parallel database systems
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Structuring the unstructured middle with chunk computing
HotOS'13 Proceedings of the 13th USENIX conference on Hot topics in operating systems
Disk-locality in datacenter computing considered irrelevant
HotOS'13 Proceedings of the 13th USENIX conference on Hot topics in operating systems
Optimizing data partitioning for data-parallel computing
HotOS'13 Proceedings of the 13th USENIX conference on Hot topics in operating systems
Non-deterministic parallelism considered useful
HotOS'13 Proceedings of the 13th USENIX conference on Hot topics in operating systems
Operating systems must support GPU abstractions
HotOS'13 Proceedings of the 13th USENIX conference on Hot topics in operating systems
Steno: automatic optimization of declarative queries
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
MDR: performance model driven runtime for heterogeneous parallel platforms
Proceedings of the international conference on Supercomputing
Adaptive data-driven service integrity attestation for multi-tenant cloud systems
Proceedings of the Nineteenth International Workshop on Quality of Service
Otus: resource attribution in data-intensive clusters
Proceedings of the second international workshop on MapReduce and its applications
Proceedings of the second international workshop on MapReduce and its applications
The case for being lazy: how to leverage lazy evaluation in MapReduce
Proceedings of the 2nd international workshop on Scientific cloud computing
ARIA: automatic resource inference and allocation for mapreduce environments
Proceedings of the 8th ACM international conference on Autonomic computing
Odessa: enabling interactive perception applications on mobile devices
MobiSys '11 Proceedings of the 9th international conference on Mobile systems, applications, and services
HiTune: dataflow-based performance analysis for big data cloud
USENIXATC'11 Proceedings of the 2011 USENIX conference on USENIX annual technical conference
In-situ MapReduce for log processing
USENIXATC'11 Proceedings of the 2011 USENIX conference on USENIX annual technical conference
G2: a graph processing system for diagnosing distributed systems
USENIXATC'11 Proceedings of the 2011 USENIX conference on USENIX annual technical conference
TidyFS: a simple and small distributed file system
USENIXATC'11 Proceedings of the 2011 USENIX conference on USENIX annual technical conference
Cost optimized provisioning of elastic resources for application workflows
Future Generation Computer Systems
Remote sensing image information mining with HPC cluster and DryadLINQ
Proceedings of the 49th Annual Southeast Regional Conference
Better never than late: meeting deadlines in datacenter networks
Proceedings of the ACM SIGCOMM 2011 conference
Managing data transfers in computer clusters with orchestra
Proceedings of the ACM SIGCOMM 2011 conference
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Data-intensive science: The Terapixel and MODISAzure projects
International Journal of High Performance Computing Applications
New ideas track: testing mapreduce-style programs
Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering
CloudFuice: a flexible cloud-based data integration system
ICWE'11 Proceedings of the 11th international conference on Web engineering
On the benefits of transparent compression for cost-effective cloud data storage
Transactions on large-scale data- and knowledge-centered systems III
Tagged mapreduce: efficiently computing multi-analytics using mapreduce
DaWaK'11 Proceedings of the 13th international conference on Data warehousing and knowledge discovery
Disco: a computing platform for large-scale data analytics
Proceedings of the 10th ACM SIGPLAN workshop on Erlang
Proceedings of the 4th ACM symposium on Haskell
Mining large distributed log data in near real time
SLAML '11 Managing Large-scale Systems via the Analysis of System Logs and the Application of Machine Learning Techniques
Proceedings of the 2nd ACM Symposium on Cloud Computing
PrIter: a distributed framework for prioritized iterative computations
Proceedings of the 2nd ACM Symposium on Cloud Computing
Orleans: cloud computing for everyone
Proceedings of the 2nd ACM Symposium on Cloud Computing
Small cache, big effect: provable load balancing for randomly partitioned cluster services
Proceedings of the 2nd ACM Symposium on Cloud Computing
Utilizing green energy prediction to schedule mixed batch and service jobs in data centers
HotPower '11 Proceedings of the 4th Workshop on Power-Aware Computing and Systems
Query engine grid for executing SQL streaming process
Globe'11 Proceedings of the 4th international conference on Data management in grid and peer-to-peer systems
PTask: operating system abstractions to manage GPUs as compute devices
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
Detecting failures in distributed systems with the Falcon spy network
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
Fay: extensible distributed tracing from kernels to clusters
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
CrowdForge: crowdsourcing complex work
Dryad is a general-purpose distributed execution engine for coarse-grain data-parallel applications. A Dryad application combines computational "vertices" with communication "channels" to form a dataflow graph. Dryad runs the application by executing the vertices of this graph on a set of available computers, communicating as appropriate through files, TCP pipes, and shared-memory FIFOs. The vertices provided by the application developer are quite simple and are usually written as sequential programs with no thread creation or locking. Concurrency arises from Dryad scheduling vertices to run simultaneously on multiple computers, or on multiple CPU cores within a computer. The application can discover the size and placement of data at run time, and modify the graph as the computation progresses to make efficient use of the available resources. Dryad is designed to scale from powerful multi-core single computers, through small clusters of computers, to data centers with thousands of computers. The Dryad execution engine handles all the difficult problems of creating a large distributed, concurrent application: scheduling the use of computers and their CPUs, recovering from communication or computer failures, and transporting data between vertices.
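The vertex-and-channel model described above can be illustrated with a toy, single-machine sketch. This is not Dryad's API: the `Graph` class, its `add_vertex` and `run` methods, and the word-count vertices are hypothetical names invented for this illustration, and a thread pool stands in for a cluster of computers. The sketch shows only the core idea that vertices are plain sequential functions and that concurrency comes from the scheduler running every vertex whose inputs are ready.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


class Graph:
    """Toy dataflow graph (hypothetical, not Dryad's API).

    Vertices are plain sequential functions; channels are the edges
    that carry one vertex's output to the vertices downstream of it.
    """

    def __init__(self):
        self.funcs = {}   # vertex name -> function(list of inputs) -> output
        self.inputs = {}  # vertex name -> ordered list of upstream vertex names

    def add_vertex(self, name, func, inputs=()):
        self.funcs[name] = func
        self.inputs[name] = list(inputs)

    def run(self, max_workers=4):
        done = {}                 # vertex name -> output value
        pending = set(self.funcs)
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            while pending:
                # A vertex is ready once every upstream vertex has finished.
                ready = [v for v in pending
                         if all(u in done for u in self.inputs[v])]
                if not ready:
                    raise ValueError("graph has a cycle or a missing vertex")
                # Ready vertices are independent, so they run concurrently.
                futures = {
                    v: pool.submit(self.funcs[v],
                                   [done[u] for u in self.inputs[v]])
                    for v in ready
                }
                for v, fut in futures.items():
                    done[v] = fut.result()
                pending.difference_update(ready)
        return done


def make_source(lines):
    # Source vertex: ignores its (empty) inputs and emits a fixed partition.
    return lambda _inputs: lines


def count_words(inputs):
    # Sequential vertex: no threads or locks inside the vertex itself.
    (lines,) = inputs
    return Counter(word for line in lines for word in line.split())


def merge_counts(inputs):
    total = Counter()
    for partial in inputs:
        total.update(partial)
    return total


g = Graph()
g.add_vertex("read0", make_source(["a b a"]))
g.add_vertex("read1", make_source(["b c"]))
g.add_vertex("count0", count_words, inputs=["read0"])
g.add_vertex("count1", count_words, inputs=["read1"])
g.add_vertex("merge", merge_counts, inputs=["count0", "count1"])
result = g.run()["merge"]
```

The sketch deliberately omits what the abstract attributes to the real engine: placement across machines, the choice of channel transport, failure recovery, and run-time modification of the graph.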