Modern Information Retrieval
Fast matrix multiplies using graphics hardware
Proceedings of the 2001 ACM/IEEE conference on Supercomputing
Fast computation of database operations using graphics processors
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Brook for GPUs: stream computing on graphics hardware
ACM SIGGRAPH 2004 Papers
Automatic Tuning Matrix Multiplication Performance on Graphics Hardware
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
ACM SIGGRAPH 2006 Papers
GPUTeraSort: high performance graphics co-processor sorting for large database management
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Query co-processing on commodity processors
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Accelerator: using data parallelism to program GPUs for general-purpose uses
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Map-reduce-merge: simplified relational data processing on large clusters
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Scan primitives for GPU computing
Proceedings of the 22nd ACM SIGGRAPH/EUROGRAPHICS symposium on Graphics hardware
Fast Schedulability Analysis Using Commodity Graphics Hardware
RTCSA '07 Proceedings of the 13th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications
Streaming Algorithms for Biological Sequence Alignment on GPUs
IEEE Transactions on Parallel and Distributed Systems
Evaluating MapReduce for Multi-core and Multiprocessor Systems
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
MapReduce: simplified data processing on large clusters
Communications of the ACM - 50th anniversary issue: 1958 - 2008
Merge: a programming model for heterogeneous multi-core systems
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Efficient gather and scatter operations on graphics processors
Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Relational joins on graphics processors
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Gpu gems 3
Using graphics processors for high performance IR query processing
Proceedings of the 18th international conference on World wide web
Supporting MapReduce on large-scale asymmetric multi-core clusters
ACM SIGOPS Operating Systems Review
A translation system for enabling data mining applications on GPUs
Proceedings of the 23rd international conference on Supercomputing
MapReduce optimization using regulated dynamic prioritization
Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems
Frequent itemset mining on graphics processors
Proceedings of the Fifth International Workshop on Data Management on New Hardware
Towards Efficient MapReduce Using MPI
Proceedings of the 16th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Evaluating SPLASH-2 Applications Using MapReduce
APPT '09 Proceedings of the 8th International Symposium on Advanced Parallel Processing Technologies
A Skeletal Parallel Framework with Fusion Optimizer for GPGPU Programming
APLAS '09 Proceedings of the 7th Asian Symposium on Programming Languages and Systems
FPMR: MapReduce framework on FPGA
Proceedings of the 18th annual ACM/SIGDA international symposium on Field programmable gate arrays
Accelerating SQL database operations on a GPU with CUDA
Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units
Proceedings of the 24th ACM International Conference on Supercomputing
Assigning tasks for efficiency in Hadoop: extended abstract
Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
Misco: a MapReduce framework for mobile systems
Proceedings of the 3rd International Conference on PErvasive Technologies Related to Assistive Environments
Designing Accelerator-Based Distributed Systems for High Performance
CCGRID '10 Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
A Map-Reduce System with an Alternate API for Multi-core Environments
CCGRID '10 Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
MapReduce for the cell broadband engine architecture
IBM Journal of Research and Development
Run-time optimizations for replicated dataflows on heterogeneous environments
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Multi-GPU volume rendering using MapReduce
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
MR-scope: a real-time tracing tool for MapReduce
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
MapCG: writing parallel program portable between CPU and GPU
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Tiled-MapReduce: optimizing resource usages of data-parallel applications on multicore with tiling
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Distributed systems meet economics: pricing in the cloud
HotCloud'10 Proceedings of the 2nd USENIX conference on Hot topics in cloud computing
A MapReduce approach to Gi*(d) spatial statistic
Proceedings of the ACM SIGSPATIAL International Workshop on High Performance and Distributed Geographic Information Systems
Proceedings of the ACM SIGSPATIAL International Workshop on GeoStreaming
GPU-accelerated predicate evaluation on column store
WAIM'10 Proceedings of the 11th international conference on Web-age information management
Functional Partitioning to Optimize End-to-End Performance on Many-core Architectures
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
A capabilities-aware framework for using computational accelerators in data-intensive computing
Journal of Parallel and Distributed Computing
Database compression on graphics processors
Proceedings of the VLDB Endowment
Dynamic proportional share scheduling in Hadoop
JSSPP'10 Proceedings of the 15th international conference on Job scheduling strategies for parallel processing
GPU-based FFT computation for multi-gigabit wirelessHD baseband processing
EURASIP Journal on Wireless Communications and Networking
Variable-sized map and locality-aware reduce on public-resource grids
Future Generation Computer Systems
A load-aware scheduler for MapReduce framework in heterogeneous cloud environments
Proceedings of the 2011 ACM Symposium on Applied Computing
Garbage collection auto-tuning for Java mapreduce on multi-cores
Proceedings of the international symposium on Memory management
Phoenix++: modular MapReduce for shared-memory systems
Proceedings of the second international workshop on MapReduce and its applications
Scheduling for real-time mobile MapReduce systems
Proceedings of the 5th ACM international conference on Distributed event-based system
PTask: operating system abstractions to manage GPUs as compute devices
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
Benchmarking MapReduce Implementations for Application Usage Scenarios
GRID '11 Proceedings of the 2011 IEEE/ACM 12th International Conference on Grid Computing
Parallel data processing with MapReduce: a survey
ACM SIGMOD Record
More convenient more overhead: the performance evaluation of Hadoop streaming
Proceedings of the 2011 ACM Symposium on Research in Applied Computation
Variable-Sized map and locality-aware reduce on public-resource grids
GPC'10 Proceedings of the 5th international conference on Advances in Grid and Pervasive Computing
An overview of Medusa: simplified graph processing on GPUs
Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming
Tarazu: optimizing MapReduce on heterogeneous clusters
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
DVM: towards a datacenter-scale virtual machine
VEE '12 Proceedings of the 8th ACM SIGPLAN/SIGOPS conference on Virtual Execution Environments
Ameliorating memory contention of OLAP operators on GPU processors
DaMoN '12 Proceedings of the Eighth International Workshop on Data Management on New Hardware
X-device query processing by bitwise distribution
DaMoN '12 Proceedings of the Eighth International Workshop on Data Management on New Hardware
Mapping a data-flow programming model onto heterogeneous platforms
Proceedings of the 13th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, Tools and Theory for Embedded Systems
Compiler and runtime support for enabling reduction computations on heterogeneous systems
Concurrency and Computation: Practice & Experience
Optimizing MapReduce for GPUs with effective shared memory usage
Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing
A Map-Reduce Based Framework for Heterogeneous Processing Element Cluster Environments
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
Optimizing dataflow applications on heterogeneous environments
Cluster Computing
Parallel rough set based knowledge acquisition using MapReduce from big data
Proceedings of the 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications
Hierarchical merge for scalable MapReduce
Proceedings of the 2012 workshop on Management of big data systems
HSim: A MapReduce simulator in enabling Cloud Computing
Future Generation Computer Systems
Accelerating MapReduce on a coupled CPU-GPU architecture
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Multimedia Applications and Security in MapReduce: Opportunities and Challenges
Concurrency and Computation: Practice & Experience
G-Hadoop: MapReduce across distributed data centers for data-intensive computing
Future Generation Computer Systems
Assessing MapReduce for Internet Computing: A Comparison of Hadoop and BitDew-MapReduce
GRID '12 Proceedings of the 2012 ACM/IEEE 13th International Conference on Grid Computing
Accelerating text mining workloads in a MapReduce-based distributed GPU environment
Journal of Parallel and Distributed Computing
Cogset: a high performance MapReduce engine
Concurrency and Computation: Practice & Experience
Parallel approaches to machine learning-A comprehensive survey
Journal of Parallel and Distributed Computing
Grex: An efficient MapReduce framework for graphics processing units
Journal of Parallel and Distributed Computing
Makeflow: a portable abstraction for data intensive computing on clusters, clouds, and grids
Proceedings of the 1st ACM SIGMOD Workshop on Scalable Workflow Execution Engines and Technologies
Tiled-MapReduce: Efficient and Flexible MapReduce Processing on Multicore with Tiling
ACM Transactions on Architecture and Code Optimization (TACO)
Producer-Consumer: the programming model for future many-core processors
ARCS'13 Proceedings of the 26th international conference on Architecture of Computing Systems
OWL: cooperative thread array aware scheduling techniques for improving GPGPU performance
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Comparison based sorting for systems with multiple GPUs
Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units
Valar: a benchmark suite to study the dynamic behavior of heterogeneous systems
Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units
Active disk meets flash: a case for intelligent SSDs
Proceedings of the 27th international ACM conference on International conference on supercomputing
Scaling large-data computations on multi-GPU accelerators
Proceedings of the 27th international ACM conference on International conference on supercomputing
Orchestrated scheduling and prefetching for GPGPUs
Proceedings of the 40th Annual International Symposium on Computer Architecture
Proceedings of the 2013 ACM SIGSIM conference on Principles of advanced discrete simulation
MapReduce with communication overlap (MaRCO)
Journal of Parallel and Distributed Computing
Cloud MapReduce for particle filter-based data assimilation for wildfire spread simulation
Proceedings of the High Performance Computing Symposium
Data-Intensive Cloud Computing: Requirements, Expectations, Challenges, and Solutions
Journal of Grid Computing
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
ACM SIGOPS 24th Symposium on Operating Systems Principles
Dandelion: a compiler and runtime for heterogeneous systems
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
Neither more nor less: optimizing thread-level parallelism for GPGPUs
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
Evaluating integrated graphics processors for data center workloads
Proceedings of the Workshop on Power-Aware Computing and Systems
Accelerate MapReduce on GPUs with multi-level reduction
Proceedings of the 5th Asia-Pacific Symposium on Internetware
Efficient co-processor utilization in database query processing
Information Systems
Parallel graph processing on graphics processors made easy
Proceedings of the VLDB Endowment
Why it is time for a HyPE: a hybrid query processing engine for efficient GPU coprocessing in DBMS
Proceedings of the VLDB Endowment
A locality-aware memory hierarchy for energy-efficient GPU architectures
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
Data-parallel finite-state machines
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Red Fox: An Execution Environment for Relational Query Processing on GPUs
Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization
Achieving Accountable MapReduce in cloud computing
Future Generation Computer Systems
A compound OpenMP/MPI program development toolkit for hybrid CPU/GPU clusters
The Journal of Supercomputing
CPU+GPU scheduling with asymptotic profiling
Parallel Computing
International Journal of Approximate Reasoning
Hi-index | 0.00 |
We design and implement Mars, a MapReduce framework, on graphics processors (GPUs). MapReduce is a distributed programming framework originally proposed by Google for the ease of development of web search applications on a large number of commodity CPUs. Compared with CPUs, GPUs have an order of magnitude higher computation power and memory bandwidth, but are harder to program since their architectures are designed as a special-purpose co-processor and their programming interfaces are typically for graphics applications. As the first attempt to harness GPU's power for MapReduce, we developed Mars on an NVIDIA G80 GPU, which contains over one hundred processors, and evaluated it in comparison with Phoenix, the state-of-the-art MapReduce framework on multi-core CPUs. Mars hides the programming complexity of the GPU behind the simple and familiar MapReduce interface. It is up to 16 times faster than its CPU-based counterpart for six common web applications on a quad-core machine.