GPUTeraSort: high performance graphics co-processor sorting for large database management
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Evaluating MapReduce for Multi-core and Multiprocessor Systems
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
Mars: a MapReduce framework on graphics processors
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
CellMR: A framework for supporting mapreduce on asymmetric cell-based clusters
IPDPS '09 Proceedings of the 2009 IEEE International Symposium on Parallel&Distributed Processing
MapReduce: a flexible data processing tool
Communications of the ACM - Amir Pnueli: Ahead of His Time
FPMR: MapReduce framework on FPGA
Proceedings of the 18th annual ACM/SIGDA international symposium on Field programmable gate arrays
Research on k-means Clustering Algorithm: An Improved k-means Clustering Algorithm
IITSI '10 Proceedings of the 2010 Third International Symposium on Intelligent Information Technology and Security Informatics
MapCG: writing parallel program portable between CPU and GPU
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Tiled-MapReduce: optimizing resource usages of data-parallel applications on multicore with tiling
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
CUDA by Example: An Introduction to General-Purpose GPU Programming
CUDA by Example: An Introduction to General-Purpose GPU Programming
Hybrid Map Task Scheduling for GPU-Based Heterogeneous Clusters
CLOUDCOM '10 Proceedings of the 2010 IEEE Second International Conference on Cloud Computing Technology and Science
Hadoop in Action
Using Shared Memory to Accelerate MapReduce on Graphics Processing Units
IPDPS '11 Proceedings of the 2011 IEEE International Parallel & Distributed Processing Symposium
Multi-GPU MapReduce on GPU Clusters
IPDPS '11 Proceedings of the 2011 IEEE International Parallel & Distributed Processing Symposium
Optimizing MapReduce for GPUs with effective shared memory usage
Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing
Grex: An efficient MapReduce framework for graphics processing units
Journal of Parallel and Distributed Computing
Hi-index | 0.00 |
With Graphics Processing Units (GPUs) becoming more and more popular in general purpose computing, more attentions have been paid on building a framework to provide convenient interfaces for GPU programming. MapReduce can greatly simplify the programming for data-parallel applications in cloud computing environment, and it is also naturally suitable for GPUs. However, there are some problems in recent reduction-based MapReduce implementation on GPUs. Its performance is dramatically degraded when handling massive distinct keys because the massive data cannot be stored in tiny shared memory entirely. A new MapReduce framework on GPUs, called Jupiter, is proposed with continuous reduction structure. Two improvements are supported in Jupiter, a multi-level reduction scheme tailored for GPU memory hierarchy and a frequency-based cache policy on key-value pairs in shared memory. Shared memories are utilized efficiently for various data-parallel applications whether involving little or abundant distinct keys. Experiments show that Jupiter can achieve up to 3x speedup over the original reduction-based GPU MapReduce framework on the applications with lots of distinct keys.