Decision Support System (DSS) workloads, which process large data sets, are known to be among the most time-consuming database workloads. Traditionally, DSS queries have been accelerated using large-scale multiprocessors. This work analyzes the benefits of using high-performance, low-cost processors such as GPUs and the Cell/BE to accelerate DSS query execution. To reduce the programming effort of developing code for different architectures, we explore the use of RapidMind, a platform that allows the same program to run on both the Cell/BE and GPUs. To this end, we propose data-parallel versions of the original database scan and join algorithms. In our experiments we compare the execution of three queries from the standard DSS benchmark TPC-H on two systems with two different GPU models, a system with the Cell/BE processor, and a system with dual quad-core Xeon processors. The results show that the GPUs exploit the available parallelism well, with observed speedups of up to 21x over a single-processor system.
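To make the idea of a data-parallel scan concrete, the following is a minimal sketch of one common formulation: evaluate the predicate on every row independently, compute an exclusive prefix sum over the match flags, and scatter matching row ids to their output slots. This is not the authors' RapidMind code; the function name `data_parallel_select` and its parameters are hypothetical, and the steps are written sequentially in Python only to show the structure that a GPU or Cell/BE version would run in parallel.

```python
from itertools import accumulate


def data_parallel_select(column, predicate):
    """Selection scan written as map -> exclusive prefix sum -> scatter.

    Illustrative only: each step is a data-parallel primitive, executed
    here sequentially. Names are assumptions, not taken from the paper.
    """
    if not column:
        return []

    # Step 1 (map): every row is tested independently of the others,
    # so this step maps onto one stream element per row.
    flags = [1 if predicate(value) else 0 for value in column]

    # Step 2 (scan): the exclusive prefix sum assigns each matching row
    # its output slot without a shared counter.
    inclusive = list(accumulate(flags))
    offsets = [0] + inclusive[:-1]
    total_matches = inclusive[-1]

    # Step 3 (scatter): each matching row writes its row id to its slot.
    result = [None] * total_matches
    for row_id, (flag, slot) in enumerate(zip(flags, offsets)):
        if flag:
            result[slot] = row_id
    return result


# Hypothetical usage: row ids of values greater than 6.
matching_rows = data_parallel_select([5, 12, 7, 30, 1], lambda v: v > 6)
```

The prefix sum is what removes the sequential dependency of a conventional scan loop, where each match must know how many matches came before it. The same flag-and-offset structure could also compact the output of a join's probe phase, though the paper's own join algorithm may differ.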