The subset-sum problem is a well-known NP-complete decision problem. Many parallel algorithms have been developed to solve it within a reasonable computation time, and some of them have been implemented on GPUs. However, these GPU implementations may fail to use all the CPU cores and the GPU resources at the same time. While the GPU performs a task, only one CPU core is used to control the GPU and the remaining CPU cores sit idle, so a large amount of available CPU capacity is wasted. This paper proposes a novel CPU-GPU cooperative implementation of a parallel two-list algorithm that solves the subset-sum problem efficiently on a heterogeneous CPU-GPU system and makes use of all the available computational resources of both the CPUs and the GPUs. To find the most appropriate task distribution ratio between CPUs and GPUs, the paper establishes an optimal task distribution model. A series of experiments is conducted on two different hardware platforms. The experimental results show that the CPU-GPU cooperative implementation produces a speedup factor of 9.2 over the best sequential implementation, achieves up to 96.3% performance improvement over the optimized CPU-only implementation, and yields up to 25.7% performance improvement over the optimized GPU-only implementation.
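The abstract names the two-list algorithm without describing it. For orientation, the following is a minimal sequential sketch of the classic two-list (meet-in-the-middle) idea for subset-sum. It is not the paper's CPU-GPU implementation; the function names `subset_sums` and `two_list_subset_sum` are illustrative assumptions, and the parallel generation, the pruning, and the task distribution between CPUs and GPUs are all omitted.

```python
def subset_sums(items):
    """Return the sums of all 2**len(items) subsets (duplicates kept)."""
    sums = [0]
    for w in items:
        # Each new item doubles the list: every old sum, with and without w.
        sums += [s + w for s in sums]
    return sums


def two_list_subset_sum(weights, target):
    """Decide whether some subset of `weights` sums exactly to `target`.

    Hypothetical helper for illustration only (sequential, no GPU).
    """
    half = len(weights) // 2
    # Generation stage: one list per half, A ascending and B descending.
    a = sorted(subset_sums(weights[:half]))
    b = sorted(subset_sums(weights[half:]), reverse=True)
    # Search stage: a single pass over both lists.
    i, j = 0, 0
    while i < len(a) and j < len(b):
        total = a[i] + b[j]
        if total == target:
            return True
        if total < target:
            i += 1  # sum too small: move to a larger entry of A
        else:
            j += 1  # sum too large: move to a smaller entry of B
    return False
```

Each list holds about 2^(n/2) sums for n items, which is where the time and memory cost of the two-list approach comes from. In a cooperative setting, the generation and search stages are the natural candidates for being split between CPU cores and the GPU, though how the paper partitions them is not stated in the abstract.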