We present LINQits, a flexible hardware template that can be mapped onto programmable logic or ASICs in a heterogeneous system-on-chip for a mobile device or server. Unlike fixed-function accelerators, LINQits accelerates a domain-specific query language called LINQ. LINQits does not cover all possible applications; however, existing applications written or rewritten with LINQ in mind benefit extensively from hardware acceleration. Furthermore, the LINQits framework offers a graceful and transparent migration path from software to hardware. LINQits is prototyped on a 2 W heterogeneous SoC called the ZYNQ processor, which combines dual ARM A9 processors with an FPGA on a single die in 28 nm silicon technology. Our physical measurements show that LINQits improves energy efficiency by 8.9 to 30.6 times and performance by 10.7 to 38.1 times compared to optimized, multithreaded C programs running on conventional ARM A9 processors.
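For readers unfamiliar with LINQ, the following minimal Python sketch illustrates the style of declarative, composable query operators that such a language exposes. It is an illustration only: LINQ itself is a .NET facility, and the helper names `where`, `select`, `group_by`, and `run_query`, as well as the sensor-reading data, are hypothetical and chosen for this example. The sketch says nothing about how LINQits maps such operators to hardware.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")
U = TypeVar("U")
K = TypeVar("K")


def where(items: Iterable[T], pred: Callable[[T], bool]) -> Iterator[T]:
    """Lazily keep elements that satisfy pred (in the spirit of a Where operator)."""
    return (x for x in items if pred(x))


def select(items: Iterable[T], fn: Callable[[T], U]) -> Iterator[U]:
    """Lazily transform each element (in the spirit of a Select operator)."""
    return (fn(x) for x in items)


def group_by(items: Iterable[T], key: Callable[[T], K]) -> Dict[K, List[T]]:
    """Partition elements into lists keyed by key(x); this step consumes the input."""
    groups: Dict[K, List[T]] = defaultdict(list)
    for x in items:
        groups[key(x)].append(x)
    return dict(groups)


Reading = Tuple[str, float]  # hypothetical (sensor_id, temperature) record


def run_query(readings: Iterable[Reading]) -> Dict[str, List[Tuple[str, int]]]:
    """Filter, project, then group: a small pipeline of composed operators."""
    hot = where(readings, lambda r: r[1] > 20.0)           # filter
    rounded = select(hot, lambda r: (r[0], round(r[1])))   # project
    return group_by(rounded, lambda r: r[0])               # group by sensor id
```

Because the query is expressed as a composition of a few generic operators rather than as hand-written loops, the same program text could in principle be executed by different back ends, which is the property the abstract relies on when it describes a migration path from software to hardware.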