On the translation of relational queries into iterative programs
ACM Transactions on Database Systems (TODS)
Parallel database systems: the future of high performance database systems
Communications of the ACM
FPCA '93 Proceedings of the conference on Functional programming languages and computer architecture
Reducing indirect function call overhead in C++ programs
POPL '94 Proceedings of the 21st ACM SIGPLAN-SIGACT symposium on Principles of programming languages
An implementation technique for database query languages
ACM Transactions on Database Systems (TODS)
A study of devirtualization techniques for a Java Just-In-Time compiler
OOPSLA '00 Proceedings of the 15th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
A history and evaluation of System R
Communications of the ACM
A relational model of data for large shared data banks
Communications of the ACM
Shortcut fusion for accumulating parameters & zip-like functions
Proceedings of the seventh ACM SIGPLAN international conference on Functional programming
Implementation techniques for main memory database systems
SIGMOD '84 Proceedings of the 1984 ACM SIGMOD international conference on Management of data
Deforestation: Transforming Programs to Eliminate Trees
ESOP '88 Proceedings of the 2nd European Symposium on Programming
Translating Aggregate Queries into Iterative Programs
VLDB '86 Proceedings of the 12th International Conference on Very Large Data Bases
Optimization of Object-Oriented Programs Using Static Class Hierarchy Analysis
ECOOP '95 Proceedings of the 9th European Conference on Object-Oriented Programming
Efficient evaluation of XQuery over streaming data
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Accelerator: using data parallelism to program GPUs for general-purpose uses
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Dryad: distributed data-parallel programs from sequential building blocks
Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Stream fusion: from lists to streams to nothing at all
ICFP '07 Proceedings of the 12th ACM SIGPLAN international conference on Functional programming
Lost in translation: formalizing proposed extensions to c#
Proceedings of the 22nd annual ACM SIGPLAN conference on Object-oriented programming systems and applications
Confessions of a used programming language salesman
Proceedings of the 22nd annual ACM SIGPLAN conference on Object-oriented programming systems and applications
The BEA/XQRL streaming XQuery processor
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Pig latin: a not-so-foreign language for data processing
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Automated Black Box Testing Tool for a Parallel Programming Library
ICST '09 Proceedings of the 2009 International Conference on Software Testing Verification and Validation
Analysis of imperative XML programs
Information Systems
Distributed aggregation for data-parallel computing: interfaces and implementations
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
HadoopToSQL: a mapReduce query optimizer
Proceedings of the 5th European conference on Computer systems
FlumeJava: easy, efficient data-parallel pipelines
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Manimal: relational optimization for data-intensive programs
Procceedings of the 13th International Workshop on the Web and Databases
Nectar: automatic management of data and computation in datacenters
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Optimizing data shuffling in data-parallel computation by understanding user-defined functions
NSDI'12 Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation
Putting a "big-data" platform to good use: training kinect
Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing
Spotting code optimizations in data-parallel pipelines through PeriSCOPE
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
Just-in-time compilation for SQL query processing
Proceedings of the VLDB Endowment
Hi-index | 0.00 |
Declarative queries enable programmers to write data manipulation code without being aware of the underlying data structure implementation. By increasing the level of abstraction over imperative code, they improve program readability and, crucially, create opportunities for automatic parallelization and optimization. For example, the Language Integrated Query (LINQ) extensions to C# allow the same declarative query to process in-memory collections, and datasets that are distributed across a compute cluster. However, our experiments show that the serial performance of declarative code is several times slower than the equivalent hand-optimized code, because it is implemented using run-time abstractions---such as iterators---that incur overhead due to virtual function calls and superfluous instructions. To address this problem, we have developed Steno, which uses a combination of novel and well-known techniques to generate code for declarative queries that is almost as efficient as hand-optimized code. Steno translates a declarative LINQ query into type-specialized, inlined and loop-based imperative code. It eliminates chains of iterators from query execution, and optimizes nested queries. We have implemented Steno for uniprocessor, multiprocessor and distributed computing platforms, and show that, for a real-world distributed job, it can almost double the speed of end-to-end execution.