Extend UDF Technology for Integrated Analytics
DaWaK '09 Proceedings of the 11th International Conference on Data Warehousing and Knowledge Discovery
Running analytics computation inside a database engine through UDFs (User Defined Functions) has been investigated, but it has not yet become a scalable approach because of several technical limitations. One limitation is that UDFs lack the generality needed to express complex applications and to be composed with relational operators in SQL queries. Another is the lack of systematic support for a UDF to cache relations in an initial call so that later calls can compute efficiently. Further, making UDF execution interact efficiently with query processing requires detailed system programming, which is beyond the expertise of most application developers. To solve these problems, we extend UDF technology in both the semantic and the system dimensions. We generalize UDFs to support scalar, tuple, and relation input and output, allow UDFs to be defined on the entire content of relations, and allow moderate-sized input relations to be cached initially to avoid repeated retrieval. With this extension, generalized UDFs can be composed with other relational operators and thus integrated into queries naturally. Furthermore, based on the notion of invocation patterns, we provide focused system support for efficient interaction between UDF execution and query processing. We have used the open-source PostgreSQL engine and a commercial, proprietary parallel database engine as our prototyping vehicles, and we illustrate the performance, modeling power, and usability of the proposed approach with experimental results on both platforms.
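The idea of a UDF that takes a tuple per call while holding a relation cached from its first call can be illustrated with a small sketch. This is plain Python rather than the PostgreSQL UDF interface or the authors' framework; the class `CachedRelationUDF`, the helper `apply_udf`, the nearest-center task, and the sample data are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

Point = Tuple[int, float, float]    # (point_id, x, y)  -- hypothetical schema
Center = Tuple[int, float, float]   # (center_id, x, y) -- hypothetical schema


class CachedRelationUDF:
    """Hypothetical generalized UDF: tuple input per call, relation input cached once."""

    def __init__(self, load_relation: Callable[[], Iterable[Center]]) -> None:
        self._load_relation = load_relation
        self._centers: Optional[List[Center]] = None
        self.loads = 0  # counts how often the relation was actually retrieved

    def __call__(self, point: Point) -> Tuple[int, int]:
        # Initial call: retrieve the moderate-sized relation and keep it in memory.
        if self._centers is None:
            self._centers = list(self._load_relation())
            self.loads += 1
        # Subsequent calls reuse the cached relation instead of re-reading it.
        pid, x, y = point
        nearest = min(
            self._centers,
            key=lambda c: (c[1] - x) ** 2 + (c[2] - y) ** 2,
        )
        return (pid, nearest[0])


def apply_udf(rows: Iterable[Point], udf: CachedRelationUDF) -> Iterator[Tuple[int, int]]:
    """Invoke the UDF once per input tuple, like a per-tuple operator in a query plan."""
    for row in rows:
        yield udf(row)


# Illustrative data only; not taken from the paper.
points = [(1, 0.0, 0.0), (2, 9.0, 9.0), (3, 1.0, 1.0)]
centers = [(10, 0.0, 0.0), (20, 10.0, 10.0)]

udf = CachedRelationUDF(lambda: centers)
assignments = list(apply_udf(points, udf))

# Compose the UDF output with a downstream "relational" step: group by center, count.
counts = Counter(cid for _, cid in assignments)
```

Because the UDF returns ordinary tuples, its output can feed further operators (here a group-and-count), which mirrors the composability the abstract describes. The `loads` counter stays at one however many tuples are processed, which is the point of caching the relation in the initial call.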