Designing and mining multi-terabyte astronomy archives: the Sloan Digital Sky Survey
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Optimization of Queries with User-defined Predicates
VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning)
Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning)
Adaptive execution of variable-accuracy functions
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Efficient query evaluation on probabilistic databases
The VLDB Journal — The International Journal on Very Large Data Bases
Model-driven data acquisition in sensor networks
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Exploiting shared correlations in probabilistic databases
Proceedings of the VLDB Endowment
Online Sparse Matrix Gaussian Process Regression and Vision Applications
ECCV '08 Proceedings of the 10th European Conference on Computer Vision: Part I
Fast and Simple Relational Processing of Uncertain Data
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Database Support for Probabilistic Attributes and Tuples
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Probabilistic Inference over RFID Streams in Mobile Environments
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
AINTEC'06 Proceedings of the Second Asian international conference on Technologies for Advanced Heterogeneous Networks
CLARO: modeling and processing uncertain data streams
The VLDB Journal — The International Journal on Very Large Data Bases
Hi-index | 0.00 |
Uncertain data management has become crucial in many sensing and scientific applications. As user-defined functions (UDFs) become widely used in these applications, an important task is to capture result uncertainty for queries that evaluate UDFs on uncertain data. In this work, we provide a general framework for supporting UDFs on uncertain data. Specifically, we propose a learning approach based on Gaussian processes (GPs) to compute approximate output distributions of a UDF when evaluated on uncertain input, with guaranteed error bounds. We also devise an online algorithm to compute such output distributions, which employs a suite of optimizations to improve accuracy and performance. Our evaluation using both real-world and synthetic functions shows that our proposed GP approach can outperform the state-of-the-art sampling approach with up to two orders of magnitude improvement for a variety of UDFs.