We present an approach to declaratively manage run-time errors in data-intensive applications. When large volumes of raw data meet complex third-party libraries, deterministic run-time errors become likely, and existing query processors typically abort without returning a result when one occurs. The ability to degrade gracefully in the presence of run-time errors and to execute jobs partially is typically limited to specific operators, such as bulk loading. We generalize this concept to all operators of a query processing system by introducing a novel data type, "partial result with errors", and corresponding operators. We show how to extend existing error-unaware operators to support this type and, as an added benefit, eliminate side-effect-based error reporting. Declarative specifications of acceptable results control the semantics of the error-aware operators. We have incorporated our approach into a declarative query processing system, which compiles the language constructs into instrumented execution plans for clusters of machines. Our experiments show that the instrumentation overhead is below 20% in microbenchmarks and not detectable when running I/O-intensive workloads.
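The idea of a "partial result with errors" can be illustrated with a small sketch. This is not the system described above, only a minimal Python approximation under stated assumptions: the names `PartialResult`, `error_aware_map`, and `acceptable` are hypothetical, and the acceptance rule (a maximum fraction of failed records) is just one plausible example of a declarative specification of acceptable results.

```python
from dataclasses import dataclass, field
from typing import Callable, Generic, List, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")


@dataclass
class PartialResult(Generic[A]):
    """Hypothetical 'partial result with errors': good records plus failures."""
    values: List[A] = field(default_factory=list)
    # Each error keeps the offending input record and a short description.
    errors: List[Tuple[object, str]] = field(default_factory=list)


def error_aware_map(fn: Callable[[A], B], records: PartialResult[A]) -> PartialResult[B]:
    """Lift an error-unaware per-record function into an error-aware operator.

    Failures are returned as data instead of aborting the job or being
    reported through a side effect such as a log file.
    """
    out: PartialResult[B] = PartialResult(values=[], errors=list(records.errors))
    for rec in records.values:
        try:
            out.values.append(fn(rec))
        except Exception as exc:  # assumption: any exception marks the record as failed
            out.errors.append((rec, f"{type(exc).__name__}: {exc}"))
    return out


def acceptable(result: PartialResult, max_error_fraction: float) -> bool:
    """Assumed acceptance rule: tolerate at most a given fraction of failed records."""
    total = len(result.values) + len(result.errors)
    return total == 0 or len(result.errors) / total <= max_error_fraction


# Usage: parsing raw strings, one of which is malformed.
raw = PartialResult(values=["1", "2", "x", "4"])
parsed = error_aware_map(int, raw)
print(parsed.values)             # the successfully parsed records
print(len(parsed.errors))        # the number of failed records
print(acceptable(parsed, 0.25))  # whether the partial result meets the threshold
```

In this sketch the operator always runs to completion, and the caller decides afterwards whether the partial result is good enough. A real query processor would more likely evaluate such a specification during execution, so that a job can stop early once the threshold can no longer be met.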