Language support for processing distributed ad hoc data

Authors:
Kenny Q. Zhu;Daniel S. Dantas;Kathleen Fisher;Limin Jia;Yitzhak Mandelbaum;Vivek Pai;David Walker
Affiliations:
Shanghai Jiao Tong University, Shanghai, China;Princeton University, Princeton, NJ, USA;AT&T Labs Research, Florham Park, NJ, USA;University of Pennsylvania, Philadelphia, PA, USA;AT&T Labs Research, Florham Park, NJ, USA;Princeton University, Princeton, NJ, USA;Princeton University, Princeton, NJ, USA
Venue:
PPDP '09 Proceedings of the 11th ACM SIGPLAN conference on Principles and practice of declarative programming
Year:
2009

Abstract

This paper presents the design, theory and implementation of Gloves, a domain-specific language that allows users to specify the provenance (the derivation history starting from the origins), syntax and semantic properties of collections of distributed data sources. In particular, Gloves specifications indicate where to locate desired data, how to obtain it, when to get it or to give up trying, and what format it will be in on arrival. The Gloves system compiles such specifications into a suite of data-processing tools including an archiver, a provenance tracking system, a database loading tool, an alert system, an RSS feed generator and a debugging tool. In addition, the system generates description-specific libraries so that developers can create their own applications. Gloves also provides a generic infrastructure so that advanced users can build new tools applicable to any data source with a Gloves description. We show how Gloves may be used to specify data sources from two domains: CoMon, a monitoring system for PlanetLab's 800+ nodes, and Arrakis, a monitoring system for an AT&T web hosting service. We show experimentally that our system can scale to distributed systems the size of CoMon. Finally, we provide a denotational semantics for Gloves and use this semantics to prove two important theorems. The first shows that our denotational semantics respects the typing rules for the language, while the second demonstrates that our system correctly maintains the provenance.
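
The four things a specification is said to indicate (where to locate data, how to obtain it, when to get it or give up, and what format it arrives in) can be pictured with a small sketch. This is plain Python, not Gloves syntax, and it is not taken from the paper: the names `FeedSpec`, `Item` and `run`, and the idea of passing the timeout to the fetch function, are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class FeedSpec:
    """Hypothetical stand-in for one data-source description (not Gloves syntax)."""
    locations: List[str]                            # where to locate the data
    fetch: Callable[[str, float], Optional[str]]    # how to obtain it; None means no answer
    schedule: List[float]                           # when to get it (seconds from start)
    timeout: float                                  # when to give up trying (seconds)
    parse: Callable[[str], object]                  # what format it is in on arrival


@dataclass
class Item:
    """A parsed payload paired with simple provenance: source and scheduled time."""
    location: str
    scheduled_at: float
    payload: Optional[object]                       # None records a failed or abandoned fetch


def run(spec: FeedSpec) -> List[Item]:
    """Visit every location at every scheduled time, keeping provenance for each result."""
    items: List[Item] = []
    for t in spec.schedule:
        for loc in spec.locations:
            raw = spec.fetch(loc, spec.timeout)
            payload = spec.parse(raw) if raw is not None else None
            items.append(Item(location=loc, scheduled_at=t, payload=payload))
    return items
```

In this sketch a failed fetch still produces an `Item`, so a downstream tool could tell "no data arrived from this node at this time" apart from "this node was never asked". Whether the actual system records failures this way is not stated in the abstract.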