Packet types: abstract specification of network protocol messages
Proceedings of the conference on Applications, Technologies, Architectures, and Protocols for Computer Communication
DataScript - A Specification and Scripting Language for Binary Data
GPCE '02 Proceedings of the 1st ACM SIGPLAN/SIGSOFT conference on Generative Programming and Component Engineering
PADS: a domain-specific language for processing ad hoc data
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
The next 700 data description languages
Conference record of the 33rd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
PADS/ML: a functional data description language
Proceedings of the 34th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Traditionally, types describe the internal data manipulated by programs. To accommodate the variety of desired data structures, language designers and type theorists have developed a wide variety of types and type constructors. But not all useful data is in programs; in fact, enormous amounts of it sit on disks or stream by on wires in a dizzying array of encodings and formats. It turns out that many of the types developed for internal data can be used to describe external data: tuples, records, unions, options, and lists come to mind as obvious examples. Perhaps more surprisingly, recursive types, singletons, functions, parametric polymorphism, and dependent types are relevant as well. Using types to describe external data leads naturally to the insight that we can reuse the same type to define an internal data structure and to generate parsing and printing functions to map between the two representations. The PADS project [1] has exploited this idea, building data description languages based on the type structure of C (PADS/C [3]) and of ML (PADS/ML [5]), and exploring the theoretical basis for such languages with the Data Description Calculus (DDC) [4]. Other groups have also leveraged this insight, most closely in the work on DataScript [2] and PacketTypes [6]. Continuing the analogy, it turns out that other concepts from the types world are also relevant to ad hoc data processing, including generic programming, type inference, type isomorphisms, and subtyping.

In this talk, I will describe the domain of ad hoc data processing and explain how types enable precise descriptions of such data. I will then explore the question of type inference, describing quantitative techniques we are currently developing to construct a description of ad hoc data given example instances.
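To make the central idea concrete, here is a minimal sketch in Python of how a single description can yield both a parser and a printer. It is not PADS, DataScript, or PacketTypes syntax: the names `Desc`, `literal`, `integer`, `sep_list`, and `record`, and the sample format `42:7,8,9`, are all hypothetical and chosen only for illustration. A literal plays the role of a singleton type, `record` the role of a record type, and `sep_list` the role of a list type.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple


@dataclass(frozen=True)
class Desc:
    """Hypothetical description: one value that carries a parser and a printer."""
    parse: Callable[[str, int], Tuple[Any, int]]  # (text, pos) -> (value, new pos)
    show: Callable[[Any], str]                    # value -> external text


def literal(s: str) -> Desc:
    """Singleton-like description: matches exactly s and carries no data."""
    def parse(text: str, pos: int) -> Tuple[None, int]:
        if not text.startswith(s, pos):
            raise ValueError(f"expected {s!r} at offset {pos}")
        return None, pos + len(s)
    return Desc(parse, lambda _value: s)


def integer() -> Desc:
    """Base description: a non-empty run of ASCII digits."""
    def parse(text: str, pos: int) -> Tuple[int, int]:
        end = pos
        while end < len(text) and text[end] in "0123456789":
            end += 1
        if end == pos:
            raise ValueError(f"expected a digit at offset {pos}")
        return int(text[pos:end]), end
    return Desc(parse, str)


def sep_list(elem: Desc, sep: str) -> Desc:
    """List description: one or more elements separated by sep."""
    def parse(text: str, pos: int) -> Tuple[list, int]:
        value, pos = elem.parse(text, pos)
        items = [value]
        while text.startswith(sep, pos):
            value, pos = elem.parse(text, pos + len(sep))
            items.append(value)
        return items, pos
    return Desc(parse, lambda items: sep.join(elem.show(v) for v in items))


def record(*fields: Tuple[Optional[str], Desc]) -> Desc:
    """Record description: named fields in order; a name of None marks
    syntax (such as a literal) that is not stored in the internal value."""
    def parse(text: str, pos: int) -> Tuple[dict, int]:
        out = {}
        for name, desc in fields:
            value, pos = desc.parse(text, pos)
            if name is not None:
                out[name] = value
        return out, pos

    def show(value: dict) -> str:
        return "".join(
            desc.show(None if name is None else value[name])
            for name, desc in fields
        )
    return Desc(parse, show)


# One description of the made-up format "<id>:<score>,<score>,..."
entry = record(
    ("id", integer()),
    (None, literal(":")),
    ("scores", sep_list(integer(), ",")),
)

parsed, end = entry.parse("42:7,8,9", 0)
printed = entry.show(parsed)
```

Here `parsed` is an ordinary dictionary with an `id` and a list of `scores`, and `printed` reproduces the input text, so the same `entry` description maps in both directions. A real data description language would add error recovery, unions, and dependent fields, none of which this sketch attempts.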