Architectural considerations for a new generation of protocols
SIGCOMM '90 Proceedings of the ACM symposium on Communications architectures & protocols
Active messages: a mechanism for integrated communication and computation
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Improving IPC by kernel design
SOSP '93 Proceedings of the fourteenth ACM symposium on Operating systems principles
Parallel programming in Split-C
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
The Concert signature representation: IDL as intermediate language
IDL '94 Proceedings of the workshop on Interface definition languages
Fortran M: a language for modular parallel programming
Journal of Parallel and Distributed Computing
Implementing global memory management in a workstation cluster
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
Exokernel: an operating system architecture for application-level resource management
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
Using annotated interface definitions to optimize RPC
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
Microkernels meet recursive virtual machines
OSDI '96 Proceedings of the second USENIX symposium on Operating systems design and implementation
An implementation of the Hamlyn sender-managed interface architecture
OSDI '96 Proceedings of the second USENIX symposium on Operating systems design and implementation
Measuring the performance of communication middleware on high-speed networks
Conference proceedings on Applications, technologies, architectures, and protocols for computer communications
Flick: a flexible, optimizing IDL compiler
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Implementing remote procedure calls
ACM Transactions on Computer Systems (TOCS)
Khazana: An Infrastructure for Building Distributed Services
ICDCS '98 Proceedings of the The 18th International Conference on Distributed Computing Systems
CC++: A Declarative Concurrent Object Oriented Programming Notation
CC++: A Declarative Concurrent Object Oriented Programming Notation
Object-oriented components for high-speed network programming
COOTS'95 Proceedings of the USENIX Conference on Object-Oriented Technologies on USENIX Conference on Object-Oriented Technologies (COOTS)
Cheating the I/O bottleneck: network storage with Trapeze/Myrinet
ATEC '98 Proceedings of the annual conference on USENIX Annual Technical Conference
Hi-index | 0.00 |
The author of a distributed system is often faced with a dilemma when writing the system's communication code. If the code is written by hand (e.g., using Active Messages) or partly by hand (e.g., using mpi) then the speed of the application may be maximized, but the human effort required to implement and maintain the system is greatly increased. On the other hand, if the code is generated using a high-level tool (e.g., a corba idl compiler) then programmer effort will be reduced, but the performance of the application may be intolerably poor. The tradeoff between system performance and development effort arises because existing communication middleware is inefficient, imposes excessive presentation layer overhead, and therefore fails to expose much of the underlying network performance to application code. Moreover, there is often a mismatch between the desired communication style of the application (e.g., asynchronous message passing) and the communication style of the code produced by an idl compiler (synchronous remote procedure call). We believe that this need not be the case, but that established optimizing compiler technology can be applied and extended to attack these domain-specific problems. We have implemented Flick, a flexible and optimizing idl compiler, and are using it to explore techniques for producing high-performance code for distributed and parallel applications. Flick produces optimized code for marshaling and unmarshaling data; experiments show that Flick-generated stubs can marshal data between 2 and 17 times as fast as stubs produced by other idl compilers. Further, because Flick is implemented as a "kit" of components, it is possible to extend the compiler to produce stylized code for many different application interfaces and underlying transport layers. In this paper we outline a novel approach for producing "decomposed" stubs for a distributed global memory service.