File systems are the backbone of large-scale data processing for scientific applications. Motivated by the need for an extensible and flexible framework for managing and analyzing large-scale data, one that goes beyond the abstractions offered by file-oriented API libraries, we are developing Damasc, an enhanced file system in which rich data management services for scientific computing are a native part of the file system. This paper presents our vision for Damasc, a high-performance file system that would allow scientists, or even casual users, to pose declarative queries and updates over views of underlying files that remain stored in their native bytestream format. Damasc adds a configurable layer on top of the file system that exposes the contents of files through a logical data model, over which views can be defined and used for queries and updates. The logical data model and views are also used to optimize file access through caching and self-organizing indexing. In addition, capture and analysis of provenance for file accesses are built into Damasc. We describe the salient features of our proposal and discuss how it can benefit the development of scientific code.
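To make the idea of a view over a native bytestream concrete, the following is a minimal sketch and not Damasc's actual interface. The record layout, the `Reading` type, and the `view` and `select` helpers are all hypothetical names chosen for illustration; the sketch only shows how a thin layer could expose raw bytes as logical records and answer a declarative-style selection without converting the file.

```python
import struct
from typing import Callable, Iterator, NamedTuple


class Reading(NamedTuple):
    """Hypothetical logical record exposed by the view."""
    station: int
    temperature: float


# Assumed on-disk layout (illustration only): little-endian int32 station id
# followed by a float64 temperature, with no padding between fields.
RECORD = struct.Struct("<id")


def view(raw: bytes) -> Iterator[Reading]:
    """Expose a native bytestream as logical records, decoding lazily."""
    for station, temperature in RECORD.iter_unpack(raw):
        yield Reading(station, temperature)


def select(raw: bytes, predicate: Callable[[Reading], bool]) -> list[Reading]:
    """Return the records of the view that satisfy the predicate."""
    return [record for record in view(raw) if predicate(record)]


# Build a small in-memory "file" in the assumed format and query it.
raw = b"".join(RECORD.pack(s, t) for s, t in [(1, 12.5), (2, 30.0), (3, 31.25)])
hot = select(raw, lambda r: r.temperature > 25.0)
print([r.station for r in hot])
```

In this sketch the stored bytes are never rewritten; the logical model exists only in the decoding layer, which is the property the abstract attributes to views over files kept in their native format.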