Modern High-End Computing systems frequently include FPGAs as compute accelerators. These programmable logic devices now support disk controller IP cores, which make it possible to introduce functionality that was previously impractical. This article describes one such innovation: a filesystem implemented in hardware. Connecting secondary storage directly to FPGA compute accelerators has the potential to improve the performance of data-intensive applications. To test the feasibility of this idea, a Hardware Filesystem was designed with four basic operations (open, read, write, and delete). Multi-disk and RAID-0 (striping) support has also been implemented as an option in the filesystem. A RAM Disk core was created to emulate a SATA disk drive so that results on running FPGA systems could be readily measured. When the block size was varied from 64 to 4096 bytes, 1024 bytes gave the best performance while using a very modest 7% of a Xilinx XC4VFX60's slices and only four of the 232 available BRAM blocks.
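As a rough illustration of the RAID-0 (striping) option mentioned above, the sketch below shows the generic round-robin mapping from a logical block to a disk and a block on that disk. It is not the article's hardware design: the function names `raid0_locate` and `raid0_locate_byte` are hypothetical, and the stripe unit is assumed to be exactly one filesystem block.

```python
# Generic RAID-0 address mapping (software sketch, not the paper's hardware core).
# Assumption: the stripe unit equals one filesystem block, and blocks are
# distributed round-robin across the disks.

def raid0_locate(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Return (disk index, block index on that disk) for a logical block."""
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    # Consecutive logical blocks land on consecutive disks.
    return logical_block % num_disks, logical_block // num_disks


def raid0_locate_byte(offset: int, block_size: int, num_disks: int) -> tuple[int, int, int]:
    """Return (disk index, block index on that disk, byte offset within the block)."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    logical_block, within_block = divmod(offset, block_size)
    disk, block = raid0_locate(logical_block, num_disks)
    return disk, block, within_block


if __name__ == "__main__":
    # Two disks, 1024-byte blocks (the block size the abstract reports as best).
    for lb in range(6):
        print(lb, raid0_locate(lb, num_disks=2))
    print(raid0_locate_byte(2500, block_size=1024, num_disks=2))
```

With two disks, even-numbered logical blocks fall on disk 0 and odd-numbered ones on disk 1, so a sequential read can keep both disks busy at once. With a single disk the mapping reduces to the identity, which is consistent with striping being an optional feature.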