Communication Performance Characterization for Reconfigurable Accelerator Design on the XD1000
RECONFIG '09 Proceedings of the 2009 International Conference on Reconfigurable Computing and FPGAs
Reconfigurable computers usually provide a limited number of different memory resources, such as host memory, external memory, and on-chip memory, with different capacities and communication characteristics. A key challenge in achieving high performance with reconfigurable accelerators is the efficient utilization of the available memory resources. Detailed knowledge of the memories' parameters is essential for generating an optimized communication layout. In this paper, we discuss a benchmarking environment for generating such a characterization. The environment is built on IMORC, our architectural template and on-chip network for creating reconfigurable accelerators. We provide a characterization of the memory resources available on the XtremeData XD1000 reconfigurable computer. Based on this data, we present as a case study the implementation of a 3D image compositing accelerator that is able to double the frame rate of a parallel renderer.
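To illustrate how a memory characterization might feed into a communication layout, the sketch below fits a simple linear cost model, transfer time = latency + size / bandwidth, to benchmark samples. This model is a common first-order approximation and an assumption made here, not necessarily the one used in the paper. The function names and all numbers are hypothetical and do not reproduce XD1000 measurements.

```python
def fit_latency_bandwidth(sizes, times):
    """Least-squares fit of t = latency + size / bandwidth.

    sizes: transfer sizes in bytes; times: measured durations in seconds.
    Returns (latency_seconds, bandwidth_bytes_per_second).
    """
    n = len(sizes)
    mean_s = sum(sizes) / n
    mean_t = sum(times) / n
    sxx = sum((s - mean_s) ** 2 for s in sizes)
    sxy = sum((s - mean_s) * (t - mean_t) for s, t in zip(sizes, times))
    slope = sxy / sxx                 # seconds per byte
    latency = mean_t - slope * mean_s
    return latency, 1.0 / slope


def predicted_time(size, latency, bandwidth):
    """Estimated duration of one transfer of `size` bytes."""
    return latency + size / bandwidth


# Synthetic samples for a made-up memory: 2 us latency, 500 MB/s bandwidth.
sizes = [1024, 4096, 16384, 65536]
times = [2e-6 + s / 5e8 for s in sizes]
latency, bandwidth = fit_latency_bandwidth(sizes, times)
```

With one such (latency, bandwidth) pair per memory resource, `predicted_time` can be used to compare candidate placements of a buffer before committing to a layout.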