Lithographic aerial image simulation with FPGA-based hardware acceleration

Authors:
Jason Cong;Yi Zou
Affiliations:
University of California, Los Angeles, Los Angeles, CA;University of California, Los Angeles, Los Angeles, CA
Venue:
Proceedings of the 16th international ACM/SIGDA symposium on Field programmable gate arrays
Year:
2008

Citing 1
Cited 9

Fast optical and process proximity correction algorithms for integrated circuit manufacturing

Synthesis of reconfigurable high-performance multicore systems

Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
MC-Sim: an efficient simulation tool for MPSoC designs

Proceedings of the 2008 IEEE/ACM International Conference on Computer-Aided Design
FPGA-Based Hardware Acceleration of Lithographic Aerial Image Simulation

ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Automatic memory partitioning and scheduling for throughput and power optimization

Proceedings of the 2009 International Conference on Computer-Aided Design
Optical lithography simulation using wavelet transform

ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
From OO to FPGA: fitting round objects into square hardware?

Proceedings of the ACM international conference on Object oriented programming systems languages and applications
Automatic memory partitioning and scheduling for throughput and power optimization

ACM Transactions on Design Automation of Electronic Systems (TODAES)
High-performance code generation for stencil computations on GPU architectures

Proceedings of the 26th ACM international conference on Supercomputing
Efficient compilation of CUDA kernels for high-performance computing on FPGAs

ACM Transactions on Embedded Computing Systems (TECS) - Special issue on application-specific processors

Abstract

Lithography simulation, an essential step in design for manufacturability (DFM), is still far from computationally efficient. Most leading companies use large clusters of server computers to achieve acceptable turn-around time. Co-processor acceleration is therefore very attractive for obtaining increased computational performance with reduced power consumption. This paper describes an implementation of a customized accelerator on an FPGA using a polygon-based simulation model. An application-specific memory partitioning scheme is designed to meet the bandwidth requirements of a large number of processing elements. Deep loop pipelining and ping-pong-buffer-based function block pipelining are also implemented in our design. Initial results show a 15X speedup over the software implementation running on a microprocessor, and further speedup is expected with additional performance tuning. The implementation also leverages state-of-the-art C-to-RTL synthesis tools. At the same time, we identify the need for manual architecture-level exploration for parallel implementations.
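
The abstract mentions ping-pong buffer based function block pipelining without giving details. As a rough illustration of the general idea only (not the authors' design), the C sketch below alternates between two buffers so that a load stage fills one while a compute stage consumes the other. The names `TILE`, `load_tile`, `compute_tile`, and `pingpong_sum` are hypothetical, and a plain sum stands in for the actual simulation kernel. In sequential C the two stages simply run one after the other; a C-to-RTL flow or hand-written hardware could overlap them because they touch different buffers in each iteration.

```c
#include <stddef.h>

#define TILE 4  /* hypothetical tile size, chosen only for illustration */

/* Stage 1: copy one tile of the input into a local buffer.
 * The last tile is zero-padded if the input length is not a multiple of TILE. */
static void load_tile(const int *src, size_t n, size_t tile_idx, int buf[TILE])
{
    for (size_t i = 0; i < TILE; ++i) {
        size_t j = tile_idx * TILE + i;
        buf[i] = (j < n) ? src[j] : 0;
    }
}

/* Stage 2: process one tile. A plain sum is a stand-in for the real kernel. */
static long compute_tile(const int buf[TILE])
{
    long acc = 0;
    for (size_t i = 0; i < TILE; ++i)
        acc += buf[i];
    return acc;
}

/* Ping-pong schedule: while tile t is computed from buf[t % 2],
 * tile t + 1 is loaded into the other buffer, buf[(t + 1) % 2]. */
long pingpong_sum(const int *src, size_t n)
{
    int buf[2][TILE];
    size_t num_tiles = (n + TILE - 1) / TILE;
    long total = 0;

    if (num_tiles == 0)
        return 0;

    load_tile(src, n, 0, buf[0]);  /* prologue: fill the first buffer */
    for (size_t t = 0; t < num_tiles; ++t) {
        if (t + 1 < num_tiles)
            load_tile(src, n, t + 1, buf[(t + 1) % 2]);  /* fill the idle buffer */
        total += compute_tile(buf[t % 2]);               /* consume the ready buffer */
    }
    return total;
}
```

Calling `pingpong_sum` on an array of the integers 1 through 10 walks three tiles (the last one zero-padded) and returns their total. The point of the structure is that the load and compute calls inside the loop never access the same buffer, which is what allows the two function blocks to be pipelined.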