Fermi GF100 GPU Architecture

Authors:
Craig M. Wittenbrink; Emmett Kilgariff; Arjun Prabhu
Affiliations:
Nvidia; Nvidia; Nvidia
Venue:
IEEE Micro
Year:
2011

Abstract

The Fermi GF100 is a GPU architecture that provides several new capabilities beyond those of the Nvidia GT200 (Tesla) architecture. It offers up to 512 CUDA cores and special features for gaming and high-performance computing. This article describes the GPU's new capabilities for tessellation, physics processing, and computational graphics.