Nvidia Ampere Microarchitecture
An Nvidia Ampere Microarchitecture is a GPU Architecture for an Nvidia GPU family.
- Context:
- See: Hopper (Microarchitecture), Nvidia, TSMC, Samsung Electronics, 7 nm Process, 10 nm Process, GeForce 30 Series, DirectX#DirectX 12 Ultimate, Direct3D#Direct3D 12, High-Level Shader Language, OpenCL#OpenCL 3.0, OpenGL#OpenGL 4.6.
References
2023
- (Wikipedia, 2023) ⇒ https://en.wikipedia.org/wiki/Ampere_(microarchitecture) Retrieved:2023-5-8.
- Ampere is the codename for a graphics processing unit (GPU) microarchitecture developed by Nvidia as the successor to both the Volta and Turing architectures. It was officially announced on May 14, 2020 and is named after French mathematician and physicist André-Marie Ampère. Nvidia announced the Ampere architecture GeForce 30 series consumer GPUs at a GeForce Special Event on September 1, 2020. Nvidia announced the A100 80GB GPU at SC20 on November 16, 2020. Mobile RTX graphics cards and the RTX 3060 based on the Ampere architecture were revealed on January 12, 2021.
Nvidia announced Ampere's successor, Hopper, at the GPU Technology Conference (GTC) 2022. At GTC 2021 it had already announced "Ampere Next Next" for a 2024 release.
2023
- (Wikipedia, 2023) ⇒ https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Chips Retrieved:2023-5-8.
Chips
- GA100[1]
- GA102
- GA103
- GA104
- GA106
- GA107
Comparison of Compute Capability: GP100 vs GV100 vs GA100[2]
GPU features | NVIDIA Tesla P100 | NVIDIA Tesla V100 | NVIDIA A100 |
---|---|---|---|
GPU codename | GP100 | GV100 | GA100 |
GPU architecture | NVIDIA Pascal | NVIDIA Volta | NVIDIA Ampere |
Compute capability | 6.0 | 7.0 | 8.0 |
Threads / warp | 32 | 32 | 32 |
Max warps / SM | 64 | 64 | 64 |
Max threads / SM | 2048 | 2048 | 2048 |
Max thread blocks / SM | 32 | 32 | 32 |
Max 32-bit registers / SM | 65536 | 65536 | 65536 |
Max registers / block | 65536 | 65536 | 65536 |
Max registers / thread | 255 | 255 | 255 |
Max thread block size | 1024 | 1024 | 1024 |
FP32 cores / SM | 64 | 64 | 64 |
Ratio of SM registers to FP32 cores | 1024 | 1024 | 1024 |
Shared Memory Size / SM | 64 KB | Configurable up to 96 KB | Configurable up to 164 KB |
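The per-SM limits in the table above interact: the number of thread blocks resident on one SM is capped by the warp limit, the block limit, and the register file at the same time. The following is a minimal sketch of that arithmetic, not NVIDIA's occupancy calculator. The function name `theoretical_occupancy` is hypothetical, and the sketch assumes no shared-memory limit and no register allocation granularity, both of which affect real hardware.

```python
# Per-SM limits copied from the comparison table above
# (identical for GP100, GV100 and GA100).
THREADS_PER_WARP = 32
MAX_WARPS_PER_SM = 64
MAX_BLOCKS_PER_SM = 32
MAX_REGISTERS_PER_SM = 65536
MAX_REGISTERS_PER_THREAD = 255
MAX_BLOCK_SIZE = 1024


def theoretical_occupancy(block_size: int, registers_per_thread: int) -> float:
    """Fraction of the SM's warp slots that can be resident at once.

    Simplified model (an assumption of this sketch): shared memory and
    register allocation granularity are ignored.
    """
    if not 1 <= block_size <= MAX_BLOCK_SIZE:
        raise ValueError("block_size must be in 1..1024")
    if not 1 <= registers_per_thread <= MAX_REGISTERS_PER_THREAD:
        raise ValueError("registers_per_thread must be in 1..255")

    # A partially filled warp still occupies a whole warp slot.
    warps_per_block = -(-block_size // THREADS_PER_WARP)  # ceiling division

    # Blocks that fit under the warp limit.
    blocks_by_warps = MAX_WARPS_PER_SM // warps_per_block

    # Blocks that fit in the register file.
    registers_per_block = registers_per_thread * warps_per_block * THREADS_PER_WARP
    blocks_by_registers = MAX_REGISTERS_PER_SM // registers_per_block

    resident_blocks = min(MAX_BLOCKS_PER_SM, blocks_by_warps, blocks_by_registers)
    return resident_blocks * warps_per_block / MAX_WARPS_PER_SM


if __name__ == "__main__":
    print(theoretical_occupancy(256, 32))  # register file exactly full
    print(theoretical_occupancy(256, 64))  # registers halve the resident blocks
    print(theoretical_occupancy(32, 32))   # block limit (32) caps the warps
```

Under these assumptions, 256-thread blocks at 32 registers per thread fill all 64 warp slots (occupancy 1.0), while doubling to 64 registers per thread halves that to 0.5. The table's derived rows follow from the same constants: 64 warps × 32 threads = 2048 threads per SM, and 65536 registers / 64 FP32 cores = 1024.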
Comparison of Precision Support Matrix[3][4]
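Columns prefixed "CUDA" list supported CUDA core precisions; columns prefixed "Tensor" list supported Tensor core precisions.
GPU | CUDA FP16 | CUDA FP32 | CUDA FP64 | CUDA INT1 | CUDA INT4 | CUDA INT8 | CUDA TF32 | CUDA BF16 | Tensor FP16 | Tensor FP32 | Tensor FP64 | Tensor INT1 | Tensor INT4 | Tensor INT8 | Tensor TF32 | Tensor BF16 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|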
NVIDIA Tesla P4 | No | Yes | Yes | No | No | Yes | No | No | No | No | No | No | No | No | No | No |
NVIDIA P100 | Yes | Yes | Yes | No | No | No | No | No | No | No | No | No | No | No | No | No |
NVIDIA Volta | Yes | Yes | Yes | No | No | Yes | No | No | Yes | No | No | No | No | No | No | No |
NVIDIA Turing | Yes | Yes | Yes | No | No | Yes | No | No | Yes | No | No | Yes | Yes | Yes | No | No |
NVIDIA A100 | Yes | Yes | Yes | No | No | Yes | No | Yes | Yes | No | Yes | Yes | Yes | Yes | Yes | Yes |
Legend:
- FPnn: floating point with nn bits
- INTn: integer with n bits
- INT1: binary
- TF32: TensorFloat32
- BF16: bfloat16
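To make the two reduced-precision formats in the legend concrete, the sketch below emulates them by dropping mantissa bits from an IEEE 754 FP32 value. It assumes the commonly documented layouts, which are not stated in the text above: BF16 keeps 8 exponent bits and 7 mantissa bits, and TF32 keeps 8 exponent bits and 10 mantissa bits. It also truncates rather than rounds, which is a simplification; actual hardware conversion may round differently. The helper names `to_bf16` and `to_tf32` are hypothetical.

```python
import struct

FP32_MANTISSA_BITS = 23


def _float_to_bits(x: float) -> int:
    """Reinterpret a value (rounded to FP32) as its 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def _bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an FP32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def truncate_mantissa(x: float, kept_bits: int) -> float:
    """Keep sign, exponent and the top `kept_bits` mantissa bits of FP32 x."""
    dropped = FP32_MANTISSA_BITS - kept_bits
    mask = (0xFFFFFFFF << dropped) & 0xFFFFFFFF
    return _bits_to_float(_float_to_bits(x) & mask)


def to_bf16(x: float) -> float:
    # Assumed layout: 1 sign, 8 exponent, 7 mantissa bits.
    return truncate_mantissa(x, 7)


def to_tf32(x: float) -> float:
    # Assumed layout: 1 sign, 8 exponent, 10 mantissa bits.
    return truncate_mantissa(x, 10)


if __name__ == "__main__":
    x = 1.0 + 2 ** -8
    print(to_bf16(x))  # the 2**-8 term is lost with 7 mantissa bits
    print(to_tf32(x))  # the 2**-8 term survives with 10 mantissa bits
```

Both formats keep the FP32 exponent width, so under these assumptions they cover the same range as FP32 and differ only in how many significant bits survive.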
Comparison of Decode Performance
GPU (concurrent streams) | H.264 decode (1080p30) | H.265 (HEVC) decode (1080p30) | VP9 decode (1080p30) |
---|---|---|---|
V100 | 16 | 22 | 22 |
A100 | 75 | 157 | 108 |
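The relative gain per codec follows directly from the stream counts in the table. A small sketch of that arithmetic, with the dictionary and the function name `a100_speedup` introduced only for illustration:

```python
# Concurrent 1080p30 decode streams, copied from the table above.
DECODE_STREAMS = {
    "H.264": {"V100": 16, "A100": 75},
    "H.265 (HEVC)": {"V100": 22, "A100": 157},
    "VP9": {"V100": 22, "A100": 108},
}


def a100_speedup(codec: str) -> float:
    """Ratio of A100 to V100 concurrent decode streams for one codec."""
    streams = DECODE_STREAMS[codec]
    return streams["A100"] / streams["V100"]


if __name__ == "__main__":
    for codec in DECODE_STREAMS:
        print(f"{codec}: {a100_speedup(codec):.2f}x")
```

By these figures the A100 handles roughly 4.7 times as many H.264 streams, 7.1 times as many H.265 streams, and 4.9 times as many VP9 streams as the V100.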
- [1] Morgan, Timothy Prickett (May 29, 2020). "Diving Deep Into The Nvidia Ampere GPU Architecture". https://www.nextplatform.com/2020/05/28/diving-deep-into-the-nvidia-ampere-gpu-architecture/. Retrieved March 24, 2022.
- [2] "NVIDIA A100 Tensor Core GPU Architecture: Unprecedented Acceleration at Every Scale". https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf. Retrieved September 18, 2020.
- [3] "NVIDIA Tensor Cores: Versatility for HPC & AI". https://www.nvidia.com/en-us/data-center/tensor-cores/.
- [4] "Abstract". https://docs.nvidia.com/deeplearning/tensorrt/support-matrix/index.html.