NVIDIA Hopper H100
An NVIDIA Hopper H100 is a data center GPU based on the NVIDIA Hopper microarchitecture.
- Counter-Example(s):
- NVIDIA GH200.
- ...
- See: CUDA.
References
2023
- (Wikipedia, 2023) ⇒ https://en.wikipedia.org/wiki/Hopper_(microarchitecture)#Memory Retrieved:2023-7-14.
- The Nvidia Hopper H100 supports HBM3 and HBM2e memory up to 80 GB; the HBM3 memory system supports 3 TB/s, an increase over the Nvidia Ampere A100. Across the architecture, the L2 cache capacity and bandwidth were increased. Hopper allows CUDA compute kernels to utilize inline compression, including in individual memory allocation, although this feature does not reduce memory footprint. The compressor will automatically choose between several compression algorithms. Below is an example of memory being allocated and compressed in C++, using libcuda.so.
<syntaxhighlight lang="cpp">
CUmemGenericAllocationHandle allocationHandle;
CUmemAllocationProp prop = {};  // zero-initializes all allocation properties

// Request a pinned device allocation on the current device with
// generic compression enabled; the driver picks the compression algorithm.
prop.type = CU_MEM_ALLOCATION_TYPE_PINNED;
prop.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
prop.location.id = currentDevice;
prop.allocFlags.compressionType = CU_MEM_ALLOCATION_COMP_GENERIC;

cuMemCreate(&allocationHandle, size, &prop, 0);
</syntaxhighlight>
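The handle created by cuMemCreate is not directly usable from kernels; under the CUDA driver API's virtual memory management functions it still has to be mapped into a reserved virtual address range and granted device access. Below is a minimal sketch of that follow-up step, reusing size, currentDevice, and allocationHandle from the example above; devPtr and accessDesc are illustrative names, and error handling and granularity rounding are omitted.
<syntaxhighlight lang="cpp">
// Reserve a virtual address range and map the compressed allocation into it.
CUdeviceptr devPtr = 0;
cuMemAddressReserve(&devPtr, size, 0 /*alignment*/, 0 /*no fixed address*/, 0);
cuMemMap(devPtr, size, 0 /*offset*/, allocationHandle, 0);

// Grant the current device read/write access to the mapped range.
CUmemAccessDesc accessDesc = {};
accessDesc.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
accessDesc.location.id = currentDevice;
accessDesc.flags = CU_MEM_ACCESS_FLAGS_PROT_READWRITE;
cuMemSetAccess(devPtr, size, &accessDesc, 1);

// devPtr can now be passed to kernels. Teardown reverses the steps:
// cuMemUnmap(devPtr, size); cuMemRelease(allocationHandle);
// cuMemAddressFree(devPtr, size);
</syntaxhighlight>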
The Nvidia Hopper H100 increases the capacity of the combined L1 cache, texture cache, and shared memory to 256 KB. Like its predecessors, it combines the L1 and texture caches into a unified cache designed to be a coalescing buffer. The attribute cudaFuncAttributePreferredSharedMemoryCarveout may be used to define the carveout of the L1 cache, as sketched after this paragraph. Hopper introduces enhancements to NVLink through a new generation with faster overall communication bandwidth.
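As a minimal sketch of how this attribute is typically applied, the CUDA runtime call below sets it for a hypothetical kernel named myKernel, requesting the maximum shared-memory carveout; the value is a hint and the driver may round it to a supported configuration.
<syntaxhighlight lang="cpp">
#include <cuda_runtime.h>

// Hypothetical kernel, used only to illustrate the attribute call.
__global__ void myKernel() { }

int main() {
    // Hint that the unified L1/texture/shared-memory storage should favor
    // shared memory for this kernel; the value is a percentage (0-100).
    cudaFuncSetAttribute(myKernel,
                         cudaFuncAttributePreferredSharedMemoryCarveout,
                         cudaSharedmemCarveoutMaxShared);  // == 100
    return 0;
}
</syntaxhighlight>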