AMD GFX906
Specs
| Name | Architecture | LLVM target name | VRAM (GiB) | Compute Units | Wavefront Size | LDS (KiB) | L3 Cache (MiB) | L2 Cache (MiB) | L1 Vector Cache (KiB) | L1 Scalar Cache (KiB) | L1 Instruction Cache (KiB) | VGPR File (KiB) | SGPR File (KiB) | GFXIP Major version | GFXIP Minor version |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MI60 | GCN5.1 | gfx906 | 32 | 64 | 64 | 64 | | 4 | 16 | 16 per 3 CUs | 32 per 3 CUs | 256 | 12.5 | 9 | 0 |
| MI50 (32GB) | GCN5.1 | gfx906 | 32 | 60 | 64 | 64 | | 4 | 16 | 16 per 3 CUs | 32 per 3 CUs | 256 | 12.5 | 9 | 0 |
| MI50 (16GB) | GCN5.1 | gfx906 | 16 | 60 | 64 | 64 | | 4 | 16 | 16 per 3 CUs | 32 per 3 CUs | 256 | 12.5 | 9 | 0 |
| Radeon Pro VII | GCN5.1 | gfx906 | 16 | 60 | 64 | 64 | | 4 | 16 | 16 per 3 CUs | 32 per 3 CUs | 256 | 12.5 | 9 | 0 |
| Radeon VII | GCN5.1 | gfx906 | 16 | 60 | 64 | 64 per CU | | 4 | 16 | 16 per 3 CUs | 32 per 3 CUs | 256 | 12.5 | 9 | 0 |
📄️ K8S GPU Operator
Guide to getting the MI50 running under Kubernetes.
📄️ Tools
atitool
🗃️ vLLM
3 items
📄️ Perf tuning
Changing smcPPTable/TdcLimitGfx from 350 to 150 lowered the hotspot temperature by roughly 10 degrees with almost no performance drop in vLLM.
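To check the effect of a TdcLimitGfx change yourself, it may help to log the hotspot temperature before and after. The sketch below is one possible way to do that, assuming a Linux host with the amdgpu driver, whose hwmon interface labels the hotspot sensor `junction` and reports values in millidegrees Celsius. The helper names `find_amdgpu_hwmon_dirs` and `read_junction_temp_c` are hypothetical and not part of any tool mentioned here.

```python
from pathlib import Path
from typing import List, Optional


def find_amdgpu_hwmon_dirs(root: Path = Path("/sys/class/hwmon")) -> List[Path]:
    """Return hwmon directories whose 'name' file reads 'amdgpu'."""
    dirs = []
    for hwmon_dir in sorted(root.glob("hwmon*")):
        name_file = hwmon_dir / "name"
        if name_file.is_file() and name_file.read_text().strip() == "amdgpu":
            dirs.append(hwmon_dir)
    return dirs


def read_junction_temp_c(hwmon_dir: Path) -> Optional[float]:
    """Return the junction (hotspot) temperature in degrees Celsius, or None.

    Assumption: the sensor is exposed as a tempN_label / tempN_input pair
    and the label text is exactly 'junction'.
    """
    for label_file in sorted(hwmon_dir.glob("temp*_label")):
        if label_file.read_text().strip() == "junction":
            input_name = label_file.name.replace("_label", "_input")
            input_file = label_file.with_name(input_name)
            # hwmon reports temperatures in millidegrees Celsius.
            return int(input_file.read_text().strip()) / 1000.0
    return None


if __name__ == "__main__":
    # Print one reading per amdgpu device; run under load for a fair comparison.
    for hwmon_dir in find_amdgpu_hwmon_dirs():
        print(hwmon_dir, read_junction_temp_c(hwmon_dir))
```

Running it under the same vLLM load with each setting gives a rough before/after comparison. Sampling over a few minutes is likely more reliable than a single reading.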