vllm.platforms.zen_cpu
ZenCpuPlatform
Bases: CpuPlatform
CPU platform with AMD Zen (ZenDNN/zentorch) optimizations.
Model-load time (dispatch_cpu_unquantized_gemm in layers/utils.py):
- Routes linear ops to zentorch_linear_unary.
- When VLLM_ZENTORCH_WEIGHT_PREPACK=1 (default), eagerly prepacks weights via zentorch_weight_prepack_for_linear.
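The model-load behavior described above can be sketched roughly as follows. This is only an illustration: `prepack_enabled` and `plan_linear_dispatch` are hypothetical helpers, the exact way vLLM parses the environment variable is an assumption, and the zentorch op names appear only as strings so the sketch runs without zentorch installed.

```python
import os
from typing import List, Mapping


def prepack_enabled(environ: Mapping[str, str] = os.environ) -> bool:
    # Assumption: an unset variable behaves like "1", matching the documented default.
    return environ.get("VLLM_ZENTORCH_WEIGHT_PREPACK", "1") == "1"


def plan_linear_dispatch(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the op names (as strings) that would be used for a linear layer."""
    steps: List[str] = []
    if prepack_enabled(environ):
        # Eager, one-time weight prepacking at model load.
        steps.append("zentorch_weight_prepack_for_linear")
    # Linear ops are routed to the zentorch kernel in either case.
    steps.append("zentorch_linear_unary")
    return steps


if __name__ == "__main__":
    print(plan_linear_dispatch({}))
    print(plan_linear_dispatch({"VLLM_ZENTORCH_WEIGHT_PREPACK": "0"}))
```

Passing the environment as a mapping keeps the decision easy to test without mutating `os.environ`.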
Source code in vllm/platforms/zen_cpu.py
_apply_pytorch_backports classmethod
Backport PyTorch mainline fixes missing in 2.10.
In PyTorch 2.10, FxGraphCachePickler.dumps does not catch ValueError, which causes torch.compile cache misses. PyTorch mainline already includes the fix, so this backport can be removed once PyTorch 2.10 support is dropped.
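Because the backport targets a single PyTorch release, it would typically be gated on the installed version. The following is a minimal sketch of such a gate; `needs_backport` is a hypothetical helper, not a function in vLLM, and the way the real code checks the version is an assumption.

```python
def needs_backport(version: str) -> bool:
    """Return True if a torch version string belongs to the 2.10 series."""
    # Drop a local build suffix such as "+cpu" before parsing.
    base = version.split("+", 1)[0]
    major, minor = base.split(".")[:2]
    return (int(major), int(minor)) == (2, 10)


if __name__ == "__main__":
    for candidate in ("2.10.0+cpu", "2.11.0", "2.1.0"):
        print(candidate, needs_backport(candidate))
```

In real use the argument would come from `torch.__version__`.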
_patch_fxgraphcache_pickle classmethod
Backport mainline ValueError fix to FxGraphCachePickler.dumps().
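A backport of this kind is usually applied by wrapping the existing method. The sketch below shows the general pattern on stand-in classes only: `StandInPickler`, `BypassCache`, and `patch_dumps` are hypothetical names, and it assumes the fix amounts to treating ValueError like the pickling failures that are already handled. It does not reproduce PyTorch's actual code.

```python
import functools
import pickle


class BypassCache(Exception):
    """Hypothetical stand-in for the 'skip the cache' signal."""


class StandInPickler:
    """Hypothetical stand-in for FxGraphCachePickler; not the real class."""

    def dumps(self, obj):
        try:
            return pickle.dumps(obj)
        except (TypeError, AttributeError, pickle.PicklingError) as exc:
            # ValueError is deliberately missing here, mimicking the bug.
            raise BypassCache("Failed to pickle cache key") from exc


def patch_dumps(cls) -> None:
    """Wrap cls.dumps so ValueError is handled like other pickling errors."""
    original = cls.dumps
    if getattr(original, "_valueerror_patched", False):
        return  # Already patched; keep the operation idempotent.

    @functools.wraps(original)
    def dumps(self, obj):
        try:
            return original(self, obj)
        except ValueError as exc:
            raise BypassCache("Failed to pickle cache key") from exc

    dumps._valueerror_patched = True
    cls.dumps = dumps


class Unpicklable:
    def __reduce__(self):
        raise ValueError("cannot pickle this object")


if __name__ == "__main__":
    patch_dumps(StandInPickler)
    try:
        StandInPickler().dumps(Unpicklable())
    except BypassCache as exc:
        print("bypassed:", exc)
```

The marker attribute on the wrapper lets the patch be applied more than once without stacking wrappers.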