vllm.models.deepseek_v4.nvidia.ops

NVIDIA-only (cutedsl/cutlass) kernels for DeepSeek V4.

These modules import cutlass/cutedsl at module top level, so they must not be imported on non-CUDA platforms. Callers should gate on vllm.utils.import_utils.has_cutedsl() before importing from here.
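
The gating pattern described above can be sketched as follows. This is a minimal illustration, not code from vLLM: `import_if` and `cutedsl_gate` are hypothetical helper names, and the sketch assumes `has_cutedsl()` takes no arguments and returns a truthy value when cutedsl is usable. Only the module path and the `has_cutedsl` location come from the text above.

```python
import importlib
from types import ModuleType
from typing import Callable, Optional


def import_if(gate: Callable[[], bool], module_name: str) -> Optional[ModuleType]:
    """Import `module_name` only when `gate()` is true; otherwise return None.

    Hypothetical helper: the import is deferred until the gate has passed,
    so top-level cutlass/cutedsl imports never run on unsupported platforms.
    """
    if not gate():
        return None
    return importlib.import_module(module_name)


def cutedsl_gate() -> bool:
    """Hypothetical wrapper around the gate named in the docs.

    Assumes has_cutedsl() is callable with no arguments. A missing vLLM
    install is treated the same as "cutedsl unavailable".
    """
    try:
        from vllm.utils.import_utils import has_cutedsl
    except ImportError:
        return False
    return bool(has_cutedsl())
```

A caller could then write `ops = import_if(cutedsl_gate, "vllm.models.deepseek_v4.nvidia.ops")` and take a fallback path when the result is `None`. Keeping the gate as a separate callable means the check always runs before the import statement is reached.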

Modules:

| Name | Description |
| --- | --- |
| attention | DeepseekV4 MLA Attention Layer |
| fused_indexer_q_cutedsl | |
| prepare_megamoe | Triton input-staging kernel for DeepSeek V4 MegaMoE. |