vllm.core.block.utils
Block manager utils.
check_no_caching_or_swa_for_blockmgr_encdec
check_no_caching_or_swa_for_blockmgr_encdec(
    block_mgr, seq_group: SequenceGroup
) -> None
Enforces that prefix caching and sliding-window attention (SWA) are not used with encoder/decoder models, for which both features are currently unsupported. The restriction applies specifically to encoder/decoder models.
Raises NotImplementedError if an unsupported scenario is detected.
Arguments:
- block_mgr: BlockSpaceManager instance
- seq_group: SequenceGroup passed to block_mgr
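The behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the stand-in classes `FakeBlockManager` and `FakeSequenceGroup`, the helper `check_sketch`, the attribute names `enable_caching` and `max_block_sliding_window`, the method `is_encoder_decoder()`, and the error messages are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeBlockManager:
    """Hypothetical stand-in for a block manager; attribute names are assumed."""
    enable_caching: bool = False                    # prefix caching on/off
    max_block_sliding_window: Optional[int] = None  # None means SWA is off


@dataclass
class FakeSequenceGroup:
    """Hypothetical stand-in for a sequence group."""
    encoder_decoder: bool = False

    def is_encoder_decoder(self) -> bool:
        return self.encoder_decoder


def check_sketch(block_mgr: FakeBlockManager,
                 seq_group: FakeSequenceGroup) -> None:
    """Reject prefix caching or SWA, but only for encoder/decoder groups."""
    # Decoder-only sequence groups are never rejected by this check.
    if not seq_group.is_encoder_decoder():
        return
    if block_mgr.max_block_sliding_window is not None:
        raise NotImplementedError(
            "Sliding-window attention is not supported "
            "for encoder/decoder models (illustrative message).")
    if block_mgr.enable_caching:
        raise NotImplementedError(
            "Prefix caching is not supported "
            "for encoder/decoder models (illustrative message).")


# Decoder-only group: passes even with both features enabled.
check_sketch(FakeBlockManager(enable_caching=True, max_block_sliding_window=8),
             FakeSequenceGroup(encoder_decoder=False))

# Encoder/decoder group with prefix caching enabled: rejected.
try:
    check_sketch(FakeBlockManager(enable_caching=True),
                 FakeSequenceGroup(encoder_decoder=True))
except NotImplementedError as exc:
    print(f"rejected: {exc}")
```

In this sketch the check returns None on success, so a caller only needs to let the NotImplementedError propagate or catch it where a clearer configuration error can be reported.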