vllm.attention.ops
Modules:
| Name | Description |
| --- | --- |
| blocksparse_attention | |
| chunked_prefill_paged_decode | |
| flashmla | |
| hpu_paged_attn | |
| ipex_attn | |
| merge_attn_states | |
| nki_flash_attn | |
| paged_attn | |
| pallas_kv_cache_update | |
| prefix_prefill | |
| rocm_aiter_mla | |
| rocm_aiter_paged_attn | |
| triton_decode_attention | Memory-efficient attention for decoding. |
| triton_flash_attention | Fused Attention |
| triton_merge_attn_states | |
| triton_unified_attention | |
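For readers exploring these backends programmatically, the sketch below enumerates the submodules of `vllm.attention.ops` at runtime and prints the first line of each module's docstring, mirroring the Name/Description table above. Only the package path `vllm.attention.ops` is taken from this page; the rest is standard-library introspection and assumes vLLM is installed.

```python
# Minimal sketch: list the submodules of vllm.attention.ops along with
# the first line of their docstrings (the "Description" column above).
# Assumes vLLM is installed; uses only stdlib introspection otherwise.
import importlib
import pkgutil

ops = importlib.import_module("vllm.attention.ops")

# Walk the package's search path and import each submodule by name.
for info in pkgutil.iter_modules(ops.__path__):
    mod = importlib.import_module(f"vllm.attention.ops.{info.name}")
    doc_lines = (mod.__doc__ or "").strip().splitlines()
    summary = doc_lines[0] if doc_lines else "(no docstring)"
    print(f"{info.name}: {summary}")
```

Note that some of these modules are hardware-specific (e.g. ROCm, HPU, or TPU backends), so importing them may fail on machines without the corresponding runtime; wrapping the import in a try/except is a reasonable precaution in practice.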