# vllm.attention.ops
Modules:
| Name | Description |
|---|---|
| chunked_prefill_paged_decode | |
| flashmla | |
| merge_attn_states | |
| nki_flash_attn | |
| paged_attn | |
| pallas_kv_cache_update | |
| prefix_prefill | |
| rocm_aiter_mla | |
| rocm_aiter_paged_attn | |
| triton_decode_attention | Memory-efficient attention for decoding. |
| triton_flash_attention | Fused Attention. |
| triton_merge_attn_states | |
| triton_unified_attention | |
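
For context, the `merge_attn_states` and `triton_merge_attn_states` modules listed above combine partial attention outputs computed over disjoint key/value chunks (e.g. a prefix pass and a suffix pass) into a single result. Below is a minimal pure-PyTorch sketch of the underlying log-sum-exp merge; the function name, shapes, and signature are illustrative assumptions, not the library's kernel API.

```python
import torch

def merge_attn_states_sketch(
    out_a: torch.Tensor,  # [num_tokens, num_heads, head_dim], partial output A
    lse_a: torch.Tensor,  # [num_tokens, num_heads], log-sum-exp of A's softmax
    out_b: torch.Tensor,  # partial output B, same shape as out_a
    lse_b: torch.Tensor,  # log-sum-exp of B's softmax
) -> torch.Tensor:
    # Each partial output is weighted by its share of the combined softmax
    # mass, exp(lse); subtracting the elementwise max keeps the exponentials
    # numerically stable.
    m = torch.maximum(lse_a, lse_b)
    w_a = torch.exp(lse_a - m).unsqueeze(-1)
    w_b = torch.exp(lse_b - m).unsqueeze(-1)
    # The merged log-sum-exp, if also needed, is
    # m + log(exp(lse_a - m) + exp(lse_b - m)).
    return (w_a * out_a + w_b * out_b) / (w_a + w_b)
```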
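
Similarly, `paged_attn` implements attention over a block-structured ("paged") KV cache, where a per-sequence block table maps logical cache positions to physical blocks. A minimal sketch of that indexing idea, using made-up shapes rather than vLLM's actual cache layout:

```python
import torch

# Made-up dimensions for illustration; vLLM's real cache layout differs.
num_blocks, block_size, num_heads, head_dim = 64, 16, 8, 128
key_cache = torch.randn(num_blocks, block_size, num_heads, head_dim)

# One sequence's block table: logical block i lives in physical block table[i].
block_table = torch.tensor([3, 17, 5])
seq_len = 40  # tokens actually stored (the last block is partially filled)

# Gather the sequence's keys from the paged cache into contiguous order.
keys = key_cache[block_table].reshape(-1, num_heads, head_dim)[:seq_len]
assert keys.shape == (seq_len, num_heads, head_dim)
```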