vllm.v1.attention.backends
Modules:
Name | Description |
---|---|
cpu_attn | Attention layer for CPU backends using Torch SDPA. |
flash_attn | Attention layer with FlashAttention. |
flashinfer | Attention layer with FlashInfer. |
flex_attention | Attention layer with FlexAttention. |
linear_attn | Attention backend for linear attention layers. |
mamba1_attn | Attention backend for Mamba1 state-space layers. |
mamba2_attn | Attention backend for Mamba2 state-space layers. |
mamba_attn | Shared base classes for the Mamba attention backends. |
mla | Multi-head Latent Attention (MLA) backends. |
pallas | TPU attention backend built on Pallas kernels. |
rocm_aiter_fa | Attention layer with AiterFlashAttention. |
short_conv_attn | Attention backend for short convolution layers. |
tree_attn | Attention layer with TreeAttention. |
triton_attn | Attention layer with PagedAttention and Triton prefix prefill. |
utils | Shared utilities for attention backends. |
xformers | Attention layer with XFormersAttention. |
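Which of these backends is used at runtime is normally chosen automatically from the hardware and model, but it can be overridden. A minimal sketch, assuming a vLLM build that honors the `VLLM_ATTENTION_BACKEND` environment variable and accepts `FLASH_ATTN` as a value (value names vary across vLLM versions), with an arbitrary small model chosen purely for illustration:

```python
# Minimal sketch: force a specific attention backend for a smoke test.
# Accepted values depend on the installed vLLM version and hardware;
# "FLASH_ATTN" here would select the flash_attn backend listed above.
import os

# The override must be in place before the engine is constructed.
os.environ["VLLM_ATTENTION_BACKEND"] = "FLASH_ATTN"

from vllm import LLM, SamplingParams

# Any small model works for a quick check; this one is illustrative.
llm = LLM(model="facebook/opt-125m")
outputs = llm.generate(["Hello, world!"], SamplingParams(max_tokens=8))
print(outputs[0].outputs[0].text)
```

If the requested backend is not supported on the current platform, engine initialization is expected to fail rather than silently fall back, so the override is best treated as a debugging or benchmarking tool.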