vllm.v1.attention.backends

Modules:

| Name | Description |
| --- | --- |
| cpu_attn | |
| flash_attn | Attention layer with FlashAttention. |
| flashinfer | Attention layer with FlashInfer. |
| flex_attention | Attention layer with FlexAttention. |
| linear_attn | |
| mamba1_attn | |
| mamba2_attn | |
| mamba_attn | |
| mla | |
| pallas | |
| rocm_aiter_fa | Attention layer with AiterFlashAttention. |
| short_conv_attn | |
| tree_attn | Attention layer with TreeAttention. |
| triton_attn | Attention layer with PagedAttention and Triton prefix prefill. |
| utils | |
| xformers | Attention layer with XFormersAttention. |
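
These backend modules are normally selected by the engine at startup rather than imported directly. As a minimal sketch (assuming a FlashInfer installation and using vLLM's documented `VLLM_ATTENTION_BACKEND` environment variable), a specific backend can be forced like this:

```python
import os

# Force the FlashInfer backend; must be set before the engine initializes.
# Other documented values include "FLASH_ATTN" and "XFORMERS"; availability
# depends on your hardware and installed dependencies.
os.environ["VLLM_ATTENTION_BACKEND"] = "FLASHINFER"

from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # example model, swap in your own
outputs = llm.generate(["Hello, world"], SamplingParams(max_tokens=8))
print(outputs[0].outputs[0].text)
```

If the requested backend is unavailable on the current hardware, vLLM falls back to selecting a supported backend automatically.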