vllm.lora.ops.xla_ops.lora_ops
bgmv_expand
bgmv_expand(
inputs: Tensor,
lora_b_weights: Tensor,
output_tensor: Tensor,
lora_indices_tensor: Tensor,
add_inputs: bool = True,
)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
inputs | Tensor | Input tensor of shape [num_tokens, hidden_size]. | required |
lora_b_weights | Tensor | LoRA weights of shape [num_loras, lora_rank, hidden_size]. | required |
output_tensor | Tensor | Output tensor of shape [num_tokens, hidden_size * num_slices]. | required |
lora_indices_tensor | Tensor | Tensor of shape [num_tokens] indicating which LoRA matrix to use for each token. | required |
add_inputs | bool | Whether to add the computed result to the existing contents of output_tensor. If False, only the computed result is returned. | True |
Source code in vllm/lora/ops/xla_ops/lora_ops.py
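The parameter table above can be read as a per-token matrix-vector product followed by an optional accumulation. The sketch below is a dependency-free illustration in plain Python, not the XLA kernel in lora_ops.py. The names `bgmv_ref` and `bgmv_expand_ref` are hypothetical, and the sketch assumes that each LoRA matrix is stored as [output_width, input_width], that the output width matches exactly (no padding or truncation), and that `add_inputs` accumulates onto `output_tensor`.

```python
def bgmv_ref(inputs, loras, idxs):
    """Hypothetical reference: out[t] = loras[idxs[t]] @ inputs[t]."""
    out = []
    for x, i in zip(inputs, idxs):
        w = loras[i]  # LoRA matrix selected for this token
        out.append([sum(a * b for a, b in zip(row, x)) for row in w])
    return out


def bgmv_expand_ref(inputs, lora_b_weights, output_tensor,
                    lora_indices_tensor, add_inputs=True):
    """Hypothetical reference for the expand step (assumes matching widths)."""
    delta = bgmv_ref(inputs, lora_b_weights, lora_indices_tensor)
    if not add_inputs:
        return delta
    # Accumulate the LoRA contribution onto the existing output values.
    return [[o + d for o, d in zip(o_row, d_row)]
            for o_row, d_row in zip(output_tensor, delta)]


# Two tokens, two adapters, each mapping width 2 -> width 3.
inputs = [[1.0, 2.0], [3.0, 4.0]]
lora_b = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],  # adapter 0
    [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]],  # adapter 1
]
base = [[10.0, 10.0, 10.0], [20.0, 20.0, 20.0]]

# Token 0 uses adapter 0, token 1 uses adapter 1.
result = bgmv_expand_ref(inputs, lora_b, base, [0, 1])
print(result)  # [[11.0, 12.0, 13.0], [26.0, 28.0, 20.0]]
```

Passing `add_inputs=False` in this sketch returns only the LoRA contribution, `[[1.0, 2.0, 3.0], [6.0, 8.0, 0.0]]`, leaving the base values out of the result.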
bgmv_expand_slice
bgmv_expand_slice(
inputs: Tensor,
lora_b_weights: Tensor,
output_tensor: Tensor,
lora_indices_tensor: Tensor,
slice_offset: int,
slice_size: int,
add_inputs: bool = True,
)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
inputs | Tensor | Input tensor of shape [num_tokens, hidden_size]. | required |
lora_b_weights | Tensor | LoRA weights of shape [num_loras, lora_rank, hidden_size]. | required |
output_tensor | Tensor | Output tensor of shape [num_tokens, hidden_size * num_slices]. | required |
lora_indices_tensor | Tensor | Tensor of shape [num_tokens] indicating which LoRA matrix to use for each token. | required |
slice_offset | int | Starting column of the slice within output_tensor. | required |
slice_size | int | Width of the slice, in columns. | required |
add_inputs | bool | Whether to add the computed result to the existing contents of output_tensor. If False, only the computed result is returned. | True |
Source code in vllm/lora/ops/xla_ops/lora_ops.py
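To make the role of `slice_offset` and `slice_size` concrete, here is a minimal plain-Python sketch, not the XLA implementation. The name `bgmv_expand_slice_ref` is hypothetical. The sketch assumes that the per-token result has exactly `slice_size` columns, that it is zero-padded so it lands in columns `[slice_offset, slice_offset + slice_size)` of the output, and that `add_inputs` accumulates onto `output_tensor`.

```python
def bgmv_expand_slice_ref(inputs, lora_b_weights, output_tensor,
                          lora_indices_tensor, slice_offset, slice_size,
                          add_inputs=True):
    """Hypothetical reference: write each token's LoRA result into one slice."""
    result = []
    for x, idx, out_row in zip(inputs, lora_indices_tensor, output_tensor):
        w = lora_b_weights[idx]  # LoRA matrix selected for this token
        delta = [sum(a * b for a, b in zip(row, x)) for row in w]
        if len(delta) != slice_size:
            raise ValueError("LoRA output width must equal slice_size")
        # Zero-pad on both sides so delta occupies the requested columns.
        right = len(out_row) - slice_offset - slice_size
        padded = [0.0] * slice_offset + delta + [0.0] * right
        if add_inputs:
            padded = [o + p for o, p in zip(out_row, padded)]
        result.append(padded)
    return result


# Two tokens; the output has 4 columns and the slice covers columns 2..3.
inputs = [[1.0, 1.0], [2.0, 0.0]]
lora_b = [
    [[1.0, 2.0], [3.0, 4.0]],  # adapter 0
    [[0.5, 0.0], [0.0, 0.5]],  # adapter 1
]
base = [[1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0]]

# Token 0 uses adapter 1, token 1 uses adapter 0.
result = bgmv_expand_slice_ref(inputs, lora_b, base, [1, 0],
                               slice_offset=2, slice_size=2)
print(result)  # [[1.0, 1.0, 1.5, 1.5], [2.0, 2.0, 4.0, 8.0]]
```

Columns outside the slice keep their base values when `add_inputs=True`; in this sketch they are zero when `add_inputs=False`.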
bgmv_jax
bgmv_non_xla
Source code in vllm/lora/ops/xla_ops/lora_ops.py
bgmv_shrink
bgmv_shrink(
inputs: Tensor,
lora_b_weights: Tensor,
lora_indices_tensor: Tensor,
scaling: float = 1.0,
)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
inputs | Tensor | Input tensor of shape [num_tokens, hidden_size]. | required |
lora_b_weights | Tensor | LoRA weights of shape [num_loras, lora_rank, hidden_size]. | required |
output_tensor | Tensor | Unused placeholder; it does not appear in the signature shown above. | required |
lora_indices_tensor | Tensor | Tensor of shape [num_tokens] indicating which LoRA matrix to use for each token. | required |
scaling | float | Scalar multiplier applied to the output. | 1.0 |
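
As an illustration of the shapes in the table above, the following plain-Python sketch projects each token from hidden_size down to lora_rank with the adapter chosen by `lora_indices_tensor`, then applies `scaling`. It is a readable reference under those assumptions, not the XLA kernel, and the name `bgmv_shrink_ref` is hypothetical.

```python
def bgmv_shrink_ref(inputs, lora_weights, lora_indices_tensor, scaling=1.0):
    """Hypothetical reference: out[t] = scaling * (lora_weights[idx[t]] @ inputs[t])."""
    out = []
    for x, idx in zip(inputs, lora_indices_tensor):
        w = lora_weights[idx]  # shape [lora_rank, hidden_size]
        out.append([scaling * sum(a * b for a, b in zip(row, x)) for row in w])
    return out


# Two tokens with hidden_size = 3; two adapters with lora_rank = 2.
inputs = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
lora_weights = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # adapter 0
    [[1.0, 1.0, 1.0], [0.0, 0.0, 1.0]],  # adapter 1
]

# Token 0 uses adapter 1, token 1 uses adapter 0.
result = bgmv_shrink_ref(inputs, lora_weights, [1, 0], scaling=0.5)
print(result)  # [[3.0, 1.5], [2.0, 2.5]]
```

The result has shape [num_tokens, lora_rank], which is the width an expand step would then consume as its input.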