vllm.lora.ops.xla_ops.lora_ops
bgmv_expand
bgmv_expand(
inputs: Tensor,
lora_b_weights: Tensor,
output_tensor: Tensor,
lora_indices_tensor: Tensor,
add_inputs: bool = True,
)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
inputs | Tensor | Input tensor of shape [num_tokens, hidden_size]. | required |
lora_b_weights | Tensor | LoRA weights of shape [num_loras, lora_rank, hidden_size]. | required |
output_tensor | Tensor | Output tensor of shape [num_tokens, hidden_size * num_slices]. | required |
lora_indices_tensor | Tensor | Tensor of shape [num_tokens] indicating which LoRA matrix to use for each token. | required |
add_inputs | bool | Whether or not to add the input tensor to the output tensor. | True |
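The per-token selection described by `lora_indices_tensor` can be illustrated with a dependency-free sketch. It is not the library's implementation: the name `bgmv_expand_ref`, the use of nested lists instead of tensors, the `[out_dim][in_dim]` layout of each weight matrix, and the reading of `add_inputs` as "accumulate onto the existing output row" are all assumptions made for illustration.

```python
def bgmv_expand_ref(inputs, lora_b_weights, output_tensor,
                    lora_indices, add_inputs=True):
    """Hypothetical reference for a batched gather matrix-vector product.

    Assumption: every weight matrix is stored as [out_dim][in_dim] and
    out_dim equals the width of output_tensor.
    """
    result = []
    for token, idx, out_row in zip(inputs, lora_indices, output_tensor):
        weight = lora_b_weights[idx]  # LoRA matrix selected for this token
        delta = [sum(w * x for w, x in zip(row, token)) for row in weight]
        if add_inputs:
            # Accumulate onto the existing output row.
            delta = [o + d for o, d in zip(out_row, delta)]
        result.append(delta)
    return result


inputs = [[1.0, 2.0], [3.0, 4.0]]
lora_b_weights = [
    [[1, 0], [0, 1], [1, 1]],  # LoRA 0
    [[2, 0], [0, 2], [0, 0]],  # LoRA 1
]
output_tensor = [[10.0, 10.0, 10.0], [0.0, 0.0, 0.0]]
lora_indices = [0, 1]

print(bgmv_expand_ref(inputs, lora_b_weights, output_tensor, lora_indices))
# → [[11.0, 12.0, 13.0], [6.0, 8.0, 0.0]]
```

Each token is multiplied only by the matrix its index selects, so tokens that use different LoRA adapters can share one batch.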
Source code in vllm/lora/ops/xla_ops/lora_ops.py
bgmv_expand_slice
bgmv_expand_slice(
inputs: Tensor,
lora_b_weights: Tensor,
output_tensor: Tensor,
lora_indices_tensor: Tensor,
slice_offset: int,
slice_size: int,
add_inputs: bool = True,
)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
inputs | Tensor | Input tensor of shape [num_tokens, hidden_size]. | required |
lora_b_weights | Tensor | LoRA weights of shape [num_loras, lora_rank, hidden_size]. | required |
output_tensor | Tensor | Output tensor of shape [num_tokens, hidden_size * num_slices]. | required |
lora_indices_tensor | Tensor | Tensor of shape [num_tokens] indicating which LoRA matrix to use for each token. | required |
add_inputs | bool | Whether or not to add the input tensor to the output tensor. | True |
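The table above does not describe `slice_offset` and `slice_size`. The sketch below shows one plausible reading, assuming the per-token result occupies columns `slice_offset` through `slice_offset + slice_size - 1` of the output. The helper name `bgmv_expand_slice_ref`, the nested-list representation, the `[out_dim][in_dim]` weight layout, and the zero fill outside the slice when `add_inputs` is false are illustrative assumptions, not confirmed library behavior.

```python
def bgmv_expand_slice_ref(inputs, lora_b_weights, output_tensor, lora_indices,
                          slice_offset, slice_size, add_inputs=True):
    """Hypothetical reference: place each token's result into a column slice.

    Assumption: with add_inputs=False, columns outside the slice are zero.
    """
    if add_inputs:
        result = [list(row) for row in output_tensor]  # copy, then accumulate
    else:
        result = [[0.0] * len(row) for row in output_tensor]

    for t, (token, idx) in enumerate(zip(inputs, lora_indices)):
        weight = lora_b_weights[idx]  # LoRA matrix selected for this token
        delta = [sum(w * x for w, x in zip(row, token)) for row in weight]
        if len(delta) != slice_size:
            raise ValueError("weight rows must match slice_size")
        for j, value in enumerate(delta):
            result[t][slice_offset + j] += value
    return result


inputs = [[1.0, 2.0]]
lora_b_weights = [[[1, 0], [0, 1]]]  # one LoRA, identity matrix
output_tensor = [[10.0, 10.0, 10.0, 10.0]]

print(bgmv_expand_slice_ref(inputs, lora_b_weights, output_tensor, [0],
                            slice_offset=2, slice_size=2))
# → [[10.0, 10.0, 11.0, 12.0]]
```

Under this reading, several slices can be written into one wide output by calling the function once per slice with a different `slice_offset`.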
Source code in vllm/lora/ops/xla_ops/lora_ops.py
bgmv_jax
bgmv_non_xla
Source code in vllm/lora/ops/xla_ops/lora_ops.py
bgmv_shrink
bgmv_shrink(
inputs: Tensor,
lora_b_weights: Tensor,
lora_indices_tensor: Tensor,
scaling: float = 1.0,
)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
inputs | Tensor | Input tensor of shape [num_tokens, hidden_size]. | required |
lora_b_weights | Tensor | LoRA weights of shape [num_loras, lora_rank, hidden_size]. | required |
output_tensor | Tensor | (Unused) output tensor (placeholder). This parameter does not appear in the signature above. | required |
lora_indices_tensor | Tensor | Tensor of shape [num_tokens] indicating which LoRA matrix to use for each token. | required |
scaling | float | Scalar multiplier applied to the output. | 1.0 |
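With the documented shapes, each token of width hidden_size is multiplied by its selected [lora_rank, hidden_size] matrix and the result is scaled. A minimal sketch of that computation follows, using nested lists instead of tensors. The name `bgmv_shrink_ref` and the assumption that the function returns a new [num_tokens, lora_rank] result are illustrative and not taken from the library.

```python
def bgmv_shrink_ref(inputs, lora_weights, lora_indices, scaling=1.0):
    """Hypothetical reference: project each token down to lora_rank values.

    Assumption: the result is returned as a new [num_tokens][lora_rank] list.
    """
    result = []
    for token, idx in zip(inputs, lora_indices):
        weight = lora_weights[idx]  # [lora_rank][hidden_size] for this token
        row = [scaling * sum(w * x for w, x in zip(w_row, token))
               for w_row in weight]
        result.append(row)
    return result


inputs = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # hidden_size = 3
lora_weights = [
    [[1, 1, 1], [1, 0, 0]],  # LoRA 0, lora_rank = 2
    [[0, 0, 1], [0, 1, 0]],  # LoRA 1
]

print(bgmv_shrink_ref(inputs, lora_weights, [1, 0], scaling=0.5))
# → [[1.5, 1.0], [7.5, 2.0]]
```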