vllm.lora.ops.torch_ops.lora_ops
bgmv_expand
bgmv_expand(
inputs: Tensor,
lora_b_weights: Tensor,
output_tensor: Tensor,
lora_indices_tensor: Tensor,
add_inputs: bool = True,
)
Source code in vllm/lora/ops/torch_ops/lora_ops.py
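The signature does not show the tensor shapes involved. The following is a minimal sketch, in plain PyTorch, of what a batched gather matrix-vector (BGMV) expand step computes: each token selects one LoRA B matrix by index and multiplies its low-rank activation by it. The helper name `ref_bgmv_expand` and the shapes (`inputs` as `[num_tokens, rank]`, `lora_b_weights` as `[num_loras, hidden_out, rank]`) are assumptions made for illustration and are not taken from the vLLM source.

```python
import torch


def ref_bgmv_expand(inputs, lora_b_weights, output_tensor,
                    lora_indices_tensor, add_inputs=True):
    """Hypothetical reference: gather one LoRA B matrix per token, then matvec."""
    # Gather one B matrix per token: [num_tokens, hidden_out, rank] (assumed layout)
    selected = lora_b_weights[lora_indices_tensor].to(output_tensor.dtype)
    # Per-token matrix-vector product: [num_tokens, hidden_out]
    outputs = torch.einsum("bi,boi->bo", inputs.to(output_tensor.dtype), selected)
    if add_inputs:
        # Accumulate onto whatever the output buffer already holds
        output_tensor[:, :outputs.shape[1]] += outputs
    else:
        # Overwrite the output buffer
        output_tensor[:, :outputs.shape[1]] = outputs
    return output_tensor


torch.manual_seed(0)
num_tokens, rank, hidden_out, num_loras = 4, 8, 16, 3
x = torch.randn(num_tokens, rank)
b = torch.randn(num_loras, hidden_out, rank)
idx = torch.tensor([0, 2, 2, 1])         # adapter index chosen by each token
out = torch.ones(num_tokens, hidden_out)  # stands in for the base layer output
ref_bgmv_expand(x, b, out, idx, add_inputs=True)
```

With `add_inputs=True` the result is accumulated into `output_tensor`, which suits adding a LoRA delta on top of an already computed base projection.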
bgmv_expand_slice
bgmv_expand_slice(
inputs: Tensor,
lora_b_weights: Tensor,
output_tensor: Tensor,
lora_indices_tensor: Tensor,
slice_offset: int,
slice_size: int,
add_inputs: bool = True,
)
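The `slice_offset` and `slice_size` parameters suggest that the result is written into a column window of `output_tensor` rather than starting at column 0. A minimal sketch under that assumption follows; `ref_bgmv_expand_slice` is a hypothetical helper, and the shapes are chosen only for illustration. One plausible use is a packed projection whose output columns are split among several LoRA B matrices.

```python
import torch


def ref_bgmv_expand_slice(inputs, lora_b_weights, output_tensor,
                          lora_indices_tensor, slice_offset, slice_size,
                          add_inputs=True):
    """Hypothetical reference: BGMV expand into one column window of the output."""
    selected = lora_b_weights[lora_indices_tensor].to(output_tensor.dtype)
    outputs = torch.einsum("bi,boi->bo", inputs.to(output_tensor.dtype), selected)
    # Assumed meaning: only columns [slice_offset, slice_offset + slice_size) are touched
    cols = slice(slice_offset, slice_offset + slice_size)
    if add_inputs:
        output_tensor[:, cols] += outputs
    else:
        output_tensor[:, cols] = outputs
    return output_tensor


torch.manual_seed(0)
x = torch.randn(2, 4)       # [num_tokens, rank]
b = torch.randn(2, 6, 4)    # [num_loras, slice_size, rank]
idx = torch.tensor([1, 0])
out = torch.zeros(2, 18)    # three packed slices of width 6
ref_bgmv_expand_slice(x, b, out, idx, slice_offset=6, slice_size=6,
                      add_inputs=False)
```

In this sketch only the middle slice (columns 6 to 11) changes; the other two slices keep their previous contents.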
bgmv_shrink
bgmv_shrink(
inputs: Tensor,
lora_b_weights: Tensor,
output_tensor: Tensor,
lora_indices_tensor: Tensor,
scaling: float = 1.0,
)
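A shrink step conventionally projects the hidden activation down to the LoRA rank. Although the weight parameter is listed here as `lora_b_weights`, the sketch below assumes it holds the down-projection (LoRA A) matrices, as the `lora_a_weights` name in `sgmv_shrink` suggests. `ref_bgmv_shrink`, the weight layout `[num_loras, rank, hidden]`, and the placement of `scaling` are assumptions for illustration.

```python
import torch


def ref_bgmv_shrink(inputs, lora_weights, output_tensor,
                    lora_indices_tensor, scaling=1.0):
    """Hypothetical reference: per-token down-projection to the LoRA rank."""
    # Gather one down-projection per token: [num_tokens, rank, hidden] (assumed layout)
    selected = lora_weights[lora_indices_tensor].to(output_tensor.dtype)
    # [num_tokens, hidden] x [num_tokens, rank, hidden] -> [num_tokens, rank]
    outputs = torch.einsum("bi,boi->bo", inputs.to(output_tensor.dtype), selected)
    # Assumed: the result overwrites the buffer and is scaled once, here
    output_tensor[:, :outputs.shape[1]] = scaling * outputs
    return output_tensor


torch.manual_seed(0)
num_tokens, hidden, rank, num_loras = 3, 16, 4, 2
x = torch.randn(num_tokens, hidden)
lora_a = torch.randn(num_loras, rank, hidden)
idx = torch.tensor([0, 1, 0])
low_rank = torch.zeros(num_tokens, rank)
ref_bgmv_shrink(x, lora_a, low_rank, idx, scaling=0.5)
```

The `[num_tokens, rank]` buffer produced this way is the kind of input an expand step would then project back up to the output width.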
sgmv_expand
sgmv_expand(
inputs: Tensor,
lora_b_weights: Tensor,
output_tensor: Tensor,
b_seq_start_loc: Tensor,
seq_len_tensor: Tensor,
lora_indices_tensor: Tensor,
batches: int,
max_seq_length: int,
token_nums: int,
add_inputs: bool = False,
)
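The segmented (SGMV) variants take per-sequence metadata rather than one adapter index per token. The sketch below shows one plausible way the extra arguments relate to each other for a batch of sequences packed along the token dimension. The interpretation of each argument (start offsets, lengths, one adapter index per sequence) is an assumption based on the parameter names, and the values are made up for illustration.

```python
import torch

# Hypothetical batch: three sequences packed into one token dimension.
seq_len_tensor = torch.tensor([4, 6, 2])
lora_indices_tensor = torch.tensor([1, 0, 1])  # one adapter index per sequence

# Assumed meaning: start offset of each sequence within the packed tokens.
b_seq_start_loc = torch.cumsum(seq_len_tensor, dim=0) - seq_len_tensor

batches = seq_len_tensor.numel()           # number of sequences
max_seq_length = int(seq_len_tensor.max())  # longest sequence in the batch
token_nums = int(seq_len_tensor.sum())      # total packed tokens

# A segmented op can be reduced to a per-token (BGMV-style) op by repeating
# each sequence's adapter index once per token of that sequence.
per_token_indices = torch.repeat_interleave(lora_indices_tensor, seq_len_tensor)
```

Here `per_token_indices` has `token_nums` entries, so it could be passed wherever a per-token index tensor is expected.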
sgmv_expand_slice
sgmv_expand_slice(
inputs: Tensor,
lora_b_weights: Tensor,
output_tensor: Tensor,
b_seq_start_loc: Tensor,
seq_len_tensor: Tensor,
lora_indices_tensor: Tensor,
batches: int,
max_seq_length: int,
token_nums: int,
slice_offset: int,
slice_size: int,
add_inputs: bool = False,
)
sgmv_shrink
sgmv_shrink(
inputs: Tensor,
lora_a_weights: Tensor,
output_tensor: Tensor,
b_seq_start_loc: Tensor,
seq_len_tensor: Tensor,
lora_indices_tensor: Tensor,
batches: int,
max_seq_length: int,
token_nums: int,
scaling: float,
)
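As a rough illustration of the segmented form, the sketch below applies one down-projection per sequence by looping over segments, which should match a per-token gather when every token in a sequence uses the same adapter. `ref_sgmv_shrink` is a hypothetical helper, not the vLLM implementation; it assumes `lora_a_weights` has shape `[num_loras, rank, hidden]` and omits `max_seq_length` and `token_nums` because the loop does not need them.

```python
import torch


def ref_sgmv_shrink(inputs, lora_a_weights, output_tensor, b_seq_start_loc,
                    seq_len_tensor, lora_indices_tensor, batches, scaling):
    """Hypothetical reference: one matmul per sequence segment."""
    for i in range(batches):
        start = int(b_seq_start_loc[i])
        end = start + int(seq_len_tensor[i])
        # Every token in this segment shares one adapter: [rank, hidden]
        a = lora_a_weights[int(lora_indices_tensor[i])].to(output_tensor.dtype)
        segment = inputs[start:end].to(output_tensor.dtype)
        # [seq_len, hidden] @ [hidden, rank] -> [seq_len, rank]
        output_tensor[start:end] = scaling * (segment @ a.T)
    return output_tensor


torch.manual_seed(0)
seq_len = torch.tensor([3, 2])
start_loc = torch.tensor([0, 3])
lora_idx = torch.tensor([1, 0])
hidden, rank = 8, 2
x = torch.randn(5, hidden)
a = torch.randn(2, rank, hidden)
out = torch.zeros(5, rank)
ref_sgmv_shrink(x, a, out, start_loc, seq_len, lora_idx, batches=2, scaling=0.5)
```

The loop trades one gather over all tokens for one dense matmul per sequence, which is the usual motivation for a segmented layout when sequences are long.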