vllm.lora.fully_sharded_layers
ColumnParallelLinearWithShardedLoRA
¶
Bases: ColumnParallelLinearWithLoRA
Differs from ColumnParallelLinearWithLoRA by slicing the LoRA A's also.
Based on S-LoRA, slicing happens along the rank dim.
Source code in vllm/lora/fully_sharded_layers.py
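The S-LoRA-style rank-dimension slicing mentioned above can be sketched as follows. This is an illustrative stand-in, not vLLM's actual implementation; the shape convention `(lora_rank, input_dim)` and the function signature are assumptions.

```python
import torch

# Hedged sketch (not vLLM's actual code): slice a LoRA A matrix along
# the rank dimension, so each tensor-parallel shard holds only its
# contiguous slice of the rank dim and the A-side computation is
# fully sharded across ranks.

def slice_lora_a(lora_a: torch.Tensor, tp_rank: int, tp_size: int) -> torch.Tensor:
    """Return the rank-dim slice owned by tensor-parallel rank tp_rank."""
    shard_size = lora_a.shape[0] // tp_size  # rows of A per shard
    start = tp_rank * shard_size
    return lora_a[start:start + shard_size, :]

full_a = torch.randn(8, 16)  # lora_rank=8, input_dim=16
shards = [slice_lora_a(full_a, r, tp_size=4) for r in range(4)]
```

Concatenating the shards along the rank dim reconstructs the full A matrix, which is what makes this partitioning lossless.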
apply
¶
can_replace_layer
classmethod
¶
can_replace_layer(
source_layer: Module,
lora_config: LoRAConfig,
packed_modules_list: list,
model_config: Optional[PretrainedConfig],
) -> bool
Source code in vllm/lora/fully_sharded_layers.py
slice_lora_a
¶
Source code in vllm/lora/fully_sharded_layers.py
MergedColumnParallelLinearWithShardedLoRA
¶
Bases: MergedColumnParallelLinearWithLoRA
Differs from MergedColumnParallelLinearWithLoRA by slicing the LoRA A's also.
Based on S-LoRA, slicing happens along the rank dim.
Source code in vllm/lora/fully_sharded_layers.py
apply
¶
can_replace_layer
classmethod
¶
can_replace_layer(
source_layer: Module,
lora_config: LoRAConfig,
packed_modules_list: list,
model_config: Optional[PretrainedConfig],
) -> bool
Source code in vllm/lora/fully_sharded_layers.py
slice_lora_a
¶
Source code in vllm/lora/fully_sharded_layers.py
MergedQKVParallelLinearWithShardedLoRA
¶
Bases: MergedQKVParallelLinearWithLoRA
Differs from MergedQKVParallelLinearWithLoRA by slicing the LoRA A's also.
Based on S-LoRA, slicing happens along the rank dim.
Source code in vllm/lora/fully_sharded_layers.py
apply
¶
can_replace_layer
classmethod
¶
can_replace_layer(
source_layer: Module,
lora_config: LoRAConfig,
packed_modules_list: list,
model_config: Optional[PretrainedConfig],
) -> bool
Source code in vllm/lora/fully_sharded_layers.py
slice_lora_a
¶
Source code in vllm/lora/fully_sharded_layers.py
QKVParallelLinearWithShardedLoRA
¶
Bases: QKVParallelLinearWithLoRA
Differs from QKVParallelLinearWithLoRA by slicing the LoRA A's also.
Based on S-LoRA, slicing happens along the rank dim.
Source code in vllm/lora/fully_sharded_layers.py
apply
¶
can_replace_layer
classmethod
¶
can_replace_layer(
source_layer: Module,
lora_config: LoRAConfig,
packed_modules_list: list,
model_config: Optional[PretrainedConfig],
) -> bool
Source code in vllm/lora/fully_sharded_layers.py
slice_lora_a
¶
Source code in vllm/lora/fully_sharded_layers.py
RowParallelLinearWithShardedLoRA
¶
Bases: RowParallelLinearWithLoRA
Differs from RowParallelLinearWithLoRA by slicing the LoRA B's also.
Based on S-LoRA, slicing happens along the output dim. This yields a combined partial sum from the row parallel base layer and column partitioned output from the LoRA.
Source code in vllm/lora/fully_sharded_layers.py
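The output-dimension slicing of LoRA B described above can be sketched as follows. This is an illustrative stand-in, not vLLM's actual implementation; the shape convention `(lora_rank, output_dim)` and the function signature are assumptions.

```python
import torch

# Hedged sketch (not vLLM's actual code): for the row-parallel case,
# slice the LoRA B matrix along the output dimension, so each
# tensor-parallel shard produces a column partition of the LoRA
# output alongside the base layer's partial sum.

def slice_lora_b(lora_b: torch.Tensor, tp_rank: int, tp_size: int) -> torch.Tensor:
    """Return the output-dim slice owned by tensor-parallel rank tp_rank."""
    shard_size = lora_b.shape[1] // tp_size  # output columns per shard
    start = tp_rank * shard_size
    return lora_b[:, start:start + shard_size]

full_b = torch.randn(8, 32)  # lora_rank=8, output_dim=32
shards = [slice_lora_b(full_b, r, tp_size=4) for r in range(4)]
```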
apply
¶
Source code in vllm/lora/fully_sharded_layers.py
can_replace_layer
classmethod
¶
can_replace_layer(
source_layer: Module,
lora_config: LoRAConfig,
packed_modules_list: list,
model_config: Optional[PretrainedConfig],
) -> bool
Source code in vllm/lora/fully_sharded_layers.py
slice_bias
¶
Source code in vllm/lora/fully_sharded_layers.py
slice_lora_b
¶
Source code in vllm/lora/fully_sharded_layers.py
_fully_sharded_can_replace
¶
Decorator that adds the fully-sharded-LoRAs condition; intended to wrap can_replace_layer().
Source code in vllm/lora/fully_sharded_layers.py
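The decorator pattern can be sketched as below. This is a hedged illustration, not the actual vLLM code; `ToyConfig`, the keyword-only calling convention, and the `fully_sharded_loras` attribute name are assumptions for the sake of a self-contained example.

```python
from dataclasses import dataclass
from functools import wraps

# Illustrative stand-in for vllm.config.LoRAConfig.
@dataclass
class ToyConfig:
    fully_sharded_loras: bool = False

def _fully_sharded_can_replace(can_replace):
    """Hedged sketch: wrap a can_replace_layer-style predicate so it
    additionally requires that fully sharded LoRAs are enabled."""
    @wraps(can_replace)
    def dec(*args, **kwargs):
        return (can_replace(*args, **kwargs)
                and kwargs["lora_config"].fully_sharded_loras)
    return dec

@_fully_sharded_can_replace
def base_can_replace(*, lora_config):
    # Stand-in for an underlying can_replace_layer() check.
    return True

enabled = base_can_replace(lora_config=ToyConfig(fully_sharded_loras=True))
disabled = base_can_replace(lora_config=ToyConfig(fully_sharded_loras=False))
```

With fully sharded LoRAs enabled the wrapped predicate passes through; with it disabled the layer is never replaced, regardless of the base check.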
_mcp_apply
¶
_mcp_apply(x, bias, layer: ColumnParallelLinearWithLoRA)
ColumnParallelLinearWithLoRA and the classes that inherit from it share the same apply logic, which this helper implements.
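The shared apply pattern can be sketched as follows. This is a hedged illustration of the general shrink/expand LoRA computation, not the actual _mcp_apply implementation; the names, shapes, and `scaling` argument are assumptions.

```python
import torch

# Hedged sketch (not vLLM's _mcp_apply): the base layer's matmul
# output plus the LoRA delta x @ A^T @ B^T, scaled. The "shrink"
# projects into the low-rank space; the "expand" maps back to the
# output dimension.

def apply_with_lora(x, bias, base_weight, lora_a, lora_b, scaling=1.0):
    out = x @ base_weight.t()  # base ColumnParallelLinear matmul
    if bias is not None:
        out = out + bias
    shrink = x @ lora_a.t()                      # (tokens, lora_rank)
    out = out + scaling * (shrink @ lora_b.t())  # expand to output dim
    return out

x = torch.randn(2, 16)
w = torch.randn(32, 16)
a = torch.zeros(4, 16)  # zero LoRA A -> output equals the base layer
b = torch.randn(32, 4)
out = apply_with_lora(x, None, w, a, b)
```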