vllm.model_executor.layers.quantization.kernels.scaled_mm.ScaledMMLinearKernel
ScaledMMLinearKernel
Bases: ABC
Source code in vllm/model_executor/layers/quantization/kernels/scaled_mm/ScaledMMLinearKernel.py
__init__
__init__(
    c: ScaledMMLinearLayerConfig,
    w_q_param_name: str,
    w_s_param_name: str,
    i_s_param_name: str,
    i_zp_param_name: str,
    azp_adj_param_name: str,
) -> None
_get_weight_params
_get_weight_params(
    layer: Module,
) -> tuple[
    Tensor,
    Tensor,
    Optional[Tensor],
    Optional[Tensor],
    Optional[Tensor],
]
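The constructor takes five parameter names and `_get_weight_params` returns a five-element tuple, which suggests that each name is resolved to an attribute on `layer`. The sketch below illustrates that pattern only; it is not the vLLM implementation. `NameLookupSketch` is a hypothetical stand-in class, plain lists stand in for tensors so the example runs without PyTorch, and treating a missing attribute as `None` is an assumption made here to mirror the `Optional[Tensor]` return slots.

```python
from types import SimpleNamespace
from typing import Any, Optional


class NameLookupSketch:
    """Hypothetical stand-in for illustration; not the vLLM class."""

    def __init__(
        self,
        w_q_param_name: str,
        w_s_param_name: str,
        i_s_param_name: str,
        i_zp_param_name: str,
        azp_adj_param_name: str,
    ) -> None:
        # Keep the names in the same order as the returned tuple.
        self.names = (
            w_q_param_name,
            w_s_param_name,
            i_s_param_name,
            i_zp_param_name,
            azp_adj_param_name,
        )

    def _get_weight_params(
        self, layer: Any
    ) -> tuple[Any, Any, Optional[Any], Optional[Any], Optional[Any]]:
        # Assumption: an attribute that is absent on the layer maps to None.
        return tuple(getattr(layer, name, None) for name in self.names)


# A fake "layer" carrying only a quantized weight and its scale.
layer = SimpleNamespace(weight=[[1, 2], [3, 4]], weight_scale=[0.5, 0.25])
kernel = NameLookupSketch(
    "weight", "weight_scale", "input_scale", "input_zero_point", "azp_adj"
)
w_q, w_s, i_s, i_zp, azp_adj = kernel._get_weight_params(layer)
```

Storing names rather than tensors lets one kernel class work with layers that register their parameters under different attribute names.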
apply_weights (abstractmethod)
can_implement (abstractmethod, classmethod)
can_implement(
    c: ScaledMMLinearLayerConfig,
) -> tuple[bool, Optional[str]]
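Because `can_implement` is a classmethod returning `tuple[bool, Optional[str]]`, a caller can probe candidate kernel classes without instantiating them and collect the reason each one declines. The following is a minimal sketch of that usage under stated assumptions: `SketchConfig`, its `static_input` field, `SketchKernel`, `DynamicOnlyKernel`, and `pick_kernel` are all hypothetical names introduced for illustration and are not part of vLLM. Reading the second tuple element as a human-readable failure reason is also an assumption.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchConfig:
    # Hypothetical field; the real config's fields are not shown here.
    static_input: bool


class SketchKernel(ABC):
    """Hypothetical base class mirroring the documented signature."""

    @classmethod
    @abstractmethod
    def can_implement(cls, c: SketchConfig) -> tuple[bool, Optional[str]]:
        ...


class DynamicOnlyKernel(SketchKernel):
    @classmethod
    def can_implement(cls, c: SketchConfig) -> tuple[bool, Optional[str]]:
        # Return (False, reason) when unsupported, (True, None) otherwise.
        if c.static_input:
            return False, "static input quantization is not supported"
        return True, None


def pick_kernel(
    c: SketchConfig, candidates: list[type[SketchKernel]]
) -> type[SketchKernel]:
    """Return the first candidate class that accepts the config."""
    reasons = []
    for kernel_cls in candidates:
        ok, why = kernel_cls.can_implement(c)
        if ok:
            return kernel_cls
        reasons.append(f"{kernel_cls.__name__}: {why}")
    raise ValueError("no kernel can implement this config: " + "; ".join(reasons))


chosen = pick_kernel(SketchConfig(static_input=False), [DynamicOnlyKernel])
```

Returning a reason alongside the boolean lets the selection loop report why every candidate was rejected instead of failing with a generic error.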