vllm.model_executor.layers.quantization.quark.schemes.quark_scheme
QuarkScheme
Bases: ABC
Abstract class used to describe the weight creation and forward pass of different quantization schemes supported by Quark.
Source code in vllm/model_executor/layers/quantization/quark/schemes/quark_scheme.py
apply_weights
abstractmethod
Run the forward pass for the particular scheme. This is where scheme-specific dequantization/quantization steps and kernels should be applied.
:param layer: torch.nn.Module with the registered weights and other parameters relevant to the particular scheme.
:param x: input to the layer
:param bias: bias parameter
Source code in vllm/model_executor/layers/quantization/quark/schemes/quark_scheme.py
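The division of labor between the two abstract methods can be illustrated with a minimal, dependency-free sketch. It does not use vLLM or torch: `SchemeSketch` is a hypothetical stand-in for the abstract interface, `ToyInt8Scheme` is an invented scheme (symmetric per-tensor int8 with on-the-fly dequantization), the `weight` argument to `create_weights` is an assumption made for illustration, and the layer is a plain `SimpleNamespace` with Python lists in place of tensors.

```python
from abc import ABC, abstractmethod
from types import SimpleNamespace


class SchemeSketch(ABC):
    """Hypothetical stand-in for the abstract interface (not vLLM's class)."""

    @abstractmethod
    def create_weights(self, layer, weight, **kwargs):
        """Register scheme-specific parameters on ``layer``."""

    @abstractmethod
    def apply_weights(self, layer, x, bias=None):
        """Run the forward pass using the parameters stored on ``layer``."""


class ToyInt8Scheme(SchemeSketch):
    """Invented example: symmetric per-tensor int8 weights."""

    def create_weights(self, layer, weight, **kwargs):
        # weight is a list of rows (out_features x in_features) of floats.
        max_abs = max(abs(w) for row in weight for w in row)
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        # Store the quantized weights and their scale on the layer.
        layer.weight_scale = scale
        layer.qweight = [[round(w / scale) for w in row] for row in weight]

    def apply_weights(self, layer, x, bias=None):
        # Dequantize each weight on the fly, then do a matrix-vector product.
        out = []
        for i, row in enumerate(layer.qweight):
            acc = sum(q * layer.weight_scale * xi for q, xi in zip(row, x))
            if bias is not None:
                acc += bias[i]
            out.append(acc)
        return out


layer = SimpleNamespace()  # stand-in for a torch.nn.Module
scheme = ToyInt8Scheme()
scheme.create_weights(layer, [[1.0, -0.5], [0.25, 2.0]])
y = scheme.apply_weights(layer, x=[1.0, 1.0], bias=[0.0, 1.0])
print(y)  # close to [0.5, 3.25], up to int8 rounding error
```

Keeping all scheme state on the layer means the scheme object itself can stay stateless, so `apply_weights` needs only the layer, the input, and the optional bias.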
create_weights
abstractmethod
Weight creation for the particular scheme. Inputs to this function