vllm.model_executor.layers.quantization.utils.marlin_utils_test
Utility functions used for tests and benchmarks
MarlinWorkspace
Source code in vllm/model_executor/layers/quantization/utils/marlin_utils_test.py
__init__
awq_marlin_quantize
awq_marlin_quantize(
    w: Tensor, quant_type: ScalarType, group_size: int
)
get_weight_perm
get_weight_perm(num_bits: int)
marlin_permute_weights
marlin_permute_weights(
    q_w, size_k, size_n, perm, tile=GPTQ_MARLIN_TILE
)
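The signature above takes a quantized weight matrix `q_w` of shape `(size_k, size_n)`, a tile size, and an index permutation `perm`. This page does not show the function body, so the following is only a minimal pure-Python sketch of one plausible reading: regroup the matrix so that each `tile x tile` block is contiguous, then apply `perm` to consecutive chunks of `len(perm)` values. The helper names `tile_regroup` and `apply_perm` are hypothetical and are not part of vLLM.

```python
def tile_regroup(q_w, size_k, size_n, tile):
    """Regroup a size_k x size_n matrix so each tile x tile block is contiguous.

    Hypothetical helper for illustration only; not vLLM's implementation.
    """
    assert len(q_w) == size_k and all(len(row) == size_n for row in q_w)
    assert size_k % tile == 0 and size_n % tile == 0
    out = []
    for bk in range(size_k // tile):        # block row
        row_out = []
        for bn in range(size_n // tile):    # block column
            for r in range(tile):           # row inside the block
                for c in range(tile):       # column inside the block
                    row_out.append(q_w[bk * tile + r][bn * tile + c])
        out.append(row_out)
    return out  # shape: (size_k // tile) x (size_n * tile)


def apply_perm(rows, perm):
    """Apply an index permutation to every consecutive chunk of len(perm) values."""
    n = len(perm)
    flat = [v for row in rows for v in row]
    assert len(flat) % n == 0
    permuted = []
    for start in range(0, len(flat), n):
        chunk = flat[start:start + n]
        permuted.extend(chunk[i] for i in perm)
    width = len(rows[0])
    # Restore the row layout the input had.
    return [permuted[i:i + width] for i in range(0, len(permuted), width)]


# Toy 4 x 4 matrix with tile=2 and a 4-element permutation (assumed values).
q_w = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
tiled = tile_regroup(q_w, size_k=4, size_n=4, tile=2)
result = apply_perm(tiled, perm=[1, 0, 3, 2])
```

In this toy run each output row holds the two 2 x 2 blocks of one block row, and the permutation swaps neighbouring values inside every chunk of four.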
marlin_quantize
marlin_quantize(
    w: Tensor,
    quant_type: ScalarType,
    group_size: int,
    act_order: bool,
    test_perm: Optional[Tensor] = None,
)
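The `group_size` parameter suggests group-wise quantization, where each run of `group_size` rows along the K dimension shares one scale per output column. The function body is not shown on this page, so the block below is a minimal sketch of that general technique in plain Python, assuming symmetric quantization to signed integers. The name `quantize_groupwise` and its `num_bits` argument are hypothetical, and `act_order` and `test_perm` are not modeled.

```python
def quantize_groupwise(w, num_bits, group_size):
    """Symmetric group-wise quantization of a K x N matrix along K.

    Illustrative sketch only; not vLLM's implementation.
    Returns (q, scales), where q holds integers in
    [-(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1] and scales has one
    row per group of `group_size` input rows.
    """
    size_k, size_n = len(w), len(w[0])
    assert size_k % group_size == 0
    max_q = 2 ** (num_bits - 1) - 1
    q = [[0] * size_n for _ in range(size_k)]
    scales = []
    for g in range(size_k // group_size):
        rows = range(g * group_size, (g + 1) * group_size)
        group_scales = []
        for n in range(size_n):
            # One scale per (group, column), chosen so the largest
            # magnitude in the group maps to max_q.
            max_abs = max(abs(w[k][n]) for k in rows)
            scale = max_abs / max_q if max_abs > 0 else 1.0
            group_scales.append(scale)
            for k in rows:
                v = round(w[k][n] / scale)
                q[k][n] = max(-max_q - 1, min(max_q, v))  # clamp to range
        scales.append(group_scales)
    return q, scales


# Toy 4 x 2 weight matrix, 4-bit values, two rows per group (assumed values).
w = [[7.0, -14.0], [3.0, 4.0], [3.5, 0.0], [-1.0, 0.0]]
q, scales = quantize_groupwise(w, num_bits=4, group_size=2)
```

Multiplying each quantized value by its group's scale gives the dequantized weight, which is the usual reference to compare a quantized kernel against in tests.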