vllm.model_executor.layers.quantization.utils.machete_utils
check_machete_supports_shape
Source code in vllm/model_executor/layers/quantization/utils/machete_utils.py
query_machete_supported_act_types
query_machete_supported_act_types(
zero_points: bool,
) -> list[ScalarType]
query_machete_supported_group_sizes
Queries the supported group sizes for Machete based on the activation type.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
act_type | dtype | The activation data type (torch.float16, torch.bfloat16). | required |
Returns:
Type | Description |
---|---|
list[int] | A list of supported group sizes. The group size must be divisible by […]. -1 indicates per-channel quantization. |
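As a rough illustration of how the returned list might be consumed, the sketch below checks a candidate group size against a list of supported sizes and treats -1 as the per-channel sentinel. The helpers `is_group_size_supported` and `describe_group_size`, the constant `PER_CHANNEL`, and the values in `example_supported` are hypothetical and not part of vLLM; in real code the list would come from `query_machete_supported_group_sizes(act_type)`.

```python
from typing import Sequence

# Sentinel described in the Returns table: -1 means per-channel quantization.
PER_CHANNEL = -1


def is_group_size_supported(group_size: int, supported: Sequence[int]) -> bool:
    """Return True if `group_size` appears in the list of supported sizes."""
    return group_size in supported


def describe_group_size(group_size: int) -> str:
    """Return a readable label, treating -1 as per-channel quantization."""
    if group_size == PER_CHANNEL:
        return "per-channel"
    return f"group size {group_size}"


# Placeholder values for illustration only (an assumption, not vLLM's actual
# output); substitute the result of query_machete_supported_group_sizes(...).
example_supported = [-1, 128]

for candidate in (-1, 32, 128):
    status = "supported" if is_group_size_supported(candidate, example_supported) else "unsupported"
    print(f"{describe_group_size(candidate)}: {status}")
```

Keeping the check separate from the query makes it easy to validate a quantization config before any kernel is selected, without depending on a particular activation type.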
Source code in vllm/model_executor/layers/quantization/utils/machete_utils.py
query_machete_supported_quant_types
query_machete_supported_quant_types(
zero_points: bool,
) -> list[ScalarType]