vllm.model_executor.layers.quantization.compressed_tensors.utils
_find_first_match
_find_first_match(
value: str,
targets: Iterable[str],
check_contains: bool = False,
) -> Optional[str]
Returns the first element of targets that matches value, either exactly or, for a target prefixed with 're:', as a regex. If check_contains is set to True, additionally checks whether the target string is contained within value.
:param value: string to compare the list of targets against
:param targets: list of targets to match the layer against
:param check_contains: whether or not to do a substring match
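The "first match wins" behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the vLLM source: `find_first_match` is a hypothetical stand-in, the regex is assumed to be applied with `re.match`, and the substring check is assumed to be case-sensitive.

```python
import re
from collections.abc import Iterable
from typing import Optional


def find_first_match(
    value: str,
    targets: Iterable[str],
    check_contains: bool = False,
) -> Optional[str]:
    """Hypothetical stand-in: return the first target that matches value."""
    for target in targets:
        if target.startswith("re:"):
            # Assumption: the pattern after 're:' is anchored at the start.
            matched = re.match(target[3:], value) is not None
        else:
            # Assumption: the substring check is case-sensitive.
            matched = target == value or (check_contains and target in value)
        if matched:
            return target
    return None


targets = ["re:.*self_attn.*", "re:.*q_proj", "Linear"]

# Two regex targets match this name; the earlier one is returned.
by_order = find_first_match("model.layers.0.self_attn.q_proj", targets)

# A class-like name only matches "Linear" when substring matching is enabled.
without_contains = find_first_match("QKVParallelLinear", targets)
with_contains = find_first_match("QKVParallelLinear", targets, check_contains=True)
```

Because the result depends on iteration order, the order of the targets decides which one is reported when several of them match.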
Source code in vllm/model_executor/layers/quantization/compressed_tensors/utils.py
_is_equal_or_regex_match
Checks whether value is exactly equal to target or, if target starts with 're:', whether value matches the regex that follows the prefix. If check_contains is set to True, additionally checks whether the target string is contained within value.
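A minimal sketch of the three comparison modes for a single target is shown below. The name `equal_or_regex_match` is hypothetical, and two details are assumptions the description does not settle: whether the regex is anchored (here `re.match`, which anchors at the start only) and whether the substring check is case-sensitive (here it is).

```python
import re


def equal_or_regex_match(value: str, target: str, check_contains: bool = False) -> bool:
    """Illustrative stand-in, not the vLLM implementation."""
    if target.startswith("re:"):
        # Assumption: the pattern is applied from the start of value.
        return re.match(target[3:], value) is not None
    if check_contains and target in value:
        # Assumption: plain, case-sensitive substring check.
        return True
    return target == value


regex_hit = equal_or_regex_match("model.layers.0.mlp.down_proj", "re:.*down_proj")
exact_miss = equal_or_regex_match("ColumnParallelLinear", "Linear")
contains_hit = equal_or_regex_match("ColumnParallelLinear", "Linear", check_contains=True)
```

In this sketch `regex_hit` and `contains_hit` are true while `exact_miss` is false, since "Linear" is only a substring of the class-like name.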
_match_fused_layer
_match_fused_layer(
layer_name: str,
target_layers: Iterable[str],
fused_mapping: Mapping[str, list[str]],
) -> Optional[str]
Match a fused layer name to its corresponding individual layers in target_layers. Returns the first value in fused_mapping that matches the targets.
Implements an "all" matching strategy, where a fused layer matches if and only if all of its components match.
:param layer_name: layer name
:param target_layers: list of targets to match the layer against
:param fused_mapping: map from fused layer names to their components
Examples:
layer_name = "model.layers.0.self_attn.qkv_proj"
target_layers = ["model.layers.0.self_attn.q_proj", "model.layers.0.self_attn.k_proj", "model.layers.0.self_attn.v_proj"]
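The example above can be traced with a short sketch of the "all" strategy. Everything here is an assumption for illustration: `match_fused_layer` is a hypothetical stand-in, the `fused_mapping` keys are assumed to be suffixes of the fused layer name (for example `"qkv_proj"`), and component names are assumed to be built by swapping that suffix.

```python
import re
from collections.abc import Iterable, Mapping
from typing import Optional


def _matches(value: str, target: str) -> bool:
    # Exact match, or regex match when the target starts with 're:'.
    if target.startswith("re:"):
        return re.match(target[3:], value) is not None
    return value == target


def match_fused_layer(
    layer_name: str,
    target_layers: Iterable[str],
    fused_mapping: Mapping[str, list[str]],
) -> Optional[str]:
    """Hypothetical sketch of 'all components must match'."""
    targets = list(target_layers)
    # Assumption: mapping keys are suffixes such as "qkv_proj".
    fused = next((key for key in fused_mapping if layer_name.endswith(key)), None)
    if fused is None:
        return None
    prefix = layer_name.removesuffix(fused)
    hits = []
    for component in fused_mapping[fused]:
        hit = next((t for t in targets if _matches(prefix + component, t)), None)
        if hit is None:
            return None  # one missing component fails the whole match
        hits.append(hit)
    # A successful match reports the target of the first component.
    return hits[0] if hits else None


mapping = {"qkv_proj": ["q_proj", "k_proj", "v_proj"]}
layer = "model.layers.0.self_attn.qkv_proj"
all_targets = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.self_attn.v_proj",
]

full = match_fused_layer(layer, all_targets, mapping)
partial = match_fused_layer(layer, all_targets[:2], mapping)
```

With all three component targets present, `full` is the `q_proj` target; dropping the `v_proj` target makes `partial` come back as `None`.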
check_equal_or_regex_match
Checks whether layer_name is exactly equal to any target in the list or, for a target that starts with 're:', whether it matches that target as a regex.
find_matched_target
find_matched_target(
layer_name: Optional[str],
module: Module,
targets: Iterable[str],
fused_mapping: Mapping[
str, list[str]
] = MappingProxyType({}),
) -> str
Helper function to look up which "target" in the compressed-tensors config a layer corresponds to.
Recall that a compressed-tensors config has a concept of config_groups, where each layer can be quantized with a different scheme.
The targets in each config_group are a list of layer names (or regexes corresponding to layer names) or names of torch Modules.
First, we try to match the layer_name with a target.
Second, we try to match the module's name with a target.
Third, we try to map the layer_name to a list of fused module names. All component module names must match in order for a match to be successful. A successful match returns the first component target.
:param layer_name: layer name
:param module: torch.nn.Module
:param targets: list of targets to match the layer against
:param fused_mapping: map from fused layer names to their components
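The three-step lookup order can be sketched as a fallback chain. This is an approximation under several assumptions, not the actual function: `find_target` is hypothetical and takes a class name string instead of a torch Module so that it runs without torch, the module-name step is assumed to allow substring matches, and raising `ValueError` when nothing matches is assumed from the non-optional `str` return type.

```python
import re
from collections.abc import Iterable, Mapping
from types import MappingProxyType
from typing import Optional


def _first(value: str, targets: list[str], contains: bool = False) -> Optional[str]:
    # Return the first target equal to value, matching as 're:' regex,
    # or (optionally) contained in value.
    for target in targets:
        if target.startswith("re:"):
            ok = re.match(target[3:], value) is not None
        else:
            ok = target == value or (contains and target in value)
        if ok:
            return target
    return None


def _fused(name: str, targets: list[str], fused_mapping: Mapping[str, list[str]]) -> Optional[str]:
    # Assumption: mapping keys are suffixes of the fused layer name.
    for fused, components in fused_mapping.items():
        if name.endswith(fused):
            prefix = name.removesuffix(fused)
            hits = [_first(prefix + comp, targets) for comp in components]
            return hits[0] if hits and None not in hits else None
    return None


def find_target(
    layer_name: Optional[str],
    module_class_name: str,  # hypothetical replacement for the module argument
    targets: Iterable[str],
    fused_mapping: Mapping[str, list[str]] = MappingProxyType({}),
) -> str:
    targets = list(targets)
    name = layer_name or ""
    matched = (
        _first(name, targets)                                  # 1. layer name
        or _first(module_class_name, targets, contains=True)   # 2. module name
        or _fused(name, targets, fused_mapping)                # 3. fused components
    )
    if matched is None:
        # Assumption: an unmatched layer is treated as an error.
        raise ValueError(f"no target found for {layer_name!r}")
    return matched


targets = ["re:.*down_proj", "Linear"]
by_name = find_target("model.layers.0.mlp.down_proj", "RowParallelLinear", targets)
by_module = find_target("model.layers.0.mlp.up_proj", "ColumnParallelLinear", targets)
```

Here `by_name` resolves through the layer-name regex, while `by_module` falls through to the module-name step and resolves to "Linear".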
is_activation_quantization_format
should_ignore_layer
should_ignore_layer(
layer_name: Optional[str],
ignore: Iterable[str] = tuple(),
fused_mapping: Mapping[
str, list[str]
] = MappingProxyType({}),
) -> bool