vllm.utils.serial_utils
EMBED_DTYPES module-attribute
EMBED_DTYPES: Mapping[EmbedDType, DTypeInfo] = {
    "float32": DTypeInfo(float32, float32, float32),
    "float16": DTypeInfo(float16, float16, float16),
    "bfloat16": DTypeInfo(bfloat16, float16, float16),
    "fp8_e4m3": DTypeInfo(float8_e4m3fn, uint8, uint8),
    "fp8_e5m2": DTypeInfo(float8_e5m2, uint8, uint8),
}
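As a rough illustration of what the mapping above implies for payload sizes, the sketch below computes the expected number of raw bytes for a tensor of a given shape. The `ITEM_SIZE` table and the `payload_nbytes` helper are hypothetical names introduced here, not part of `vllm.utils.serial_utils`; the element widths are inferred from the dtype names (32-bit, 16-bit, and 8-bit formats), on the assumption that the raw bytes carry one element per listed width.

```python
import math

# Assumed bytes per element for each embed dtype name in the mapping:
# float32 is 4 bytes, the 16-bit floats are 2 bytes, and the fp8
# variants are 1 byte (the mapping pairs them with uint8).
ITEM_SIZE = {
    "float32": 4,
    "float16": 2,
    "bfloat16": 2,
    "fp8_e4m3": 1,
    "fp8_e5m2": 1,
}


def payload_nbytes(shape, embed_dtype):
    """Hypothetical helper: expected raw payload size for a tensor of `shape`."""
    return math.prod(shape) * ITEM_SIZE[embed_dtype]


# Payload size of a (2, 1024) embedding batch under each dtype name.
sizes = {name: payload_nbytes((2, 1024), name) for name in ITEM_SIZE}
```

A check like this can help a client validate the length of a received byte payload before decoding it.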
EmbedDType module-attribute
EmbedDType = Literal[
    "float32", "float16", "bfloat16", "fp8_e4m3", "fp8_e5m2"
]
EncodingFormat module-attribute
EncodingFormat = Literal[
    "float", "base64", "bytes", "bytes_only"
]
DTypeInfo dataclass
Source code in vllm/utils/serial_utils.py
binary2tensor
binary2tensor(
    binary: bytes,
    shape: tuple[int, ...],
    embed_dtype: EmbedDType,
    endianness: Endianness,
) -> Tensor
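The signature suggests that decoding needs the raw bytes, the target shape, and the byte order. The following standard-library sketch shows that idea for float32 data only, returning nested lists instead of a `Tensor`. It is not the vLLM implementation: `binary_to_nested` is a hypothetical name, and the endianness values `"native"`, `"little"`, and `"big"` are an assumption about what `Endianness` accepts.

```python
import math
import struct

# Assumed endianness names mapped to struct byte-order prefixes.
BYTE_ORDER = {"native": "=", "little": "<", "big": ">"}


def binary_to_nested(binary, shape, endianness="little"):
    """Hypothetical decoder: float32 bytes -> nested lists of the given shape."""
    count = math.prod(shape)
    if len(binary) != 4 * count:
        raise ValueError("payload length does not match shape")
    flat = list(struct.unpack(f"{BYTE_ORDER[endianness]}{count}f", binary))
    # Rebuild row-major nesting, starting from the innermost dimension.
    for dim in reversed(shape[1:]):
        flat = [flat[i:i + dim] for i in range(0, len(flat), dim)]
    return flat


payload = struct.pack("<6f", 1, 2, 3, 4, 5, 6)
rows = binary_to_nested(payload, (2, 3), "little")
```

Passing the shape separately matters because a flat byte string carries no dimension information of its own.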
tensor2base64
tensor2binary
tensor2binary(
    tensor: Tensor,
    embed_dtype: EmbedDType,
    endianness: Endianness,
) -> bytes
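To make the role of the `endianness` argument concrete, here is a minimal standard-library sketch that packs a flat list of float32 values into bytes under an explicit byte order. It is an illustration under stated assumptions, not the vLLM code: `floats_to_binary` is a hypothetical name, only float32 is handled, and the endianness values `"native"`, `"little"`, and `"big"` are assumed.

```python
import struct

# Assumed endianness names mapped to struct byte-order prefixes.
BYTE_ORDER = {"native": "=", "little": "<", "big": ">"}


def floats_to_binary(values, endianness="little"):
    """Hypothetical encoder: flat float32 values -> raw bytes."""
    return struct.pack(f"{BYTE_ORDER[endianness]}{len(values)}f", *values)


# 1.0 as float32 is 0x3F800000; the byte order decides how it is laid out.
little = floats_to_binary([1.0], "little")
big = floats_to_binary([1.0], "big")
```

The same value yields reversed byte sequences under the two orders, so sender and receiver need to agree on the endianness for the payload to decode correctly.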