vllm.model_executor.layers.quantization.turboquant.quantizer
TurboQuant quantizer utilities.
Triton kernels handle all quantization, packing, and dequantization on GPU.
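The three stages named above (quantization, packing, dequantization) can be illustrated with a minimal CPU sketch in plain Python. This is not the TurboQuant algorithm and not the vLLM API: it assumes a generic uniform 4-bit scheme with a single scale and offset, and every function name below (`quantize_4bit`, `pack_4bit`, `unpack_4bit`, `dequantize_4bit`) is hypothetical, chosen only for illustration. The real module performs these steps in Triton kernels on the GPU.

```python
from typing import List, Tuple


def quantize_4bit(values: List[float]) -> Tuple[List[int], float, float]:
    """Uniformly map floats to integer codes in [0, 15] (illustrative scheme)."""
    lo, hi = min(values), max(values)
    # Guard against a constant input, where the range would be zero.
    scale = (hi - lo) / 15 if hi > lo else 1.0
    codes = [min(15, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo


def pack_4bit(codes: List[int]) -> bytes:
    """Pack two 4-bit codes into each byte, low nibble first."""
    if len(codes) % 2:
        codes = codes + [0]  # pad to an even count; caller tracks the true length
    return bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))


def unpack_4bit(packed: bytes, count: int) -> List[int]:
    """Recover the first `count` 4-bit codes from packed bytes."""
    codes: List[int] = []
    for byte in packed:
        codes.append(byte & 0x0F)  # low nibble
        codes.append(byte >> 4)    # high nibble
    return codes[:count]


def dequantize_4bit(codes: List[int], scale: float, lo: float) -> List[float]:
    """Map integer codes back to approximate float values."""
    return [c * scale + lo for c in codes]


values = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, scale, lo = quantize_4bit(values)
packed = pack_4bit(codes)
restored = dequantize_4bit(unpack_4bit(packed, len(values)), scale, lo)
```

Packing halves the storage relative to one byte per code, and the round trip reproduces each input to within half a quantization step. A GPU kernel would apply the same per-element arithmetic in parallel across blocks of the tensor.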