Commit 8aa7889

committed
BUG FIX: HQQ quantization would error out if torch_dtype (dataType) was set to auto; it now force-sets torch_dtype to torch.bfloat16
1 parent 086b9d0 commit 8aa7889

File tree

1 file changed: +2 -1 lines changed

web_app/hf_waitress.py

Lines changed: 2 additions & 1 deletion
@@ -568,7 +568,8 @@ def initialize_model():
         quantization_config = QuantoConfig(weights="int4")
         model_params["quantization_config"] = quantization_config
     elif quantize == "hqq":
-        print("HQQ-Quantizing")
+        print("HQQ-Quantizing - Force-setting torch_dtype to torch.bfloat16")
+        model_params["torch_dtype"] = torch.bfloat16
         quant_level = quant_level.lower().strip()

         if quant_level == "int8":
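
For context, here is a minimal, self-contained sketch of the behavior this fix produces, not the repository's exact code: it assumes the model is loaded through transformers' AutoModelForCausalLM and that the HQQ settings are built with transformers.HqqConfig (which requires the hqq package). The names model_params, quantize, and quant_level mirror those visible in the diff, while load_hqq_model, model_id, and the nbits/group_size values are illustrative assumptions. The key point is that torch_dtype is overwritten with torch.bfloat16 before from_pretrained() runs, so a dtype of "auto" can no longer reach the HQQ path.

```python
# Sketch only: names and parameters not shown in the diff are assumptions.
import torch
from transformers import AutoModelForCausalLM, HqqConfig  # HqqConfig requires the `hqq` package


def load_hqq_model(model_id: str, quantize: str, quant_level: str, model_params: dict):
    """Illustrative loader mirroring the fix: force bfloat16 before HQQ quantization."""
    if quantize == "hqq":
        print("HQQ-Quantizing - Force-setting torch_dtype to torch.bfloat16")
        # The fix: HQQ cannot handle torch_dtype="auto", so overwrite it unconditionally.
        model_params["torch_dtype"] = torch.bfloat16

        quant_level = quant_level.lower().strip()
        if quant_level == "int8":
            model_params["quantization_config"] = HqqConfig(nbits=8, group_size=64)
        else:
            # Assumed default for this sketch: 4-bit HQQ.
            model_params["quantization_config"] = HqqConfig(nbits=4, group_size=64)

    return AutoModelForCausalLM.from_pretrained(model_id, **model_params)
```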
