update

ri938 · ri938 · commit 5bd5ed660c3a · 2023-08-14T20:03:31.000Z
diff --git a/vllm/model_executor/models/llama.py b/vllm/model_executor/models/llama.py
@@ -147,7 +147,7 @@ def get_quantized_layer(in_features, out_features, quant_config):
         in_features=in_features,
         out_features=out_features,
         bias=None,
-        dev=0  ## TODO: fix this
+        dev=0  ## TODO: fix this without large spike in memory
     )
     return layer
 
