[Test Fix] Add Quantization then finetune tests (#964)
~~Contingent on merge of huggingface/transformers#34719~~
(now merged upstream, but not yet in a release)
SUMMARY:
Adds a test that:
* Given a model, applies oneshot quantization, then runs PTQ finetuning (training).
* The model must be loaded with `run_compressed=False` for this to run.
Note:
* When running finetune on an already optimized (one-shotted) model, the
model must be decompressed explicitly using
`CompressedTensorsConfig`. See
https://github.com/vllm-project/llm-compressor/pull/964/files#diff-e480ed475c0a5b2beb4052c1dd2aca671999634ace41a5ea017fdff1ce68be0bR130-R135
* Tests pass on 2x H100s
Also fixes a bug in `log_sparsification` where an unrecognized layer name
caused a failure. Since nothing is being sparsified here, the parameter
count is set to zero.
TEST PLAN:
Ran the test using transformers `main`;
tests/llmcompressor/transformers/finetune/test_oneshot_then_finetune.py
must pass.
---------
Co-authored-by: Dipika Sikka <[email protected]>