Migrate QAT API; fix axolotl quantize for QAT-ed models; add NVFP4 (#3107)
Merged
Commits (79)
All 79 commits authored by SalmanMohammadi:

d355954 migrating QAT API
dd718b8 updating tests
1003432 updating cli
9f9ef8c updating cli
75ed197 Merge branch 'main' into qat_migration
6166dbf adding quant config
ed9ef69 updating APIs
c12d131 Merge branch 'main' into qat_migration
563c200 linting
de9b10f Merge branch 'qat_migration' of github.com:axolotl-ai-cloud/axolotl i…
218aa40 fixing tests
ddeba5a updating ptqconfig
17b6051 updating quantization.py
8b3e550 linting
450b92f bump ao
d4f5f5a bump ao
1cada45 bump ao
b218c7c bump ao
4345ae6 bump ao
72f7820 fix language
f4b7c26 comments
38ac691 comments
900d4a1 adding nvfp4
5b1f478 updating tests
80fb7da fix dtype
47eb791 fix dtype
b0ccde1 updating tests
84f6889 fixing accelerator
c3f4048 fixing config
029734b adding support for push to hub in quantize
78668b5 linting
5c095c3 Merge branch 'main' into qat_migration
e2f5dd5 updating nvfp4 config
f6ec879 Merge branch 'qat_migration' of github.com:axolotl-ai-cloud/axolotl i…
5bc768a disable safetensors for push to hub
154315f force config on push to hub
a5ecc05 log
ae7d876 cli hub_model_id
225e1d8 cli hub_model_id
fac195d adding quant strs
5a49579 adding quant strs
33dc44c adding quant strs
9932b4f fix quant_type kwarg
4a07a17 tkps
ca1a0b7 updating conf
94554a1 linting
a1a3d14 Merge branch 'main' of github.com:axolotl-ai-cloud/axolotl into qat_m…
d8a8c75 dont need to specify model config
a0ff954 comments
9a288c7 adding more aliases
0ae60e1 fixing fbgemm import [skip-e2e]
bfea773 updating gpu runner step
c7bb62d updating install command
8da2c6b disable default include_tkps
766677c linting
c17ca49 trying extras
3945aaa 2.8 only
4392627 guard int4weightonly import
8558ec9 guard int4weightonly import
1ffac1f only import on 2.8
e62e637 stray comma
6a8ed51 only attempt install on 2.8
54bbc30 Merge branch 'main' into qat_migration
43f7eb1 fix tests
d223aeb fix test case
13b51f9 fixing tests for b200s
2311bc5 comments
ef53534 fixing tests
835d030 Merge branch 'main' into qat_migration
0e38530 fix test
3faf7bd Merge branch 'qat_migration' of github.com:axolotl-ai-cloud/axolotl i…
f35c2f9 Merge branch 'main' into qat_migration
0986ff0 fix group size defaults
c284726 Merge branch 'qat_migration' of github.com:axolotl-ai-cloud/axolotl i…
320a722 comments
303439e tests
6e455bd Merge branch 'main' into qat_migration
23f0895 removing int4fp8 case
c4f8e26 lint
New example config added by this PR (73 lines): NVFP4 QAT fine-tuning of Llama-3.2-3B.

```yaml
base_model: meta-llama/Llama-3.2-3B
# Automatically upload checkpoint and final model to HF
# hub_model_id: username/custom_model_name

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:
  - path: yahma/alpaca-cleaned
    type: alpaca
    split: train[:95%]

output_dir: ./outputs/qat_out/
dataset_prepared_path: ./outputs/qat_out/dataset_prepared

sample_packing: true
sequence_len: 8192

flex_attention: true
flex_attn_compile_kwargs:
  dynamic: false
  mode: max-autotune-no-cudagraphs

qat:
  activation_dtype: nvfp4
  weight_dtype: nvfp4
  group_size: 16 # only group_size of 16 is supported with nvfp4

wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

gradient_accumulation_steps: 1
micro_batch_size: 16
num_epochs: 1
optimizer: adamw_torch_fused

cosine_constant_lr_ratio: 0
cosine_min_lr_ratio: 1.0
learning_rate: 2e-5
save_only_model: true
bf16: true

resume_from_checkpoint:
logging_steps: 1

evals_per_epoch: 1
saves_per_epoch: 1

warmup_ratio: 0.1
weight_decay: 0.0
fsdp:
  - full_shard
  - auto_wrap

fsdp_config:
  fsdp_version: 2
  fsdp_offload_params: false
  fsdp_cpu_ram_efficient_loading: false
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_transformer_layer_cls_to_wrap: LlamaDecoderLayer
  fsdp_state_dict_type: FULL_STATE_DICT
  fsdp_sharding_strategy: FULL_SHARD
  fsdp_reshard_after_forward: true
  fsdp_activation_checkpointing: true

special_tokens:
  pad_token: <|finetune_right_pad_id|>

# save_first_step: true # uncomment this to validate checkpoint saving works with your config
```
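For context on what the `qat` block does: QAT fake-quantizes weights (and here activations) during training, so the model learns to tolerate the rounding error of low-precision inference. Below is a minimal, self-contained sketch of per-group symmetric fake quantization with `group_size=16`, for illustration only. It is not axolotl's or torchao's actual NVFP4 implementation (NVFP4 uses a 4-bit floating-point e2m1 grid with per-group scales, and real QAT routes gradients through the rounding op with a straight-through estimator); this sketch uses a simple integer grid to show the core quantize-then-dequantize idea.

```python
import torch


def fake_quantize_groupwise(w: torch.Tensor, group_size: int = 16, n_bits: int = 4) -> torch.Tensor:
    """Symmetric per-group fake quantization: quantize to an integer grid,
    then immediately dequantize, so the tensor keeps its original dtype but
    carries the quantization error that low-precision inference would see."""
    orig_shape = w.shape
    g = w.reshape(-1, group_size)  # one scale per group of 16 values
    qmax = 2 ** (n_bits - 1) - 1   # 7 for 4-bit
    scale = g.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / qmax
    q = (g / scale).round().clamp(-qmax - 1, qmax)  # snap to integer grid
    return (q * scale).reshape(orig_shape)          # dequantize back


w = torch.randn(8, 32)
wq = fake_quantize_groupwise(w, group_size=16)  # same shape/dtype, quantized values
```

During QAT training, an op like this is applied inside each linear layer's forward pass; at export time, the already-quantization-aware weights are converted to the true low-precision format.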