Commit d8c58fb
Fix Mistral, Qwen (#1565)
* use exact model name
* Update save.py
* Update _utils.py
* Update _utils.py
* Update _utils.py
* Update _utils.py
* print
* Update _utils.py
* Update _utils.py
* Update llama.py
* Update _utils.py
* Update vision.py
* Update _utils.py
* Update _utils.py
* Update _utils.py
* Update _utils.py
* Update _utils.py
* Update _utils.py
* Update _utils.py
* Update _utils.py
* Update loader.py
* accurate_accumulation
* Update loader.py
* Update loader.py
* Update _utils.py
* Update loader.py
* Update loader.py
* Update loader.py
* Update loader.py
* Update pyproject.toml
* Update __init__.py
* Update pyproject.toml
* Update __init__.py
* Update __init__.py
* Fix Triton heuristics
triton-lang/triton#5224
* Update __init__.py
* Update __init__.py
* Update __init__.py
* Update __init__.py
* Xformers
* Update loader.py
* Update loader.py
* Rewind
* Update _utils.py
* Update _utils.py
* requires grad
* Update loader.py
* Update _utils.py
* Update loader.py
* changing model to base_model if peft model is already used
* Improve debugging experience (#1512)
* Create CONTRIBUTING.md (#1472)
Creating contributing guidelines
* Update CONTRIBUTING.md
improved sentence
* Improve logging control in `unsloth_compile_transformers` by conditionally redirecting stdout based on UNSLOTH_DISABLE_LOGGER environment variable
---------
Co-authored-by: Michael Han <[email protected]>
Co-authored-by: Nino Risteski <[email protected]>
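Note on #1512: the diff for the logging change is not shown on this page, so the following is only a minimal sketch of what "conditionally redirecting stdout based on UNSLOTH_DISABLE_LOGGER" could look like. The helper names `maybe_silence_stdout` and `noisy_compile_step` are hypothetical, and treating the value `"1"` as "enabled" is an assumption; the actual code in `unsloth_compile_transformers` may differ.

```python
import contextlib
import io
import os


def maybe_silence_stdout():
    """Return a context manager that hides stdout only when the flag is set."""
    # Assumption: "1" turns the flag on; the commit message does not say
    # which values are accepted.
    if os.environ.get("UNSLOTH_DISABLE_LOGGER", "0") == "1":
        return contextlib.redirect_stdout(io.StringIO())
    # Otherwise leave stdout untouched.
    return contextlib.nullcontext()


def noisy_compile_step():
    # Hypothetical stand-in for work that prints progress messages.
    print("compiling ...")
    return "done"


with maybe_silence_stdout():
    result = noisy_compile_step()
```

Returning a context manager keeps the call site to a single `with` statement whether or not output is suppressed.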
* Update loader.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit b7ddf96.
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Auto change is_bfloat16_supported
* Update llama.py
* Force data-type
* Update llama.py
* All attention refactor fix (#1491)
* change initialization of n_heads, n_kv_heads, hidden_size in llama.py
* do the same for cohere, mistral, gemma2, granite
* do the same for flexattention, cohere, mistral, granite
* Update llama.py
* Update llama.py
* Update granite to work with latest post_patch methods (#1502)
* Update granite to work with latest post_patch methods
* Pass position_embeddings for granite even if transformers<4.47
* Update llama.py
---------
Co-authored-by: Daniel Han <[email protected]>
* Minor fixes for granite models (#1503)
* Update granite.py
Grab residual multiplier directly from layer
* Update llama.py
Version should read >= 4.47.1 as that is the version requiring the changes
* Update granite.py
* Update llama.py
---------
Co-authored-by: Daniel Han <[email protected]>
* support modelscope models and datasets (#1481)
* support modelscope
* change modelscope args
* remove useless import
* remove useless import
* fix
* wip
* fix
* remove useless code
* add readme
* add some comments
* change print to raise error
* update comment
* Update loader.py
---------
Co-authored-by: Daniel Han <[email protected]>
* Merge branch 'main' into nightly
* Phi 4
* Update llama.py
* Torch.Cuda Is Available Condition and Warning (#1545)
* check for torch.cuda and triton if available
On my machine (Mac M3), CUDA was not available.
* Update pyproject.toml
* Update __init__.py
---------
Co-authored-by: Daniel Han <[email protected]>
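Note on #1545: the changed lines in `__init__.py` are not visible here, so this is a hedged sketch of a "check for torch.cuda and triton, warn if missing" guard rather than the code that was merged. The function names are hypothetical. It relies only on `importlib.util.find_spec` and `torch.cuda.is_available()`, and it imports torch lazily so the snippet also runs on machines without it.

```python
import importlib.util
import warnings


def module_available(name: str) -> bool:
    """True if the top-level module `name` can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def cuda_ready() -> bool:
    """Best-effort check that torch is installed and sees a CUDA device."""
    if not module_available("torch"):
        return False
    import torch  # imported lazily so this file runs without torch installed
    return bool(torch.cuda.is_available())


def check_environment() -> dict:
    """Report CUDA and Triton availability, warning about anything missing."""
    status = {"cuda": cuda_ready(), "triton": module_available("triton")}
    for name, ok in status.items():
        if not ok:
            warnings.warn(f"{name} is not available; GPU code paths may not work.")
    return status
```

On a machine like the Mac M3 mentioned above, `check_environment()` would be expected to report `cuda` as `False` and emit a warning instead of failing at import time.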
* Update mistral.py
* Update mistral.py
* Update _utils.py
* Update _utils.py
* Update _utils.py
* Update _utils.py
* Update _utils.py
* Fix
* Bug fixes
* Update mapper.py
* Add dropout to granite to match HF's implementation (#1557)
Signed-off-by: datta0 <[email protected]>
* Update llama.py
* Update llama.py
* Bug fixes
* fix: flash_attn_detection_error (#1556)
* fix: flash_attn_detection_error
* Update _utils.py
---------
Co-authored-by: Daniel Han <[email protected]>
---------
Signed-off-by: datta0 <[email protected]>
Co-authored-by: Itsuro Tajima <[email protected]>
Co-authored-by: Muhammad Osama <[email protected]>
Co-authored-by: Edd <[email protected]>
Co-authored-by: Michael Han <[email protected]>
Co-authored-by: Nino Risteski <[email protected]>
Co-authored-by: Kareem <[email protected]>
Co-authored-by: Datta Nimmaturi <[email protected]>
Co-authored-by: Z <[email protected]>
Co-authored-by: tastelikefeet <[email protected]>
Co-authored-by: AminWhat <[email protected]>
Co-authored-by: Zhe Zhang <[email protected]>
1 parent d6982c1 commit d8c58fb
File tree
7 files changed
+29
-16
lines changed
- unsloth
- models