Commit 189a103
Moe 128 rebased (#8)
* 128 experts
* Use default rope
* Unfuse mlp
* Address feedback
* Use None "default" for rope_scaling. Add eot.
* Meta/llama quant compat (#7)
* add quant compatible model & conversion code for llama4
* fix a few issues
* fix a few issues
* minor type mapping fix
---------
Co-authored-by: Lu Fang <[email protected]>
* use a new config parameter to determine which model definition to use for MoE
---------
Co-authored-by: Pedro Cuenca <[email protected]>
Co-authored-by: Lu Fang <[email protected]>
1 parent fb748af commit 189a103
File tree
4 files changed: +48 -14 lines changed
- src/transformers
- models/llama4

| File (name not preserved) | Additions | Deletions | Added lines (new numbering) | Removed lines (original numbering) |
|---|---|---|---|---|
| 1 | 2 | 1 | 532, 4065 | 4064 |
| 2 | 6 | 0 | 180, 221-222, 296-298 | none |
| 3 | 24 | 8 | 24-25, 34-35, 51-53, 269, 388-397, 426, 729-733 | 47-49, 713-717 |
| 4 | 16 | 5 | 26, 156-161, 192-199, 1720 | 64, 156, 187-188, 1709 |