Commit 979b759
Update the routing for TRTLLMGEN to support kimi k2 and qwen (#1831)
<!-- .github/pull_request_template.md -->
## 📌 Description
Update the routing code to align with the implementation in TRTLLM and
add support for Kimi K2 and Qwen.
Also revise the unit test based on the Kimi K2 config
(https://huggingface.co/moonshotai/Kimi-K2-Instruct/blob/main/config.json).
## 🔍 Related Issues
## 🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull
request, please make sure the following items are complete.
### ✅ Pre-commit Checks
- [x] I have installed `pre-commit` by running `pip install pre-commit`
(or used your preferred method).
- [x] I have installed the hooks with `pre-commit install`.
- [x] I have run the hooks manually with `pre-commit run --all-files`
and fixed any reported issues.
> If you are unsure about how to set up `pre-commit`, see [the
pre-commit documentation](https://pre-commit.com/).
## 🧪 Tests
- [ ] Tests have been added or updated as needed.
- [ ] All tests are passing (`unittest`, etc.).
## Reviewer Notes
<!-- Optional: anything you'd like reviewers to focus on, concerns, etc.
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
## Release Notes
* **New Features**
  * MoE operations now support optional routing parameters with automatic defaults for streamlined model configuration.
* **Refactor**
  * Optimized expert kernel routing and buffer management for improved flexibility across multiple routing strategies.
  * Enhanced top-K result handling with unified buffer interfaces.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
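To illustrate what "optional routing parameters with automatic defaults" could look like, here is a minimal pure-Python sketch of grouped top-K expert routing for a single token. It is not the FlashInfer or TRTLLM kernel code: the function name `route_token`, the parameter names, the fallback values, and the choice to score a group by its best expert are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def route_token(
    scores: List[float],
    top_k: int,
    n_group: Optional[int] = None,
    topk_group: Optional[int] = None,
    routed_scaling_factor: Optional[float] = None,
) -> Tuple[List[int], List[float]]:
    """Pick top_k experts for one token (hypothetical helper, not a library API)."""
    # Assumed defaults: a single group, every group kept, no extra scaling.
    n_group = 1 if n_group is None else n_group
    topk_group = n_group if topk_group is None else topk_group
    scale = 1.0 if routed_scaling_factor is None else routed_scaling_factor

    num_experts = len(scores)
    if num_experts % n_group != 0:
        raise ValueError("num_experts must be divisible by n_group")
    group_size = num_experts // n_group

    # Simplification: score each group by its best expert.
    group_scores = [
        max(scores[g * group_size:(g + 1) * group_size]) for g in range(n_group)
    ]
    kept_groups = sorted(
        range(n_group), key=lambda g: group_scores[g], reverse=True
    )[:topk_group]

    # Only experts inside the kept groups are candidates.
    candidates = [
        e
        for g in kept_groups
        for e in range(g * group_size, (g + 1) * group_size)
    ]
    if top_k > len(candidates):
        raise ValueError("top_k exceeds the number of candidate experts")

    chosen = sorted(candidates, key=lambda e: scores[e], reverse=True)[:top_k]

    # Normalize the selected scores, then apply the scaling factor.
    # Scores are assumed positive (e.g. sigmoid outputs), so total > 0.
    total = sum(scores[e] for e in chosen)
    weights = [scale * scores[e] / total for e in chosen]
    return chosen, weights


if __name__ == "__main__":
    token_scores = [0.1, 0.9, 0.3, 0.7, 0.2, 0.6, 0.8, 0.4]
    # Routing parameters omitted: the defaults reduce to plain top-K.
    print(route_token(token_scores, top_k=2))
    # Routing parameters supplied: two groups, keep the best one.
    print(route_token(token_scores, top_k=2, n_group=2, topk_group=1))
```

With the grouping arguments omitted, the sketch behaves like plain top-K over all experts; passing them restricts selection to the highest-scoring groups first.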
---------
Signed-off-by: jiahanc <[email protected]>
Co-authored-by: jiahanc <[email protected]>

1 parent ffcc5f4 commit 979b759
File tree
11 files changed, +1036 -711 lines changed
- csrc
- flashinfer/fused_moe
- include/flashinfer/trtllm/fused_moe
- tests/moe