Commit ff43bc2
TDT buffered inference fix (NVIDIA-NeMo#13500)
* added use-fast tokenizer argument (NVIDIA-NeMo#12986)
Signed-off-by: Francesco Bertolotti <[email protected]>
Co-authored-by: Alexandros Koumparoulis <[email protected]>
Co-authored-by: oliver könig <[email protected]>
Signed-off-by: Hainan Xu <[email protected]>
* ci: Run selective triggering on dockerfiles and dependencies (NVIDIA-NeMo#13493)
Signed-off-by: oliver könig <[email protected]>
Signed-off-by: Hainan Xu <[email protected]>
* fix buffered inference for tdt
Signed-off-by: Hainan Xu <[email protected]>
* small fixes
Signed-off-by: Hainan Xu <[email protected]>
* [automodel] fallback FP8 + LCE -> FP8 + CE (NVIDIA-NeMo#13349)
* fix
Signed-off-by: Alexandros Koumparoulis <[email protected]>
* make fp8 tests non-optional
Signed-off-by: Alexandros Koumparoulis <[email protected]>
* switch to gemma
Signed-off-by: Alexandros Koumparoulis <[email protected]>
---------
Signed-off-by: Alexandros Koumparoulis <[email protected]>
Co-authored-by: oliver könig <[email protected]>
Signed-off-by: Hainan Xu <[email protected]>
* Update changelog for `r2.3.0` (NVIDIA-NeMo#13501)
* beep boop: Update changelog
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Add changelog highlights
Signed-off-by: Charlie Truong <[email protected]>
---------
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Signed-off-by: Charlie Truong <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Charlie Truong <[email protected]>
Signed-off-by: Hainan Xu <[email protected]>
* Update 2.3.0 changelog (NVIDIA-NeMo#13503)
Signed-off-by: Charlie Truong <[email protected]>
Signed-off-by: Hainan Xu <[email protected]>
* ci: Remove trt-llm breakpoint (NVIDIA-NeMo#13499)
* tests: Disable flaky test
Signed-off-by: oliver könig <[email protected]>
* remove breakpoint
Signed-off-by: oliver könig <[email protected]>
---------
Signed-off-by: oliver könig <[email protected]>
Signed-off-by: Hainan Xu <[email protected]>
* Update 2.3.0 changelog (NVIDIA-NeMo#13504)
* Fix 2.3.0 changelog
Signed-off-by: Charlie Truong <[email protected]>
* Update 2.3.0 changelog
Signed-off-by: Charlie Truong <[email protected]>
---------
Signed-off-by: Charlie Truong <[email protected]>
Signed-off-by: Hainan Xu <[email protected]>
* Enabling flash decode for float16 precision only (NVIDIA-NeMo#13471)
Signed-off-by: Pranav Prashant Thombre <[email protected]>
Signed-off-by: Hainan Xu <[email protected]>
* Fix changelog formatting (NVIDIA-NeMo#13505)
Signed-off-by: Charlie Truong <[email protected]>
Signed-off-by: Hainan Xu <[email protected]>
* Updating the long context performance number for B200 (NVIDIA-NeMo#13468)
* Add without CP numbers for B200 and merge the captioning texts of both into one.
Signed-off-by: Youngeun Kwon <[email protected]>
* figure removed
Signed-off-by: Youngeun Kwon <[email protected]>
---------
Signed-off-by: Youngeun Kwon <[email protected]>
Signed-off-by: Hainan Xu <[email protected]>
* Autodetect model_type and dtype for deployment using TRT-LLM backend (NVIDIA-NeMo#13209)
* Autodetect model_type and dtype for deployment using TRT-LLM backend
Signed-off-by: Jan Lasek <[email protected]>
* Handling kv_cache_qformat parameter
Signed-off-by: Jan Lasek <[email protected]>
* Apply isort and black reformatting
Signed-off-by: janekl <[email protected]>
* Docstring update
Signed-off-by: Jan Lasek <[email protected]>
---------
Signed-off-by: Jan Lasek <[email protected]>
Signed-off-by: janekl <[email protected]>
Co-authored-by: janekl <[email protected]>
Signed-off-by: Hainan Xu <[email protected]>
* remove unused variable
Signed-off-by: Hainan Xu <[email protected]>
* Apply isort and black reformatting
Signed-off-by: hainan-xv <[email protected]>
Signed-off-by: Hainan Xu <[email protected]>
* add docstring, cleaner way of setting merge_algo
Signed-off-by: Hainan Xu <[email protected]>
* Apply isort and black reformatting
Signed-off-by: hainan-xv <[email protected]>
Signed-off-by: Hainan Xu <[email protected]>
* add extra hyena tests (NVIDIA-NeMo#13097)
* add extra hyena tests
* Apply isort and black reformatting
Signed-off-by: JRD971000 <[email protected]>
* fix num gpus
* keep sft optional
---------
Signed-off-by: JRD971000 <[email protected]>
Co-authored-by: JRD971000 <[email protected]>
Signed-off-by: Hainan Xu <[email protected]>
* ci: Add mode files to filter (NVIDIA-NeMo#13517)
Signed-off-by: oliver könig <[email protected]>
Signed-off-by: Hainan Xu <[email protected]>
* change default merge_algo for buffered inference to None
Signed-off-by: Hainan Xu <[email protected]>
* Apply isort and black reformatting
Signed-off-by: hainan-xv <[email protected]>
---------
Signed-off-by: Francesco Bertolotti <[email protected]>
Signed-off-by: Hainan Xu <[email protected]>
Signed-off-by: oliver könig <[email protected]>
Signed-off-by: Alexandros Koumparoulis <[email protected]>
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Signed-off-by: Charlie Truong <[email protected]>
Signed-off-by: Pranav Prashant Thombre <[email protected]>
Signed-off-by: Youngeun Kwon <[email protected]>
Signed-off-by: Jan Lasek <[email protected]>
Signed-off-by: janekl <[email protected]>
Signed-off-by: hainan-xv <[email protected]>
Signed-off-by: JRD971000 <[email protected]>
Co-authored-by: Francesco Bertolotti <[email protected]>
Co-authored-by: Alexandros Koumparoulis <[email protected]>
Co-authored-by: oliver könig <[email protected]>
Co-authored-by: Hainan Xu <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Charlie Truong <[email protected]>
Co-authored-by: pthombre <[email protected]>
Co-authored-by: Youngeun Kwon <[email protected]>
Co-authored-by: Jan Lasek <[email protected]>
Co-authored-by: janekl <[email protected]>
Co-authored-by: hainan-xv <[email protected]>
Co-authored-by: Ali Taghibakhshi <[email protected]>
Co-authored-by: JRD971000 <[email protected]>
Signed-off-by: jianbinc <[email protected]>
1 parent cb40770 commit ff43bc2
File tree
3 files changed: +135 -4 lines changed
- examples/asr/asr_chunked_inference/rnnt
- nemo/collections/asr/parts
  - submodules
  - utils
Lines changed: 29 additions & 2 deletions
| Original lines | New lines | Lines added | Lines removed |
|---|---|---|---|
| 23-28 | 23-31 | 3 | 0 |
| 73-78 | 76-82 | 1 | 0 |
| 135-141 | 139-148 | 4 | 1 |
| 150-155 | 157-164 | 2 | 0 |
| 212-220 | 221-237 | 9 | 1 |
| 267-272 | 284-299 | 10 | 0 |
Lines changed: 4 additions & 2 deletions
| Original lines | New lines | Lines added | Lines removed |
|---|---|---|---|
| 2673-2682 | 2673-2684 | 4 | 2 |
| Original lines | New lines | Lines added | Lines removed |
|---|---|---|---|
| 1264-1269 | 1264-1371 | 102 | 0 |