Releases: NVIDIA-NeMo/NeMo
NVIDIA Neural Modules 2.5.2
Detailed Changelogs:
Text Normalization / Inverse Text Normalization
Changelog
- cp: Add import guards for mcore lightning module (#14970) into r2.5.0 by @chtruong814 :: PR: #14982
Uncategorized:
Changelog
- Bump to 2.5.2 by @chtruong814 :: PR: #14983
 
NVIDIA Neural Modules 2.5.1
Highlights
- This release addresses known security issues. For the latest NVIDIA Vulnerability Disclosure Information, visit https://www.nvidia.com/en-us/security/. For acknowledgement, please reach out to the NVIDIA PSIRT team at [email protected]
 - Adds nv-one-logger
 - Adds fixes related to Megatron FSDP
 
Detailed Changelogs:
ASR
Changelog
- Patch: r2.5.0 with onelogger changes. by @PeiyuanQi :: PR: #14811
 
TTS
Changelog
- Patch: r2.5.0 with onelogger changes. by @PeiyuanQi :: PR: #14811
 
NLP / NMT
Changelog
- Patch: r2.5.0 with onelogger changes. by @PeiyuanQi :: PR: #14811
 - Megatron FSDP r2.5.0 cherry-pick by @BoxiangW :: PR: #14922
 
Uncategorized:
Changelog
- Bump to 2.5.1 by @chtruong814 :: PR: #14898
- Cherry pick "Feat: Disk space management for nemo install test" (#14822) into r2.5.0 by @chtruong814 :: PR: #14937
 - cp: Fix the load checkpointing issue -- onelogger callback gets called multiple times in some cases (#14945) into r2.5.0 by @chtruong814 :: PR: #14948
25.09-alpha.rc2
- Update lora.py by Michał Marcinkiewicz
NVIDIA Neural Modules 2.5.0
Highlights
- Collections:
  - LLM
    - Nano v2 12B and 9B
    - Automodel and Export-Deploy functionality are available in their respective standalone repositories and are deprecated in NeMo 2
  - Speech
    - New SpeechLM2 collection
    - Streaming Sortformer model
    - Deprecate Confidence Ensemble models
    - parakeet-tdt-0.6b-v3 and canary-1b-v2 models
    - Added chunk inference support with .transcribe() for Canary-based models
    - Enable prediction of timestamps with streaming ASR
    - Improve ASR models' invariance to padding/batch size
    - Qwen prompt format support, SALM generation fixes
    - High-level SALM model.generate API closely resembling HF models
    - SALM model initialization with time/memory optimization
    - SpeechLM2: fixed excessive padding, support for on-the-fly resampling for SALM
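The chunk inference support mentioned in the Speech highlights splits a long recording into fixed-length, overlapping windows so each window fits the model's context, then stitches the per-window results back together. A minimal, framework-free sketch of just the windowing step (the `chunk_audio` helper and its parameters are illustrative, not NeMo's actual API):

```python
# Illustrative sketch of chunked-inference windowing (not NeMo's actual
# implementation): split a long waveform into fixed-length, overlapping
# chunks, keeping sample offsets so per-chunk timestamps can later be
# shifted back to absolute time.

def chunk_audio(num_samples, sample_rate=16000, chunk_secs=30.0, overlap_secs=2.0):
    """Return (start, end) sample offsets for overlapping chunks."""
    chunk = int(chunk_secs * sample_rate)
    step = chunk - int(overlap_secs * sample_rate)
    assert step > 0, "overlap must be shorter than the chunk"
    spans = []
    start = 0
    while start < num_samples:
        spans.append((start, min(start + chunk, num_samples)))
        if start + chunk >= num_samples:
            break
        start += step
    return spans

# A 70 s recording at 16 kHz becomes three chunks with 2 s of overlap.
spans = chunk_audio(70 * 16000)
```

In the real feature, each span is transcribed independently and the overlap region is used to merge hypotheses consistently at chunk boundaries.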
 
Detailed Changelogs:
ASR
Changelog
- Modernize logger interface by @emmanuel-ferdman :: PR: #13783
 - Higher-level API for SALM.generate by @pzelasko :: PR: #14034
 - add/refactor docs for asr lm customization by @lilithgrigoryan :: PR: #14088
 - Improve NEST GPU Utilization 1/N by @MahmoudAshraf97 :: PR: #14086
 - Improve ASR models' invariance to padding/batch size by @pzelasko :: PR: #13827
 - Clean up transducer decoding initialization by @artbataev :: PR: #14112
 - Improve NEST GPU Utilization 2/N by @MahmoudAshraf97 :: PR: #14089
 - GPU-accelerated Phrase-Boosting (GPU-PB) for AED decoding by @andrusenkoau :: PR: #14108
 - Fix decoding with ngpu-lm when training (#13994) by @hoangtran9122 :: PR: #13995
 - fix eval_beamsearch_ngram_ctc script by @lilithgrigoryan :: PR: #14238
 - fix wrong typing for ctc-ws context graph by @andrusenkoau :: PR: #14262
 - fix frame vad by @stevehuang52 :: PR: #14337
 - Improve NEST GPU Utilization 3/N by @MahmoudAshraf97 :: PR: #14234
 - remove confidence ensemble models by @lilithgrigoryan :: PR: #14343
 - Fix ASR decoding issues with CUDA graphs in training by @artbataev :: PR: #14184
 - Streaming Sortformer release PR01: uploading bugfixes, refactored variables and yaml file name changes by @tango4j :: PR: #14416
 - Streaming Sortformer release PR02: unit tests for streaming models and modules by @tango4j :: PR: #14417
 - GPU-accelerated Phrase-Boosting (GPU-PB) for CTC, RNN-T, and TDT decoding by @andrusenkoau :: PR: #14277
 - Fix subsampling chunking test by @monica-sekoyan :: PR: #14452
 - Canary2 with NFA by @monica-sekoyan :: PR: #14121
 - Initial Chunking by @nune-tadevosyan :: PR: #14321
 - Chunking fix by @nune-tadevosyan :: PR: #14482
 - Tutorial and doc update by @nune-tadevosyan :: PR: #14484
 - Streaming Sortformer release PR03: NeMo documentations and tutorial notebook by @tango4j :: PR: #14388
 - Add wget_from_nemo by @nune-tadevosyan :: PR: #14623
 - Downgrade "datasets" library version in ASR tutorial to ensure compatibility with HF Datasets used by @KunalDhawan :: PR: #14685
 - Canary tutorial fix by @nune-tadevosyan :: PR: #14708
 - Force activations and weights cast to FP32 Jasper Encoder Squeeze-Excite by @erastorgueva-nv :: PR: #14715
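Several entries above add GPU-accelerated Phrase-Boosting (GPU-PB) to AED, CTC, RNN-T, and TDT decoding. The underlying idea is shallow contextual biasing: hypotheses that extend a user-supplied phrase in a context graph receive a score bonus. A toy CPU sketch of that idea (the trie, `step` signature, and bonus value are illustrative, not NeMo's GPU implementation):

```python
# Toy sketch of phrase boosting for beam-search decoding (illustrative,
# not NeMo's GPU-PB): a trie over boosted-phrase tokens; a hypothesis
# that walks along a phrase collects a per-token score bonus.

class BoostTrie:
    def __init__(self, phrases, bonus=2.0):
        self.root = {}
        self.bonus = bonus
        for phrase in phrases:
            node = self.root
            for tok in phrase:
                node = node.setdefault(tok, {})

    def step(self, state, token):
        """Advance the context state with `token`; return (new_state, bonus)."""
        node = state if state is not None else self.root
        if token in node:
            return node[token], self.bonus       # extending a boosted phrase
        if token in self.root:
            return self.root[token], self.bonus  # a phrase may start here
        return None, 0.0                         # fell off the context graph

trie = BoostTrie([["new", "york"], ["nemo"]])
state, b1 = trie.step(None, "new")    # boosted
state, b2 = trie.step(state, "york")  # still on the phrase
state, b3 = trie.step(state, "city")  # no longer boosted
```

During decoding, the bonus is simply added to the hypothesis log-probability, so beams containing boosted phrases survive pruning more often.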
 
TTS
Changelog
NLP / NMT
Changelog
- add extra params for MegatronDataSampler by @dimapihtar :: PR: #13956
 - Modernize logger interface by @emmanuel-ferdman :: PR: #13783
 - remove dialogue collection by @dimapihtar :: PR: #14087
 - remove QA collection by @dimapihtar :: PR: #14092
 - remove text nlp collection by @dimapihtar :: PR: #14110
 - remove nlp modules by @dimapihtar :: PR: #14127
 - remove rag collection by @dimapihtar :: PR: #14157
 - remove nmt collection by @dimapihtar :: PR: #14191
 - Fix importerror in transformer_lm_model after nlp module removals by @chtruong814 :: PR: #14199
 - fix QA comments NVBug by @huvunvidia :: PR: #14196
 - Temporarily Remove Encoder PP Support by @yaoyu-33 :: PR: #14167
 - remove mixins collections by @dimapihtar :: PR: #14281
 - feat: print expert groups on megatron init by @clumsy :: PR: #13874
 - [speechlm2] [lhotse] sharegpt data and testloader by @huckiyang :: PR: #14294
 - Add notebook for LoRA on GPT-OSS-20B by @shashank3959 :: PR: #14439
 - Sketch dist-ckpt content versioning by @mikolajblaz :: PR: #13839
 - Change to enable full iteration CUDA graph for LLMs by @vasunvidia :: PR: #14077
 
Text Normalization / Inverse Text Normalization
Changelog
- Check lightning and core imports in install test by @chtruong814 :: PR: #14403
 
Export
Changelog
- ci: Set L2_NeMo_2_Export_Deploy_Query_In_Framework to be optional by @chtruong814 :: PR: #13946
 - Remove old export doc by @oyilmaz-nvidia :: PR: #14292
 - Llama4 Export: Remove outdated MLP weight transform by @suiyoubi :: PR: #14297
 - Update mllama hf import/export for transformers 4.53 by @meatybobby :: PR: #14327
 
Bugfixes
Changelog
- Bugfix for Hyena to the get_t function which comes up when doing longer context inference by @jstjohn :: PR: #14256
 - fix skipped cuHyena kernel while training by @farhadrgh :: PR: #14365
 - Remove flaky Evo2 dataset performance test by @jstjohn :: PR: #14371
 - Use module prefix in restore_modelopt_state by @jenchen13 :: PR: #14384
 
Uncategorized:
Changelog
- Version bump to 2.5.0rc0.dev0 by @github-actions[bot] :: PR: #13944
 - [Llama4] Enable tp comm overlap for llama4 by @gdengk :: PR: #13940
 - Fix for Squad Dataset Download by @rhmukundan :: PR: #13893
 - add nmh HF conversion by @JRD971000 :: PR: #13941
 - Speechlm2 SALM improvements by @pzelasko :: PR: #13829
 - fix dataset issue by @dimapihtar :: PR: #13953
 - Editing MMLU to pull from the correct repo by @ruchaa-apte :: PR: #13991
 - move classes to module to use target feature (#14023) by @nithinraok :: PR: #14031
 - Add Nemotron-H prompt format, fix cut-to-conversation custom attr propagation by @pzelasko :: PR: #13963
 - Bump release_library template to v0.40.0 by @chtruong814 :: PR: #14046
 - [automodel] add support for layer-freezing by @akoumpa :: PR: #14000
 - [Qwen3] Recipe config bug fix by @gdengk :: PR: #14084
 - Add TE import guard in qwen2vl vision module by @chtruong814 :: PR: #14091
 - Update bitsandbytes dependency to v0.46.0 by @pramodk :: PR: #14050
 - Update FSDP2 docstring by @BoxiangW :: PR: #14105
 - Interface to enable fsdp-double-buffer without enabling NCCL-UB by @youngeunkwon0405 :: PR: #14076
 - SpeechLM2 SALM: load ckpt faster, with less GPU memory by @pzelasko :: PR: #14113
 - Add object_storage_cache_path to PreTrainingDataModule by @shunjiad :: PR: #14103
- Update changelog for r2.3.0 by @github-actions[bot] :: PR: #14160
 - Fix FLUX test with correct env var by @suiyoubi :: PR: #14149
 - add mmap_bin_files param by @dimapihtar :: PR: #14122
- Add option to suppress import checks in Dockerfile.speech by @artbataev :: PR: #14185
 - Safely import optional python packages by @roclark :: PR: #13936
 - Set flux test as optional by @chtruong814 :: PR: #14190
 - Revert "Safely import optional python packages (#13936)" by @chtruong814 :: PR: #14197
 - Fix "Safely import optional python packages (#13936)" by @chtruong814 :: PR: #14198
 - Add fix for evo2 generate/inference by @jwilber :: PR: #14027
 - Fixing file path suffix by @gautham-kollu :: PR: #14179
 - Update AVLM finetune example for vanilla fine-tuning by @huvunvidia :: PR: #14232
 - [finetune] Add dataset_kwargs to prepare packed sequence data by @jiajunly :: PR: #14169
 - Allow exception in hf ckpt load attempt before fallback to standard l… by @trvachov :: PR: #14214
 - Load master weights from checkpoint by @kunlunl :: PR: #14072
 - Add deploy lora adapter portion by @ruchaa-apte :: PR: #14255
 - fix speechlm lhotse loading nemo_tarred by @stevehuang52 :: PR: #14314
- Update changelog for r2.4.0 by @github-actions[bot] :: PR: #14334
 - Flaky test timing out: @pytest.mark.pleasefixme by @pablo-garay :: PR: #14351
 - Support dump perf recipe diff from base recipe by @guyueh1 :: PR: #14206
 - Bugfix degenerate bases evo2 dataset by @jstjohn :: PR: #14359
 - Hyena support for flash decode API by @jstjohn :: PR: #14315
 - Fix Gemma2/3 & Llava (Next) & Llama4 conversion issue with latest transformers by @suiyoubi :: PR: #14367
 - fix: reduce the excessive test time of test_msdd_diar_inference by @tango4j :: PR: #14366
 - SpeechLM2: S2S->S2T data reader, excessive padding fixes by @pzelasko :: PR: #14124
 - chore: Release 2.5.0rc0 by @ko3n1g :: PR: #14389
 - Add pyxis flag for container writable. by @sudostock :: PR: #14395
 - [MoE] Partial Cudagraph support for MoE by @gdengk :: PR: #14362
 - Revert "[MoE] Partial Cudagraph support for MoE (#14362)" by @chtruong814 :: PR: #14402
 - Update AVLM recipes for NeMo-CI runs by @huvunvidia :: PR: #14397
 - Remove nemo1 multimodal and vision by @yaoyu-33 :: PR: #14095
 - Fix LazyNeMoIterator supervision for multi-channel cuts by @anteju :: PR: #14409
 - Bump Mcore to 7f7439f by @chtruong814 :: PR: #14373
 - Use cuhyena rearrange when available. by @moradza :: PR: #14383
 - Fix model training/eval state after PTL validation loop by @paul-gibbons :: PR: #14152
 - Add deprecation notice to eval code by @athitten :: PR: #14316
 - Streaming Sortformer release PR04: Adding functional tests for streaming sortformer by @tango4j :: PR: #14435
 - QWEN2.5-VL 7B Performance Recipe by @tomlifu :: PR: #14401
 - Discount FLOPs in dot-product att by @erhoo82 :: PR: #14424
 - Bump to pytorch 25.06 and newer TE commit by @chtruong814 :: PR: #14423
 - Enable precision aware optimizer for dsv3 by @guyueh1 :: PR: #14444
 - Make VBoost activation conditional by @bdubauski :: PR: #14458
 - cuHyena FFTConv support for Hyena Long Implicit (LI) Layer by @far...
 
NVIDIA Neural Modules 2.4.1
Detailed Changelogs:
Uncategorized:
Changelog
- Update package_info.py by @ko3n1g :: PR: #14400
 - Patch to address issue 14392 by @youngeunkwon0405 :: PR: #14398
- Cherry pick "Fix callbacks in DSV3 script" (#14350) into r2.4.0 by @chtruong814 :: PR: #14370
 - Cherry pick "Change Llama Embedding Tutorial to use SFT by default" (#14231) into r2.4.0 by @chtruong814 :: PR: #14303
 - Cherry pick "calculate_per_token_loss requirement for context parallel (#14065)" (#14282) into r2.4.0 by @chtruong814 :: PR: #14448
 - Pin nvidia-lm-eval to 25.6.1 by @chtruong814 :: PR: #14470
 
NVIDIA Neural Modules 2.3.3
- This release addresses known security issues. For the latest NVIDIA Vulnerability Disclosure Information, visit https://www.nvidia.com/en-us/security/. For acknowledgement, please reach out to the NVIDIA PSIRT team at [email protected]
 - Pin nvidia-lm-eval to 25.5
 
25.09-alpha.rc1
- [Flux] Add cuda_graph_scope and cache images ids for full iteration c…
NVIDIA Neural Modules 2.5.0rc0
Prerelease: NVIDIA Neural Modules 2.5.0rc0 (2025-08-03)
NVIDIA Neural Modules 2.4.0
Highlights
- Collections:
  - Speech
    - Batched beam search for transducers (RNN-T and TDT)
    - RNN-T/TDT buffered/streaming inference + batched decoding support in cache-aware
    - Add support for CTC batched beam search with GPU-LM
    - Key fixes
      - Punctuation marks in timestamps
      - Fix timestamps when CUDA graphs enabled
      - Fix masking of <pad> tokens in AED inference
      - TDT streaming inference fix
  - LLM
    - Qwen 3 235B-A22B perf optimized
    - DeepSeek V3 perf optimized
    - Gemma3 support from Google
    - Embedding and Reranker models
  - MM
    - Llama 4
    - AVLM
- Training performance (speed)
  - NVL SHARP + IB SHARP for DP/FSDP communications on H100 and B200
  - MXFP8 with TP communication overlap
  - MXFP8 with reduced memory allocation
  - FP8 sub-channel recipe (128x128 for weight and 1x128 for activation)
  - cuDNN fused attention for MLA (both Hopper and Blackwell)
  - Advanced custom asymmetric pipelining (for MTP, loss func, and embd)
  - BF16 optimizer for model memory saving
  - CUDA graph fix for fine-tuning benchmarks
  - CUDA graph support for Llama 4
 
Detailed Changelogs:
ASR
Changelog
- ci: Fix ASR container by @ko3n1g :: PR: #13288
 - Set L2_Segmentation_Tool_Parallel_ctc_segmentation test to be optional by @chtruong814 :: PR: #13296
 - Revert "WebDataset URL refactoring" by @ko3n1g :: PR: #13421
 - Update flagged docs links by @erastorgueva-nv :: PR: #13391
 - [Docs] Fix incorrectly formatted reference tags by @erastorgueva-nv :: PR: #13445
 - Update CP by @pablo-garay :: PR: #13532
 - Tdt buffered inference fix by @hainan-xv :: PR: #13500
 - Fix transcribe when nbest hypotheses are returned by @lilithgrigoryan :: PR: #13540
 - Set ASR test to be optional by @chtruong814 :: PR: #13633
 - Enabling chunked inference for AED models in asr_evaluator by @melllinia :: PR: #13674
 - Ko3n1g/chore/asr only by @ko3n1g :: PR: #13704
 - decompressing joblib file before checking it by @Ssofja :: PR: #13732
 - Revert "decompressing joblib file before checking it (#13732)" by @chtruong814 :: PR: #13791
 - Punctuation Marks in Timestamps by @monica-sekoyan :: PR: #13353
 - AIStore with Webdataset by @monica-sekoyan :: PR: #13604
 - Update to add default for dataclass variables by @nithinraok :: PR: #13814
 - This PR addresses to known security issues by @Ssofja :: PR: #13804
 - remove model_stride var by @nithinraok :: PR: #13867
 - add CTC batched beam search by @lilithgrigoryan :: PR: #13337
 - Clean up streaming ASR script and tests by @artbataev :: PR: #13894
 - add NGPU-LM fusion during CTC greedy by @lilithgrigoryan :: PR: #13917
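The NGPU-LM fusion entry above combines the acoustic model's token scores with an external language model via shallow fusion: at each greedy step the decoder picks the token maximizing log p_am + λ·log p_lm. A toy single-step sketch (the λ value and the tiny LM are illustrative, not NeMo's NGPU-LM implementation):

```python
import math

# Toy sketch of shallow fusion for greedy decoding (illustrative): the
# chosen token maximizes log p_am(tok) + lm_weight * log p_lm(tok).

def fused_greedy_step(am_logprobs, lm_logprobs, lm_weight=0.3):
    """Pick the token with the best fused acoustic + LM score."""
    fused = {t: am_logprobs[t] + lm_weight * lm_logprobs.get(t, -math.inf)
             for t in am_logprobs}
    return max(fused, key=fused.get)

# The acoustic model slightly prefers "bare", but the LM strongly
# prefers "bear", flipping the greedy choice.
am = {"bare": math.log(0.50), "bear": math.log(0.45), "beer": math.log(0.05)}
lm = {"bare": math.log(0.02), "bear": math.log(0.90), "beer": math.log(0.08)}
tok = fused_greedy_step(am, lm)
```

With λ = 0 this degenerates to plain greedy decoding; larger λ lets the LM override weak acoustic evidence.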
 
TTS
Changelog
- Revert "WebDataset URL refactoring" by @ko3n1g :: PR: #13421
 - Update flagged docs links by @erastorgueva-nv :: PR: #13391
 - [Docs] Fix incorrectly formatted reference tags by @erastorgueva-nv :: PR: #13445
 - Update CP by @pablo-garay :: PR: #13532
 - fix: vpp stage refactoring to match mcore by @ZhiyuLi-Nvidia :: PR: #13673
 - AIStore with Webdataset by @monica-sekoyan :: PR: #13604
 
NLP / NMT
Changelog
- Migrate Hyena to Megatron inference_context. by @cspades :: PR: #13436
 - Update CP by @pablo-garay :: PR: #13532
 - fix broken links by @dimapihtar :: PR: #13544
 - Add nlp import checks by @thomasdhc :: PR: #13563
 - PTQ model support, quant_cfg, and documentation updates by @janekl :: PR: #13519
 - feat - GPTSFTChatDataset alignment with OpenAI Messages, compatibility with packed sequences by @soluwalana :: PR: #13367
 - fix: vpp stage refactoring to match mcore by @ZhiyuLi-Nvidia :: PR: #13673
 - Fix resume with MegatronPretrainingBatchSampler by @ashors1 :: PR: #13565
 - Punctuation Marks in Timestamps by @monica-sekoyan :: PR: #13353
- Revert "Adding more doc-strings to megatron_parallel.py #12767" by @ko3n1g :: PR: #13824
 - reasoning model evaluation mmlu gpqa by @ruchaa-apte :: PR: #13880
 - Remove unused DynamicRetrievalServer and Bert dataset loader classes by @dimapihtar :: PR: #14209
 - Huvu/avlm qafix cherrypick from by @huvunvidia :: PR: #14253
 
Export
Changelog
- Improve Nemo2Exporter for Models Using Custom Modelling Files on HF by @suiyoubi :: PR: #13400
 - Adding more export tests by @oyilmaz-nvidia :: PR: #13410
 - Add Warning to Export when output_path exists by @suiyoubi :: PR: #13465
 - Move libsox-fmt-all from Dockerfile.ci.export_deploy to Dockerfile.ci by @chtruong814 :: PR: #13452
 - ci: Remove trt-llm breakpoint by @ko3n1g :: PR: #13499
 - Add Qwen2VL export_ckpt by @AtsunoriFujita :: PR: #13398
 - Add MLlama export_ckpt by @AtsunoriFujita :: PR: #13346
 - Update vLLMExporter to use vLLM V1 by @janekl :: PR: #13498
 - Add vLLM Mixtral and TRT-LLM qnemo export tests (plus a couple of bugfixes) by @janekl :: PR: #13697
 - Fix Qwen3 export + misc by @cuichenx :: PR: #13679
 - Extra int cast for successful tracing during ONNX export by @janekl :: PR: #13782
 - FP8 lora export by @cuichenx :: PR: #13748
 - Add PEFT export check by @cuichenx :: PR: #13835
 - Update llm api import_ckpt/export_ckpt docstring by @meatybobby :: PR: #13714
 - Use modelopt export and disable dataset calibration for weight only PTQ by @jenchen13 :: PR: #13756
 
Bugfixes
Uncategorized
Changelog
- build: various bumps by @ko3n1g :: PR: #13285
 - ci: Fixes to selective triggering by @ko3n1g :: PR: #13287
 - ci: Set timeout by @ko3n1g :: PR: #13294
 - Set L2_NeMo_2_T5_Pretraining test as optional by @chtruong814 :: PR: #13282
 - Add test environment approval step for CI by @chtruong814 :: PR: #13297
 - update num nodes in deepseek v3 finetune recipe by @cuichenx :: PR: #13314
 - ci: Increase cache pool by @ko3n1g :: PR: #13306
 - Rename adam_with_cosine_annealing as adam since cosin LR is not setup by @ShriyaRishab :: PR: #13315
 - ci: Update test queue bot to not assume a workflow is launched from a PR by @chtruong814 :: PR: #13318
 - Fix TE pytorch attention doc link by @thomasdhc :: PR: #13327
 - ci: Add all recent buildcaches to update-buildcache job by @ko3n1g :: PR: #13289
 - Fix neva notebook by @yaoyu-33 :: PR: #13334
 - Fix transformer offline for CI/CD llama4 tests by @yaoyu-33 :: PR: #13339
 - [automodel] convert lm head to full tensor before passing to lce by @yuanzhedong :: PR: #13319
 - ci: No dups in queue by @ko3n1g :: PR: #13352
 - ci(hotfix): VLM CPU unit tests by @ko3n1g :: PR: #13348
 - vLLM==0.8.5 update by @janekl :: PR: #13350
 - ci: Allow bypassing approval by @ko3n1g :: PR: #13365
 - Avoid the need to specify optional attributes for lhotse/nemo reader functions by @pzelasko :: PR: #13307
 - ci: Fix selective-triggering for non-PR events by @ko3n1g :: PR: #13374
- ci: Revert no-concurrency-group-on-main by @ko3n1g :: PR: #13375
 - ci: Improve no-fail-fast mechanism by @ko3n1g :: PR: #13370
 - 2d buckets estimation fix by @monica-sekoyan :: PR: #13377
 - ci: Fix scheduled runs by @ko3n1g :: PR: #13378
 - Ko3n1g/ci/fix nightly runs by @ko3n1g :: PR: #13382
 - [automodel] fix none issue in dataset for qwen model by @yuanzhedong :: PR: #13311
 - update table by @akoumpa :: PR: #13397
 - Improve test coverage for audio modules by @anteju :: PR: #13333
 - Disable failing maxine loss test by @anteju :: PR: #13361
 - Ko3n1g/ci/no notification on cancel by @ko3n1g :: PR: #13403
 - document fp8_recipe by @akoumpa :: PR: #13405
 - Weekly bump main by @ko3n1g :: PR: #13408
 - Handle boolean args for performance scripts and log received config by @guyueh1 :: PR: #13291
 - [automodel] add FirstRankPerNode by @akoumpa :: PR: #13373
 - tests: Disable flaky audio test by @ko3n1g :: PR: #13429
 - ci: Disable flaky audio test by @ko3n1g :: PR: #13435
 - Fix loss compute and reduction by @xrennvidia :: PR: #13295
 - ci: Skip link check on github links by @chtruong814 :: PR: #13425
 - Add NCCL cfg interface to perf scripts by @erhoo82 :: PR: #13407
- ci: Success only if "Run CICD" label attached by @ko3n1g :: PR: #13430
 - ci: Add tests to selective triggering by @ko3n1g :: PR: #13404
 - ci: Remove jq by @ko3n1g :: PR: #13440
 - ci: Fix deps tree for tests by @ko3n1g :: PR: #13443
 - Ko3n1g/ci/fix dependency tree by @ko3n1g :: PR: #13448
 - Adding additional unit tests for the deploy module by @pthombre :: PR: #13411
 - [Audio] fix a flaky test (and also make some tests run faster) by @racoiaws :: PR: #13439
 - [automodel] ignore tail padding in TPS calculation by @akoumpa :: PR: #13329
 - Ko3n1g/ci/selective triggering 3 by @ko3n1g :: PR: #13460
 - ci: Disable broken neva tests by @ko3n1g :: PR: #13461
 - fix speechlm data module by @stevehuang52 :: PR: #13362
 - ci: Enter queue only with passing linting by @ko3n1g :: PR: #13462
 - Adding tests for Schroedinger Bridge model by @nasretdinovr :: PR: #13401
 - add more detailed description by @dimapihtar :: PR: #13464
 - [Audio] tests for score-based and flow matching enhancement models by @racoiaws :: PR: #13406
 - Use expandable cuda memory segmentation by @erhoo82 :: PR: #13418
 - Fix llava tokenizer caused nan issue by @yaoyu-33 :: PR: #13466
 - Remove cuda method from ModelPT by @erastorgueva-nv :: PR: #13394
 - Fix BNR 2 unit test + input, case where input length was not specified by @nitin9252 :: PR: #13467
 - ci: Do not run any tests if no match is found by @ko3n1g :: PR: #13479
 - Ko3n1g/ci/selective triggering 4 by @ko3n1g :: PR: #13489
 - Fix typo in the performance script by @youngeunkwon0405 :: PR: #13487
 - ci: No runs on main by @ko3n1g :: PR: #13490
 - ci: Upload on schedule by @ko3n1g :: PR: #13491
 - ci: Run selective triggering on dockerfiles and dependencies by @ko3n1g :: PR: #13493
 - [automodel] fallback FP8 + LCE -> FP8 + CE by @akoumpa :: PR: #13349
- Update changelog for r2.3.0 by @github-actions[bot] :: PR: #13501
 - Update 2.3.0...
 
NVIDIA Neural Modules 2.3.2
This release addresses known security issues. For the latest NVIDIA Vulnerability Disclosure Information, visit https://www.nvidia.com/en-us/security/. For acknowledgement, please reach out to the NVIDIA PSIRT team at [email protected]