Conversation

@gongshaotian
No description provided.

lizhenyun01 and others added 13 commits September 24, 2025 21:32
* support cudagraph use shared pool

* add envs

* change CUDAGRAPH_POOL_ID to int

* change CUDAGRAPH_POOL_ID to use_memory_pool

* unify use_unique_memory_pool

* fix use_unique_memory_pool
* delete default value reasoning_max_tokens

* Adjust max_tokens and reasoning_max_tokens logic
* fix

* fix

* fix

* [Feature] support clear data

* update

* fix

* fix

* fix

* fix

* [BugFix] fix clear data

* Update api_server.py

* Update api_server.py

---------

Co-authored-by: Jiang-Jia-Jun <[email protected]>
…for mtp (PaddlePaddle#4189)

* fix top_p_candidates

* For separate setting params for mtp

* delete print

* fix
* fix

* fix

* fix

* [Feature] support clear data

* update

* fix

* fix

* fix

* fix

* [BugFix] fix clear data

* Update api_server.py

* Update api_server.py

* [Feature] support fd decode response

* Update engine.py

* Update envs.py

* Update expert_service.py

* Update common_engine.py

---------

Co-authored-by: Jiang-Jia-Jun <[email protected]>
Co-authored-by: ltd0924 <[email protected]>
This reverts commit 3cfe837.
@littledgg littledgg merged commit 0b3efa3 into littledgg:0908mtp Oct 9, 2025
@gongshaotian gongshaotian deleted the 0908mtp branch November 3, 2025 06:49

8 participants