[BugFix] Modify max_tokens and modify the log and fix #1103 #1097
hsliuustc0106 merged 2 commits into vllm-project:main
Conversation
  temperature: 0.9
  top_k: 50
- max_tokens: 4096
+ max_tokens: 2048 # TODO: The max_tokens of the async_chunk feature cannot exceed 2048.
is this model specific?
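The TODO attributes the 2048 cap to the async_chunk feature rather than to a specific model, but either way an explicit config-time check would surface the limit instead of relying on a comment. Below is a minimal sketch of such a guard, assuming hypothetical names (`ASYNC_CHUNK_MAX_TOKENS`, `SamplingConfig`, `validate_sampling_config`) that are not part of the vllm-omni codebase.

```python
# Hypothetical guard for the async_chunk limit; all names here are illustrative only.
from dataclasses import dataclass

ASYNC_CHUNK_MAX_TOKENS = 2048  # assumed limit, taken from the TODO in the config diff


@dataclass
class SamplingConfig:
    temperature: float = 0.9
    top_k: int = 50
    max_tokens: int = 2048


def validate_sampling_config(cfg: SamplingConfig, async_chunk_enabled: bool) -> None:
    """Fail fast instead of silently truncating output when async_chunk is enabled."""
    if async_chunk_enabled and cfg.max_tokens > ASYNC_CHUNK_MAX_TOKENS:
        raise ValueError(
            f"max_tokens={cfg.max_tokens} exceeds the async_chunk limit of "
            f"{ASYNC_CHUNK_MAX_TOKENS}; lower max_tokens or disable async_chunk."
        )


# Usage: validate_sampling_config(SamplingConfig(max_tokens=4096), async_chunk_enabled=True)
# raises ValueError instead of letting the request run past the supported length.
```

If the cap does turn out to be model specific, the same check could read the limit from the model's config rather than from a constant.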
Force-pushed from 6b551cf to 17bbd40.
Force-pushed from 5aeb593 to c79fb40.
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0d8e9d8679
lock_file = f"/dev/shm/shm_{put_key}_lockfile.lock"
with open(lock_file, "w") as lockf:
    fcntl.flock(lockf, fcntl.LOCK_EX)
    meta = shm_write_bytes(payload, name=put_key)
    fcntl.flock(lockf, fcntl.LOCK_UN)
Clean up per-chunk lock files to avoid /dev/shm leaks
This change creates a new lock file under /dev/shm for every put_key (request+stage+chunk) but there is no corresponding deletion anywhere in the connector. On long-running servers or high‑throughput workloads, these lock files will accumulate indefinitely and can exhaust /dev/shm inode/space limits, which will then cause future SHM writes/locks to fail. This only happens when many chunks are produced, but it’s a realistic production scenario for streaming workloads.
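One possible remedy, sketched below, is to unlink the lock file once the write completes. This is only an illustration of the suggestion, not the connector's actual fix; `write_fn` stands in for the `shm_write_bytes` helper shown in the diff above, and `locked_shm_write` is a hypothetical wrapper.

```python
# Sketch only: best-effort cleanup of the per-chunk lock file so /dev/shm does not
# accumulate one file per request+stage+chunk on long-running servers.
import fcntl
import os


def locked_shm_write(payload: bytes, put_key: str, write_fn):
    """write_fn stands in for the connector's shm_write_bytes(payload, name=...)."""
    lock_file = f"/dev/shm/shm_{put_key}_lockfile.lock"
    with open(lock_file, "w") as lockf:
        fcntl.flock(lockf, fcntl.LOCK_EX)
        try:
            meta = write_fn(payload, name=put_key)
        finally:
            fcntl.flock(lockf, fcntl.LOCK_UN)
    # Best-effort removal; another process may have already unlinked the file.
    try:
        os.unlink(lock_file)
    except FileNotFoundError:
        pass
    return meta
```

Note that unlinking while another writer still holds an fd to the old inode can let two writers lock different files, so an alternative is to keep a bounded pool of lock files or to remove them when the whole request finishes rather than per chunk.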
Signed-off-by: amy-why-3459 <[email protected]>
Force-pushed from 0d8e9d8 to 2b9c6e1.
… (vllm-project#1097) Signed-off-by: amy-why-3459 <[email protected]> Co-authored-by: Hongsheng Liu <[email protected]>
Purpose
Fix #1099
Test Plan
Test Result