[WIP] Enable async scheduling by default #27614
Code Review
This pull request enables asynchronous scheduling by default, which is a significant and beneficial change for performance. The implementation involves refactoring the configuration handling to introduce a disable_async_scheduling flag and automatically disabling async scheduling for incompatible features like pipeline parallelism and speculative decoding. The core execution flow is also refactored into a two-step process (execute_model and sample_tokens) to support async structured outputs. The changes are extensive and touch many parts of the codebase, including configuration, core engine logic, executors, and tests.
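The automatic fallback described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: only the `disable_async_scheduling` flag name comes from the review text, while `SchedulingOptions`, its other fields, and `resolve_async_scheduling` are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SchedulingOptions:
    # Flag name taken from the review summary; the remaining fields are
    # hypothetical and do not mirror the project's real configuration classes.
    disable_async_scheduling: bool = False
    pipeline_parallel_size: int = 1
    speculative_decoding: bool = False


def resolve_async_scheduling(opts: SchedulingOptions) -> Tuple[bool, Optional[str]]:
    """Return (enabled, reason_if_disabled) for async scheduling.

    Async scheduling is on by default and is switched off, rather than
    rejected with an error, when an incompatible feature is requested.
    """
    if opts.disable_async_scheduling:
        return False, "disabled explicitly by the user"
    if opts.pipeline_parallel_size > 1:
        return False, "pipeline parallelism is not compatible"
    if opts.speculative_decoding:
        return False, "speculative decoding is not compatible"
    return True, None
```

Returning a reason alongside the boolean lets the caller log why the default was overridden instead of failing validation, which is the behavior the review says is intended.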
My review has identified a critical issue with the configuration validation that could lead to unexpected hard failures instead of the intended automatic disabling of async scheduling. I've also found a potential issue in the KV output aggregation logic that might not handle None outputs from all workers correctly. I've provided suggestions to address these issues.
The rest of the changes, including test updates, seem correct and consistent with the goal of this PR.
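To make the two-step flow mentioned in the review summary concrete, here is a toy sketch of splitting one step into a forward pass and a later sampling call. Only the method names `execute_model` and `sample_tokens` come from the review; the `ToyRunner` class, its signatures, and the `allowed` argument (standing in for a structured-output constraint) are assumptions made for illustration.

```python
from typing import Dict, Optional, Set


class ToyRunner:
    """Hypothetical runner that separates the forward pass from sampling."""

    def __init__(self) -> None:
        self._pending: Optional[Dict[int, float]] = None

    def execute_model(self, scores: Dict[int, float]) -> None:
        # Stand-in for the forward pass: store per-token scores and return
        # nothing, so sampling can happen in a separate, later call.
        self._pending = dict(scores)

    def sample_tokens(self, allowed: Optional[Set[int]] = None) -> int:
        # Second step: apply an optional constraint that may only become
        # available after the forward pass was launched, then pick greedily.
        if self._pending is None:
            raise RuntimeError("execute_model must be called before sample_tokens")
        scores, self._pending = self._pending, None
        if allowed is not None:
            scores = {tok: s for tok, s in scores.items() if tok in allowed}
        if not scores:
            raise ValueError("no token satisfies the constraint")
        return max(scores, key=lambda tok: scores[tok])


runner = ToyRunner()
runner.execute_model({1: 0.1, 2: 0.9, 3: 0.5})
token = runner.sample_tokens(allowed={1, 3})
```

In this sketch the constraint is supplied only at sampling time, which is the property that lets constraint computation overlap with the forward pass.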
💡 Codex Review
Here are some automated review suggestions for this pull request.
Signed-off-by: Nick Hill <[email protected]>
Currently just running the full CI test suite to flush out any issues.