-
Notifications
You must be signed in to change notification settings - Fork 117
Change /unload to not wait for inflight requests (#125) #126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Sometimes upstreams can accept HTTP but never respond causing requests to build up waiting for a response. This can block Process.Stop() as that waits for inflight requests to finish. This change refactors the code to not wait when attempting to shutdown the process.
WalkthroughThe changes introduce a Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant ProxyManager
participant ProcessGroup
participant Process
Client->>ProxyManager: Request to reload config / unload models
ProxyManager->>ProcessGroup: StopProcesses(strategy)
alt StopWaitForInflightRequest
ProcessGroup->>Process: Stop()
Process->>Process: Wait for inflight requests
Process->>Process: StopImmediately()
else StopImmediately
ProcessGroup->>Process: StopImmediately()
end
Process->>ProcessGroup: Stopped
ProcessGroup->>ProxyManager: All stopped
ProxyManager->>Client: Reload/unload complete
Possibly related PRs
Tip ⚡️ Faster reviews with caching
Enjoy the performance boost—your workflow just got faster. ✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
proxy/process.go (2)
33-38: New stop strategy enumeration.A new
StopStrategytype is introduced with two constants to represent different stopping behaviors:
StopImmediately: Stop the process immediately without waiting for in-flight requestsStopWaitForInflightRequest: Wait for in-flight requests to complete before stoppingConsider adding more detailed documentation for the
StopStrategytype to explain the semantics of each strategy:// StopStrategy defines the strategy to stop processes. +// It controls whether a process should be stopped immediately or wait for in-flight +// requests to complete first. type StopStrategy int const ( + // StopImmediately terminates the process without waiting for in-flight requests StopImmediately StopStrategy = iota + // StopWaitForInflightRequest waits for all in-flight requests to complete before terminating StopWaitForInflightRequest )
335-356: NewStopImmediately()method for immediate process termination.This new method implements the actual process stopping logic without waiting for in-flight requests, allowing the
/unloadendpoint to stop processes without being blocked by hanging requests.The logging is slightly inconsistent. The debug message uses "Stopping process" while the info message refers to "Stop()". Consider updating the info messages to be clearer about which method is being called:
- p.proxyLogger.Infof("<%s> Stop() Ready -> StateStopping err: %v, current state: %v", p.ID, err, curState) + p.proxyLogger.Infof("<%s> StopImmediately() Ready -> StateStopping err: %v, current state: %v", p.ID, err, curState)Similarly for the second info message:
- p.proxyLogger.Infof("<%s> Stop() StateStopping -> StateStopped err: %v, current state: %v", p.ID, err, curState) + p.proxyLogger.Infof("<%s> StopImmediately() StateStopping -> StateStopped err: %v, current state: %v", p.ID, err, curState)
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
llama-swap.go(1 hunks)proxy/process.go(3 hunks)proxy/process_test.go(1 hunks)proxy/processgroup.go(2 hunks)proxy/processgroup_test.go(2 hunks)proxy/proxymanager.go(4 hunks)proxy/proxymanager_test.go(10 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (4)
llama-swap.go (1)
proxy/process.go (1)
StopWaitForInflightRequest(37-37)
proxy/process_test.go (1)
proxy/process.go (4)
NewProcess(72-96)StateReady(23-23)StopImmediately(36-36)StateStopped(21-21)
proxy/processgroup.go (1)
proxy/process.go (2)
StopStrategy(33-33)StopImmediately(36-36)
proxy/proxymanager.go (1)
proxy/process.go (3)
StopStrategy(33-33)StopWaitForInflightRequest(37-37)StopImmediately(36-36)
🔇 Additional comments (13)
proxy/processgroup_test.go (2)
49-49: Updated to use explicit stop strategy parameter.The code has been correctly modified to pass
StopWaitForInflightRequestto the updatedStopProcessesmethod, ensuring that test teardown maintains the expected behavior of waiting for in-flight requests to complete.
77-77: Updated to use explicit stop strategy parameter.This change correctly passes
StopWaitForInflightRequestto theStopProcessesmethod in the second test, maintaining consistent test teardown behavior.llama-swap.go (1)
87-87: Updated configuration reload to wait for in-flight requests.The code has been correctly modified to explicitly pass
proxy.StopWaitForInflightRequestwhen stopping processes during configuration reload. This ensures that the server waits for any in-progress requests to complete before updating the configuration, maintaining a smooth user experience during reloads.proxy/process_test.go (1)
376-395: Added test for immediate process termination.The new test correctly validates that
StopImmediately()can terminate a process with in-flight requests without waiting for them to complete. The test correctly:
- Starts a process and verifies it reaches
StateReady- Launches a slow request in a separate goroutine
- Calls
StopImmediately()and verifies the process transitions toStateStoppedimmediatelyThis test is crucial for validating the core functionality of this PR, which is to allow immediate termination without waiting for in-flight requests.
proxy/processgroup.go (2)
79-79: Updated method signature to support different stop strategies.The
StopProcessesmethod signature has been correctly modified to accept aStopStrategyparameter, allowing callers to control whether processes should be stopped immediately or should wait for in-flight requests to complete.
93-98: Added strategy-based process stopping logic.The implementation correctly handles different stop strategies:
StopImmediately: Callsprocess.StopImmediately()to terminate the process without waiting- Default (including
StopWaitForInflightRequest): Callsprocess.Stop()to gracefully wait for requestsThis implementation maintains backward compatibility by defaulting to the waiting behavior while adding support for immediate termination when needed.
proxy/proxymanager_test.go (2)
30-30: Parameter added to match new signature.The
StopWaitForInflightRequestparameter is now required when callingStopProcesses()and will ensure the test cleanup waits for in-flight requests to complete before stopping processes.
66-66: Test cleanup now uses explicit stop strategy.The test cleanup consistently uses the
StopWaitForInflightRequeststrategy across all test functions, which is appropriate for tests as we want to ensure all test-initiated requests complete before stopping the processes.Also applies to: 108-108, 144-144, 355-355, 410-410, 464-464, 560-560, 589-589, 607-607
proxy/proxymanager.go (3)
221-221: Strategy propagated to process groups.The stop strategy is now properly propagated to each process group, ensuring consistent stopping behavior.
507-507:/unloadendpoint now stops processes immediately.This is the key change in the PR - the
/unloadendpoint now usesStopImmediatelyso it will no longer block waiting for in-flight requests that might be hanging, addressing the issue with unresponsive upstream servers.
263-263:❓ Verification inconclusive
Exclusive mode uses wait strategy when stopping other groups.
When a process group is running in exclusive mode, other process groups are stopped using the
StopWaitForInflightRequeststrategy. This ensures that in-flight requests to other groups complete before stopping them.Is this behavior intentional? It means that even if the caller wants to stop everything immediately, groups stopped due to exclusivity will still wait for in-flight requests. Run this script to check if there are any other places in the codebase where process groups are stopped:
🏁 Script executed:
#!/bin/bash # Find all calls to StopProcesses to verify the strategy usage rg -A 2 -B 2 'StopProcesses' --glob '*.go'Length of output: 5596
Confirm hardcoded wait strategy on non-target process groups
In
ProxyManager.StopProcesses(proxy/proxymanager.go:263), any “other” group is always stopped withStopWaitForInflightRequest, regardless of thestrategypassed in. Concretely:
- In
unloadAllModelsHandler(proxy/proxymanager.go), we callpm.StopProcesses(StopImmediately), but all other groups still useStopWaitForInflightRequest.- In
llama-swap.go, upgrades also unconditionally useStopWaitForInflightRequest.- No existing test covers the
StopImmediatelypath for non-target groups.Is it intentional to always drain in-flight requests on secondary groups? If you want to respect the caller’s strategy, update the code to:
- otherGroup.StopProcesses(StopWaitForInflightRequest) + otherGroup.StopProcesses(strategy)proxy/process.go (2)
323-333:Stop()now waits for in-flight requests before stopping.The
Stop()method has been refactored to explicitly wait for in-flight requests to complete before delegating toStopImmediately(). This improves separation of concerns and makes the code more maintainable.
360-361: Enhanced docstring clarification.The docstring now clearly states that once a process reaches the
StateShutdownstate, it cannot be restarted, which is an important detail for maintainers to understand.
Sometimes upstreams can accept HTTP but never respond causing requests to build up waiting for a response. This can block Process.Stop() as that waits for inflight requests to finish. This change refactors the code to not wait when attempting to shutdown the process.
Summary by CodeRabbit