
Conversation

@mostlygeek mostlygeek (Owner) commented Jun 5, 2025

When upstream processes fail, crash, or get stuck, the only way to resolve the problem is to restart llama-swap. Reliability is a goal, and having to restart llama-swap because of upstream process issues works against that. Ideally, llama-swap should only need to be restarted for:

  • configuration changes
  • upgrading versions.

Since the Process management code is quite complex at this point, the top-line goals are:

  1. remove StateFailed so upstream processes will always be retried. Make it an operator task to resolve starting issues (ref: Failed state considered unrecoverable? #120)
  2. Simplify start/stop/shutdown with Go's built-in exec.CommandContext. This should reduce code as well (a minimal sketch of the pattern follows this list).
  3. Few or no changes to the current test suite.
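
For reference, a minimal sketch of the exec.CommandContext pattern from goal 2. This is illustrative only; the command, signal, and timeout choices are assumptions, not the llama-swap implementation:

package main

import (
	"context"
	"log"
	"os/exec"
	"syscall"
	"time"
)

func main() {
	ctx, cancel := context.WithCancel(context.Background())

	cmd := exec.CommandContext(ctx, "sleep", "60") // placeholder upstream command
	cmd.Cancel = func() error {
		// called when ctx is cancelled; try a graceful stop first
		return cmd.Process.Signal(syscall.SIGTERM)
	}
	cmd.WaitDelay = 5 * time.Second // Go sends SIGKILL if still alive after this

	if err := cmd.Start(); err != nil {
		log.Fatal(err)
	}
	cancel()                            // request shutdown via the context
	log.Printf("exit: %v", cmd.Wait()) // returns once the process is gone
}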

Summary by CodeRabbit

  • New Features

    • Introduced a new executable program for testing process termination scenarios, including handling of signals and process timeouts.
  • Refactor

    • Improved process management by integrating context-based cancellation for cleaner shutdown and stop operations.
    • Enhanced process lifecycle handling with dedicated wait routines and consolidated stop logic for better reliability.
    • Removed the failed state from process lifecycle, simplifying state transitions and request handling.
    • Streamlined health check and error logging for more straightforward process monitoring.
  • Bug Fixes

    • Updated process state handling to correctly reflect stopped status after upstream command exits prematurely but successfully.
    • Adjusted error responses for broken model configurations to improve clarity and consistency.

- remove Process.shutdownCtx, Process.shutdownCancel
- simplify logic by checking Process.CurrentState() instead of more complex channel management

coderabbitai bot commented Jun 5, 2025

"""

Walkthrough

A new Go program is introduced to test process termination behaviors using exec.CommandContext. Additionally, the shutdown context and cancellation mechanism are removed from the Process struct in the proxy, simplifying shutdown management to rely solely on process state checks and context cancellation with a custom cancel function.

Changes

File(s) — Change Summary

misc/process-cmd-test/main.go — Added a new Go program for testing process termination scenarios and signal handling with subprocesses.
proxy/process.go — Removed shutdown context/cancel fields and logic from Process; replaced the inline wait goroutine with a waitForCmd() method; introduced the cancelUpstream context cancel func and a custom Cancel for exec.CommandContext; refactored shutdown and stop logic to use context cancellation and process state checks; removed the StateFailed state and related logic.
proxy/process_test.go — Updated TestProcess_ExitInterruptsHealthCheck to expect StateStopped instead of StateFailed after the upstream command exits prematurely but successfully; removed tests involving StateFailed; adjusted error expectations in TestProcess_BrokenModelConfig.
proxy/proxymanager.go — Removed handling of StateFailed in the process state string representation, causing StateFailed to be shown as "Unknown".

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant MainProgram
    participant ChildProcess

    User->>MainProgram: Start program
    MainProgram->>ChildProcess: Start child process (with/without SIGTERM ignore)
    MainProgram->>MainProgram: Set up signal listener (SIGINT, SIGTERM)
    User->>MainProgram: Send OS signal (SIGINT/SIGTERM)
    MainProgram->>MainProgram: Cancel context (triggers custom cancel logic)
    MainProgram->>ChildProcess: Send SIGTERM
    alt Child ignores SIGTERM
        MainProgram->>ChildProcess: Wait 500ms, then send SIGKILL
    end
    ChildProcess-->>MainProgram: Exit
    MainProgram->>User: Print exit status and errors


📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8f68dd6 and df36c41.

📒 Files selected for processing (3)
  • proxy/process.go (12 hunks)
  • proxy/process_test.go (3 hunks)
  • proxy/proxymanager.go (0 hunks)
💤 Files with no reviewable changes (1)
  • proxy/proxymanager.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • proxy/process_test.go
🧰 Additional context used
🧠 Learnings (1)
proxy/process.go (1)
Learnt from: mostlygeek
PR: mostlygeek/llama-swap#155
File: proxy/process.go:359-372
Timestamp: 2025-06-05T22:57:53.766Z
Learning: In Go, cmdWaitChan in proxy/process.go is a signal channel (chan struct{}) that follows the pattern where waitForCmd() closes the channel when cmd.Wait() returns, and stopCommand() waits on it. Reading from a closed channel returns immediately, so <-cmdWaitChan won't block indefinitely.
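
A runnable sketch of that signal-channel pattern, with hypothetical names mirroring cmdWaitChan (the command and prints are placeholders):

package main

import (
	"context"
	"fmt"
	"os/exec"
)

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	cmd := exec.CommandContext(ctx, "sleep", "60")
	if err := cmd.Start(); err != nil {
		panic(err)
	}

	// signal channel: closed exactly once when cmd.Wait() returns
	cmdWaitChan := make(chan struct{})
	go func() { // plays the role of waitForCmd()
		fmt.Println("cmd.Wait() returned:", cmd.Wait())
		close(cmdWaitChan)
	}()

	cancel()      // plays the role of cancelUpstream()
	<-cmdWaitChan // a receive on a closed channel never blocks
	fmt.Println("stopped")
}
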
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: run-tests
🔇 Additional comments (12)
proxy/process.go (12)

43-47: Good simplification of the cancellation mechanism.

The change from chan error to chan struct{} aligns with the signal channel pattern, and the introduction of cancelUpstream provides a cleaner cancellation mechanism using Go's context.


68-73: LGTM: Maintaining backward compatibility during migration.

The comments and field definitions look good for the transition period.


85-85: Constructor properly initializes new fields.

The initialization of cancelUpstream to nil and cmdWaitChan as a struct channel is correct for the new design.

Also applies to: 95-99


140-140: Correctly removes StateFailed from valid transitions.

The updated transition rule properly eliminates StateFailed while maintaining logical state flow.


208-210: Properly handles start failure without StateFailed.

The change to transition to StateStopped instead of StateFailed on start failure aligns with the PR objective to always retry processes.


219-219: Good refactoring to use dedicated waitForCmd method.

Moving the command waiting logic to a separate method improves code organization and testability.


302-303: Proper reset of failure count on successful start.

Resetting failedStartCount to 0 on successful start maintains proper retry tracking.


327-333: Clean simplification of StopImmediately method.

The streamlined logic with a single call to stopCommand() improves clarity and maintainability.


344-347: Appropriate simplification for shutdown logic.

The streamlined approach with direct state setting for the terminal StateShutdown state is correct.


351-364: Excellent simplification leveraging Go's cancellation mechanism.

The new implementation is much cleaner, using cancelUpstream() and the signal channel pattern. This aligns with the learning that cmdWaitChan is safe to wait on since it gets closed when the command exits.


396-396: Correctly removes StateFailed from request blocking conditions.

Removing StateFailed from the blocking states aligns with the PR objective to always retry processes instead of permanently failing them.


516-558: Comprehensive graceful stop implementation.

The method provides excellent graceful stopping functionality with proper Windows support, PID replacement, and error handling. This enhances the reliability of process management.


@mostlygeek mostlygeek self-assigned this Jun 5, 2025
@mostlygeek mostlygeek added the enhancement New feature or request label Jun 5, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🔭 Outside diff range comments (3)
proxy/process.go (3)

139-139: 🛠️ Refactor suggestion

⚠️ Potential issue

Remove StateFailed transitions per PR objectives

The PR objectives require removing StateFailed to ensure automatic retries. However, the code still transitions to StateFailed in multiple places:

  • Line 139: StateStarting → StateFailed transition
  • Lines 203-209: Command start failure → StateFailed
  • Lines 269-273: Health check timeout → StateFailed
  • Lines 277-281: Command exit with error → StateFailed
  • Lines 284-288: Command exit without error → StateFailed

These should transition to StateStopping instead to align with the PR's reliability goals.

Also applies to: 203-209, 269-273, 277-281, 284-288


394-394: ⚠️ Potential issue

Use swapState() for thread-safe state transitions

Directly assigning to p.state bypasses thread safety and state transition validation. This could lead to race conditions.

-	// just force it to this state since there is no recovery from shutdown
-	p.state = StateShutdown
+	if curState, err := p.swapState(StateStopping, StateShutdown); err != nil {
+		p.proxyLogger.Warnf("<%s> Shutdown() failed to transition to StateShutdown: %v, current state: %v", p.ID, err, curState)
+	}

28-28: 💡 Verification agent

🧩 Analysis chain

StateFailed should be removed according to PR objectives

The PR objectives state that StateFailed status should be removed so that upstream processes will always be retried automatically. However, this constant is still defined in the code.


🏁 Script executed:

#!/bin/bash
# Description: Check for remaining references to StateFailed in the codebase

# Search for StateFailed usage
ast-grep --pattern 'StateFailed'

# Also check with ripgrep for any string references
rg "StateFailed" -A 2 -B 2

Length of output: 6796


Remove all StateFailed references as per PR objectives

We still have numerous definitions and usages of StateFailed that contradict the goal of always retrying upstream processes. Please remove StateFailed (constant, state‐machine transitions, error handling, and tests) and adjust the logic to skip any “failed” state.

Key locations to update:

  • proxy/process.go
    • Line 28: StateFailed constant definition
    • Line 139–144: isValidTransition cases including StateFailed
    • Swap‐state calls around lines 203, 269, 277, 284, 362–363 for transitions into/out of StateFailed
    • Health‐check and exit‐error branches (lines 262–265, 286–295) that swap to StateFailed
    • Stop logic at lines 362–365 and request handling at lines 507–510
  • proxy/proxymanager.go
    • Case for StateFailed in status‐string mapping (around line 351)
  • proxy/process_test.go
    • All test cases expecting StateFailed transitions or current state

Ensure the state machine, logging, HTTP responses, and tests are updated so that failures immediately trigger retry behavior instead of a “failed” terminal state.

🧹 Nitpick comments (1)
misc/process-cmd-test/main.go (1)

23-24: Consider orphaned process handling in production code

The test reveals that child processes become orphaned when the parent is killed with SIGKILL. This could lead to resource leaks in production if llama-swap crashes unexpectedly.

Consider implementing a process group or using cmd.SysProcAttr with Setpgid to ensure child processes are cleaned up even when the parent crashes. This would improve the reliability goals mentioned in the PR objectives.
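
A Unix-only sketch of the Setpgid idea (the binary name is a placeholder; this is one possible approach, not code from the PR):

package main

import (
	"os/exec"
	"syscall"
)

func main() {
	cmd := exec.Command("some-upstream-server") // placeholder binary
	// put the child in its own process group so the whole tree
	// can be signalled together
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	if err := cmd.Start(); err != nil {
		panic(err)
	}
	// a negative PID targets the entire process group (Unix only)
	_ = syscall.Kill(-cmd.Process.Pid, syscall.SIGTERM)
	_ = cmd.Wait()
}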

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a84098d and c699aab.

📒 Files selected for processing (2)
  • misc/process-cmd-test/main.go (1 hunks)
  • proxy/process.go (1 hunks)
🧰 Additional context used
🪛 golangci-lint (1.64.8)
misc/process-cmd-test/main.go

52-52: Error return value of cmd.Process.Signal is not checked

(errcheck)

Comment on lines 50 to 54
cmd.Cancel = func() error {
fmt.Println("✔︎ Cancel() called, sending SIGTERM")
cmd.Process.Signal(syscall.SIGTERM)
return nil
}

⚠️ Potential issue

Check error return value from Signal()

The error return value from cmd.Process.Signal should be checked, even in test code, to help debug signal delivery issues.

 	cmd.Cancel = func() error {
 		fmt.Println("✔︎ Cancel() called, sending SIGTERM")
-		cmd.Process.Signal(syscall.SIGTERM)
-		return nil
+		if err := cmd.Process.Signal(syscall.SIGTERM); err != nil {
+			fmt.Printf("Failed to send SIGTERM: %v\n", err)
+			return err
+		}
+		return nil
 	}



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
misc/process-cmd-test/main.go (1)

51-54: ⚠️ Potential issue

Check error return value from Signal()

The error return value from cmd.Process.Signal should be checked, even in test code, to help debug signal delivery issues.

 	cmd.Cancel = func() error {
 		fmt.Println("✔︎ Cancel() called, sending SIGTERM")
-		cmd.Process.Signal(syscall.SIGTERM)
+		if err := cmd.Process.Signal(syscall.SIGTERM); err != nil {
+			fmt.Printf("Failed to send SIGTERM: %v\n", err)
+			return err
+		}
 
 		//return nil

🧹 Nitpick comments (1)
misc/process-cmd-test/main.go (1)

40-44: Make the test binary path configurable for cross-platform compatibility.

The hardcoded path ../../build/simple-responder_darwin_arm64 is platform-specific. Consider making it configurable via command-line arguments or environment variables to support testing on different platforms.

+import "flag"
+
+var binaryPath = flag.String("binary", "../../build/simple-responder_darwin_arm64", "Path to the test binary")
+
 func main() {
+    flag.Parse()
 
     // swap between these to use kill -9 <pid> on the cli to sim external crash
     ctx, cancel := context.WithCancel(context.Background())
     //ctx, cancel := context.WithTimeout(context.Background(), 1000*time.Millisecond)
     defer cancel()
 
     //cmd := exec.CommandContext(ctx, "sleep", "1")
     cmd := exec.CommandContext(ctx,
-        "../../build/simple-responder_darwin_arm64",
+        *binaryPath,
         //"-ignore-sig-term", /* so it doesn't exit on receiving SIGTERM, test cmd.WaitTimeout */
     )
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1e16c83 and 16da4c3.

📒 Files selected for processing (2)
  • misc/process-cmd-test/main.go (1 hunks)
  • proxy/process.go (9 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
proxy/process.go (1)
proxy/config.go (1)
  • SanitizeCommand (297-330)
🪛 golangci-lint (1.64.8)
misc/process-cmd-test/main.go

53-53: Error return value of cmd.Process.Signal is not checked

(errcheck)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: run-tests
🔇 Additional comments (7)
misc/process-cmd-test/main.go (1)

68-91: Well-structured test program for process lifecycle management.

The main execution flow properly handles command lifecycle, signal handling, and provides clear logging for debugging different termination scenarios.

proxy/process.go (6)

46-78: Good migration path with clear deprecation markers.

The addition of cancelUpstream and clear documentation of fields to be removed provides a clean migration path to exec.CommandContext.


201-226: Correct implementation of exec.CommandContext with proper lifecycle management.

The command setup properly configures cancellation context, custom Cancel function, and wait delay. The waitForCmd() goroutine ensures proper handling of command exit.


392-428: Clean refactoring to use context cancellation.

The simplified stopCommand() properly uses the new cancellation mechanism and handles various exit scenarios comprehensively.


544-565: Well-designed command exit handler with proper race condition handling.

The waitForCmd() method correctly handles command exit, state transitions, and race conditions with SIGKILL.


373-388: Excellent simplification of stop and shutdown logic.

The refactored methods are much cleaner and rely on the context cancellation mechanism, aligning perfectly with the PR objectives.


567-609:

✅ Verification successful

Comprehensive graceful shutdown implementation with platform-specific handling.

The cmdStopUpstreamProcess() method properly handles:

  • Windows-specific stop commands with taskkill
  • Custom stop command execution with PID substitution
  • Fallback to SIGTERM for Unix systems
  • Proper error handling and logging

Let me verify that the SanitizeCommand function properly handles the PID-substituted command string:


🏁 Script executed:

#!/bin/bash
# Description: Verify SanitizeCommand usage and implementation
# Test: Check if SanitizeCommand is defined and handles command strings properly
ast-grep --pattern 'func SanitizeCommand($_) $$$'

Length of output: 1587


Review Approved: cmdStopUpstreamProcess() and SanitizeCommand Verified

The cmdStopUpstreamProcess() method correctly handles:

  • Windows default (taskkill /f /t /pid ${PID}) and custom stop commands
  • PID substitution followed by proper sanitization via SanitizeCommand (proxy/config.go:297–330)
  • OS-specific argument splitting (Posix vs. Windows)
  • Fallback to SIGTERM on Unix
  • Thorough error handling and logging

No further changes are required.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
proxy/process.go (1)

260-260: Error message could be more accurate.

The error message "health check interrupted due to shutdown" may be misleading since the health check can be interrupted when transitioning to states other than shutdown (e.g., when the upstream exits and transitions to StateStopped).

🧹 Nitpick comments (2)
proxy/process.go (2)

532-535: Consider logging when forcing StateStopped.

While forcing StateStopped ensures the process ends in a recoverable state, the current implementation at line 534 could mask unexpected state transitions. Consider using swapState with appropriate error handling or at least log at ERROR level when this forced transition occurs to aid debugging.


540-581: Well-implemented graceful stop handler.

The cmdStopUpstreamProcess method provides excellent flexibility:

  • Supports custom stop commands with PID substitution
  • Falls back to SIGTERM for standard cases
  • Good Windows support with default taskkill command
  • Proper error handling and logging

Consider adding a timeout for the custom stop command execution to prevent hanging if the stop command itself becomes unresponsive.
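
One way to add such a timeout, sketched with a placeholder stop command and an assumed 10-second bound:

package main

import (
	"context"
	"log"
	"os/exec"
	"time"
)

func main() {
	stopArgs := []string{"kill", "-TERM", "12345"} // placeholder stop command
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// the context kills the stop command itself if it hangs
	stopCmd := exec.CommandContext(ctx, stopArgs[0], stopArgs[1:]...)
	if err := stopCmd.Run(); err != nil {
		log.Printf("stop command failed or timed out: %v", err)
	}
}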

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 16da4c3 and fddc14f.

📒 Files selected for processing (2)
  • proxy/process.go (9 hunks)
  • proxy/process_test.go (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
proxy/process_test.go (1)
proxy/process.go (1)
  • StateStopped (22-22)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: run-tests
🔇 Additional comments (5)
proxy/process_test.go (1)

342-342: LGTM! State transition aligns with PR objectives.

The change from StateFailed to StateStopped is correct. This ensures that when an upstream process exits prematurely, it transitions to a recoverable state that allows automatic retry on the next request, which aligns with the PR's goal of improving reliability.

proxy/process.go (4)

46-78: Good refactoring approach with clear migration path.

The struct changes properly support the migration to exec.CommandContext. The temporary fields are clearly marked for removal, which helps maintain code clarity during the transition. The change from chan error to chan struct{} for cmdWaitChan is appropriate since it's now used purely for signaling command completion.


201-211: Excellent implementation of exec.CommandContext with custom cancellation.

The migration to exec.CommandContext is well-implemented:

  • Custom Cancel function allows graceful shutdown via cmdStopUpstreamProcess
  • WaitDelay provides time for cleanup before forceful termination
  • Creating a fresh cmdWaitChan for each start prevents channel reuse issues
  • Launching waitForCmd() as a separate goroutine properly handles command exit

This approach significantly simplifies the shutdown logic compared to manual signal handling.

Also applies to: 226-226


366-379: Excellent simplification of stopCommand().

The refactored method is much cleaner by leveraging exec.CommandContext:

  • Removes complex manual signal handling
  • Eliminates timeout management (now handled by cmd.WaitDelay)
  • Good defensive programming with the nil check
  • Clear and concise implementation

This is a significant improvement in code maintainability.


257-259: Critical change for automatic retry functionality.

This logic correctly handles the case where the upstream process exits cleanly during health check. By checking for StateStopped and returning a descriptive error, the system can automatically retry the process on the next request instead of permanently failing. This is a key improvement for reliability.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (2)
proxy/process.go (2)

248-254: Inconsistent with PR objectives and past review feedback.

Two issues here:

  1. The different handling for StateStopped vs other states contradicts the PR objective of always retrying upstream processes
  2. The error message "health check interrupted due to shutdown" was flagged in past reviews as misleading
	currentState := p.CurrentState()
	if currentState != StateStarting {
-		if currentState == StateStopped {
-			return fmt.Errorf("upstream command exited prematurely but successfully")
-		}
-		return errors.New("health check interrupted due to shutdown")
+		return fmt.Errorf("health check interrupted: process state changed to %s", currentState)
	}

207-216: Part of StateFailed inconsistency issue.

This error handling transitions to StateFailed when command start fails, which contradicts the PR objective of removing StateFailed to ensure automatic retries.

Consider updating this to transition to StateStopped instead, allowing for automatic retry on the next request.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fddc14f and 8f68dd6.

📒 Files selected for processing (1)
  • proxy/process.go (9 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: run-tests
🔇 Additional comments (4)
proxy/process.go (4)

46-50: LGTM! Struct field changes align with new cancellation approach.

The addition of cancelUpstream and conversion of cmdWaitChan to a simple notification channel properly supports the new exec.CommandContext pattern.

Also applies to: 85-85, 98-99


28-28: ⚠️ Potential issue

Critical inconsistency with PR objectives.

The PR objectives state that StateFailed should be removed so upstream processes are always retried automatically, but the code still references StateFailed in multiple locations. This contradicts the reliability improvements intended by this PR.

Consider removing all StateFailed references and updating the logic to retry failed processes automatically:

const (
	StateStopped  ProcessState = ProcessState("stopped")
	StateStarting ProcessState = ProcessState("starting")
	StateReady    ProcessState = ProcessState("ready")
	StateStopping ProcessState = ProcessState("stopping")
-
-	// failed a health check on start and will not be recovered
-	StateFailed ProcessState = ProcessState("failed")

	// process is shutdown and will not be restarted
	StateShutdown ProcessState = ProcessState("shutdown")
)

Update state transition logic and error handling to remove StateFailed transitions and ensure automatic retry behavior.

Also applies to: 140-140, 146-146, 209-209, 328-328, 404-404

Likely an incorrect or invalid review comment.


1-567: Good technical approach but critical alignment issues with PR objectives.

The refactor successfully implements the exec.CommandContext pattern and simplifies the stop logic as intended. However, the retention of StateFailed throughout the codebase directly contradicts the PR objectives of ensuring upstream processes are always retried automatically.

The key goals of this PR are not fully met. Consider completing the StateFailed removal to achieve the reliability improvements described in the PR objectives.


524-566:

✅ Verification successful

LGTM! Well-implemented graceful stop mechanism.

The method properly handles both custom stop commands and default SIGTERM, with appropriate Windows support. The PID substitution and command sanitization appear secure.

Verify that the SanitizeCommand function properly handles the PID substitution to prevent command injection:


🏁 Script executed:

#!/bin/bash
# Description: Check SanitizeCommand implementation for security
# Expected: Should find proper input validation and sanitization

ast-grep --pattern 'func SanitizeCommand($$$) {
  $$$
}'

Length of output: 58


🏁 Script executed:

#!/bin/bash
# Locate and display the SanitizeCommand implementation for review
rg -n "SanitizeCommand" -C 5 .

Length of output: 5131


🏁 Script executed:

#!/bin/bash
# Display the SanitizeCommand implementation for security review
rg -n "func SanitizeCommand" -A 50 proxy/config.go

Length of output: 1044


Safe PID Substitution Confirmed

SanitizeCommand simply parses the command string into a slice of arguments (handling quotes, comments, backslashes) without invoking a shell, and ${PID} is replaced via fmt.Sprintf("%d", pid), ensuring it’s strictly numeric. Since we call exec.Command(args[0], args[1:]...), there is no shell interpolation or injection risk from the PID value.
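
A toy illustration of why the substitution is injection-safe; strings.Fields stands in for the project's SanitizeCommand, and the values are made up:

package main

import (
	"fmt"
	"os/exec"
	"strings"
)

func main() {
	pid := 12345 // placeholder PID
	cmdStop := "taskkill /f /t /pid ${PID}"

	// ${PID} becomes a strictly numeric string before splitting, and
	// exec.Command receives argv directly (no shell), so the PID value
	// cannot smuggle in extra arguments
	line := strings.ReplaceAll(cmdStop, "${PID}", fmt.Sprintf("%d", pid))
	args := strings.Fields(line)
	cmd := exec.Command(args[0], args[1:]...)
	fmt.Println(cmd.Args) // [taskkill /f /t /pid 12345]
}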

Comment on lines +359 to 372
func (p *Process) stopCommand() {
stopStartTime := time.Now()
defer func() {
p.proxyLogger.Debugf("<%s> stopCommand took %v", p.ID, time.Since(stopStartTime))
}()

sigtermTimeout, cancelTimeout := context.WithTimeout(context.Background(), sigtermTTL)
defer cancelTimeout()

if p.cmd == nil || p.cmd.Process == nil {
p.proxyLogger.Debugf("<%s> cmd or cmd.Process is nil (normal during config reload)", p.ID)
if p.cancelUpstream == nil {
p.proxyLogger.Errorf("<%s> stopCommand has a nil p.cancelUpstream()", p.ID)
return
}

// if err := p.terminateProcess(); err != nil {
// p.proxyLogger.Debugf("<%s> Process already terminated: %v (normal during shutdown)", p.ID, err)
// }
// the default cmdStop to taskkill /f /t /pid ${PID}
if runtime.GOOS == "windows" && strings.TrimSpace(p.config.CmdStop) == "" {
p.config.CmdStop = "taskkill /f /t /pid ${PID}"
}

if p.config.CmdStop != "" {
// replace ${PID} with the pid of the process
stopArgs, err := SanitizeCommand(strings.ReplaceAll(p.config.CmdStop, "${PID}", fmt.Sprintf("%d", p.cmd.Process.Pid)))
if err != nil {
p.proxyLogger.Errorf("<%s> Failed to sanitize stop command: %v", p.ID, err)
return
}

p.proxyLogger.Debugf("<%s> Executing stop command: %s", p.ID, strings.Join(stopArgs, " "))

stopCmd := exec.Command(stopArgs[0], stopArgs[1:]...)
stopCmd.Stdout = p.processLogger
stopCmd.Stderr = p.processLogger
stopCmd.Env = p.config.Env

if err := stopCmd.Run(); err != nil {
p.proxyLogger.Errorf("<%s> Failed to exec stop command: %v", p.ID, err)
return
}
} else {
if err := p.cmd.Process.Signal(syscall.SIGTERM); err != nil {
p.proxyLogger.Errorf("<%s> Failed to send SIGTERM to process: %v", p.ID, err)
return
}
}

select {
case <-sigtermTimeout.Done():
p.proxyLogger.Debugf("<%s> Process timed out waiting to stop, sending KILL signal (normal during shutdown)", p.ID)
p.upstreamWasStoppedWithKill = true
if err := p.cmd.Process.Kill(); err != nil {
p.proxyLogger.Errorf("<%s> Failed to kill process: %v", p.ID, err)
}
case err := <-p.cmdWaitChan:
// Note: in start(), p.cmdWaitChan also has a select { ... }. That should be OK
// because if we make it here then the cmd has been successfully running and made it
// through the health check. There is a possibility that the cmd crashed after the health check
// succeeded but that's not a case llama-swap is handling for now.
if err != nil {
if errno, ok := err.(syscall.Errno); ok {
p.proxyLogger.Errorf("<%s> errno >> %v", p.ID, errno)
} else if exitError, ok := err.(*exec.ExitError); ok {
if strings.Contains(exitError.String(), "signal: terminated") {
p.proxyLogger.Debugf("<%s> Process stopped OK", p.ID)
} else if strings.Contains(exitError.String(), "signal: interrupt") {
p.proxyLogger.Debugf("<%s> Process interrupted OK", p.ID)
} else {
p.proxyLogger.Warnf("<%s> ExitError >> %v, exit code: %d", p.ID, exitError, exitError.ExitCode())
}
} else {
p.proxyLogger.Errorf("<%s> Process exited >> %v", p.ID, err)
}
}
}
p.cancelUpstream()
<-p.cmdWaitChan
}

@coderabbitai coderabbitai bot Jun 5, 2025


🛠️ Refactor suggestion

Potential indefinite blocking in stopCommand.

The method waits on cmdWaitChan without a timeout, which could block indefinitely if the command doesn't exit properly. Consider adding a timeout mechanism.

func (p *Process) stopCommand() {
	stopStartTime := time.Now()
	defer func() {
		p.proxyLogger.Debugf("<%s> stopCommand took %v", p.ID, time.Since(stopStartTime))
	}()

	if p.cancelUpstream == nil {
		p.proxyLogger.Errorf("<%s> stopCommand has a nil p.cancelUpstream()", p.ID)
		return
	}

	p.cancelUpstream()
-	<-p.cmdWaitChan
+	select {
+	case <-p.cmdWaitChan:
+		// Command exited normally
+	case <-time.After(p.gracefulStopTimeout + 5*time.Second):
+		p.proxyLogger.Warnf("<%s> stopCommand timed out waiting for command to exit", p.ID)
+	}
}

@mostlygeek mostlygeek (Owner, Author) commented:

No, this is fine as p.cmdWaitChan is closed; it's just a signal channel.

@coderabbitai coderabbitai bot replied:

@mostlygeek You're absolutely right! Thank you for the clarification. I see now that cmdWaitChan is indeed a signal channel that gets closed by waitForCmd() when cmd.Wait() returns. Reading from a closed channel returns immediately, so <-p.cmdWaitChan won't block indefinitely - it will unblock as soon as the command exits and the channel is closed.

I appreciate you pointing out this signal channel pattern. My suggestion was based on a misunderstanding of how the channel was being used.

  (\   /)
 ( ._.)
o_(")(")

learned something new


✏️ Learnings added
Learnt from: mostlygeek
PR: mostlygeek/llama-swap#155
File: proxy/process.go:359-372
Timestamp: 2025-06-05T22:57:53.766Z
Learning: In Go, cmdWaitChan in proxy/process.go is a signal channel (chan struct{}) that follows the pattern where waitForCmd() closes the channel when cmd.Wait() returns, and stopCommand() waits on it. Reading from a closed channel returns immediately, so <-cmdWaitChan won't block indefinitely.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Comment on lines +487 to +522
// waitForCmd waits for the command to exit and handles exit conditions depending on current state
func (p *Process) waitForCmd() {
exitErr := p.cmd.Wait()
p.proxyLogger.Debugf("<%s> cmd.Wait() returned error: %v", p.ID, exitErr)

if exitErr != nil {
if errno, ok := exitErr.(syscall.Errno); ok {
p.proxyLogger.Errorf("<%s> errno >> %v", p.ID, errno)
} else if exitError, ok := exitErr.(*exec.ExitError); ok {
if strings.Contains(exitError.String(), "signal: terminated") {
p.proxyLogger.Debugf("<%s> Process stopped OK", p.ID)
} else if strings.Contains(exitError.String(), "signal: interrupt") {
p.proxyLogger.Debugf("<%s> Process interrupted OK", p.ID)
} else {
p.proxyLogger.Warnf("<%s> ExitError >> %v, exit code: %d", p.ID, exitError, exitError.ExitCode())
}
} else {
if exitErr.Error() != "context canceled" /* this is normal */ {
p.proxyLogger.Errorf("<%s> Process exited >> %v", p.ID, exitErr)
}
}
}

currentState := p.CurrentState()
switch currentState {
case StateStopping:
if curState, err := p.swapState(StateStopping, StateStopped); err != nil {
p.proxyLogger.Errorf("<%s> Process exited but could not swap to StateStopped. curState=%s, err: %v", p.ID, curState, err)
p.state = StateStopped
}
default:
p.proxyLogger.Infof("<%s> process exited but not StateStopping, current state: %s", p.ID, currentState)
p.state = StateStopped // force it to be in this state
}
close(p.cmdWaitChan)
}

⚠️ Potential issue

Concurrency safety violation in state management.

Lines 515 and 519 directly assign to p.state without proper locking, breaking the concurrency safety pattern used elsewhere in the code.

	switch currentState {
	case StateStopping:
		if curState, err := p.swapState(StateStopping, StateStopped); err != nil {
			p.proxyLogger.Errorf("<%s> Process exited but could not swap to StateStopped. curState=%s, err: %v", p.ID, curState, err)
-			p.state = StateStopped
+			// Force state transition with proper locking
+			p.stateMutex.Lock()
+			p.state = StateStopped
+			p.stateMutex.Unlock()
		}
	default:
		p.proxyLogger.Infof("<%s> process exited but not StateStopping, current state: %s", p.ID, currentState)
-		p.state = StateStopped // force it to be in this state
+		// Force state transition with proper locking
+		p.stateMutex.Lock()
+		p.state = StateStopped
+		p.stateMutex.Unlock()
	}

Comment on lines +194 to +204
cmdContext, ctxCancelUpstream := context.WithCancel(context.Background())
p.cmd = exec.CommandContext(cmdContext, args[0], args[1:]...)
p.cmd.Stdout = p.processLogger
p.cmd.Stderr = p.processLogger
p.cmd.Env = p.config.Env

p.cmd.Cancel = p.cmdStopUpstreamProcess
p.cmd.WaitDelay = p.gracefulStopTimeout
p.cancelUpstream = ctxCancelUpstream
p.cmdWaitChan = make(chan struct{})


🛠️ Refactor suggestion

Potential race condition with channel recreation.

Creating a new cmdWaitChan at line 203 could cause issues if other goroutines are waiting on the previous channel. Consider moving this to the beginning of the method or ensuring proper synchronization.

	p.waitStarting.Add(1)
	defer p.waitStarting.Done()
+	p.cmdWaitChan = make(chan struct{})
	cmdContext, ctxCancelUpstream := context.WithCancel(context.Background())
	p.cmd = exec.CommandContext(cmdContext, args[0], args[1:]...)
	p.cmd.Stdout = p.processLogger
	p.cmd.Stderr = p.processLogger
	p.cmd.Env = p.config.Env

	p.cmd.Cancel = p.cmdStopUpstreamProcess
	p.cmd.WaitDelay = p.gracefulStopTimeout
	p.cancelUpstream = ctxCancelUpstream
-	p.cmdWaitChan = make(chan struct{})

Committable suggestion skipped: line range outside the PR's diff.


- removed StateFailed
- StateStarting --> StateStopped is OK
- updated tests
@mostlygeek mostlygeek merged commit 2dc0ca0 into main Jun 5, 2025
3 checks passed
@mostlygeek mostlygeek deleted the process-cmd-test branch June 5, 2025 23:25
@mostlygeek mostlygeek (Owner, Author) commented Jun 6, 2025

Nice! llama-server crashed and llama-swap automatically restarted it:

llama-swap detecting the crash/exit and restarting it:

(everything is normal ...) 

[INFO] Request 10.0.1.135 "POST /upstream/qwen-coder-14B-FIM/infill HTTP/1.1" 200 15667 "axios/1.8.4" 190.311173ms
[DEBUG] Exclusive mode for group SUPDev, stopping other process groups
[DEBUG] <qwen-coder-14B-FIM> request /upstream/qwen-coder-14B-FIM/infill - start: 0s, total: 123.181511ms
[INFO] Request 10.0.1.135 "POST /upstream/qwen-coder-14B-FIM/infill HTTP/1.1" 200 15656 "axios/1.8.4" 123.315148ms
[DEBUG] Exclusive mode for group SUPDev, stopping other process groups
[DEBUG] <qwen-coder-14B-FIM> request /upstream/qwen-coder-14B-FIM/infill - start: 0s, total: 111.109629ms
[INFO] Request 10.0.1.135 "POST /upstream/qwen-coder-14B-FIM/infill HTTP/1.1" 200 14893 "axios/1.8.4" 111.279777ms

(llama-server crashes ... )

[DEBUG] Exclusive mode for group SUPDev, stopping other process groups
[INFO] Request 10.0.1.135 "POST /upstream/qwen-coder-14B-FIM/infill HTTP/1.1" 502 41 "axios/1.8.4" 293.769421ms
[DEBUG] Exclusive mode for group SUPDev, stopping other process groups
[INFO] Request 10.0.1.135 "POST /upstream/qwen-coder-14B-FIM/infill HTTP/1.1" 502 90 "axios/1.8.4" 492.078µs
[DEBUG] <qwen-coder-14B-FIM> cmd.Wait() returned error: signal: aborted (core dumped)
[WARN] <qwen-coder-14B-FIM> ExitError >> signal: aborted (core dumped), exit code: -1
[INFO] <qwen-coder-14B-FIM> process exited but not StateStopping, current state: ready

(llama-server is restarted, previously would require a llama-swap restart! )

[DEBUG] Exclusive mode for group SUPDev, stopping other process groups
[DEBUG] <qwen-coder-14B-FIM> swapState() State transitioned from stopped to starting
[DEBUG] <qwen-coder-14B-FIM> Health check error on http://localhost:5815/health, status code: 503 (normal during startup)
[INFO] <qwen-coder-14B-FIM> Health check passed on http://localhost:5815/health
[DEBUG] <qwen-coder-14B-FIM> swapState() State transitioned from starting to ready
[DEBUG] <qwen-coder-14B-FIM> request /upstream/qwen-coder-14B-FIM/infill - start: 5.255732227s, total: 6.851577272s
[INFO] Request 10.0.1.135 "POST /upstream/qwen-coder-14B-FIM/infill HTTP/1.1" 200 15831 "axios/1.8.4" 6.851691632s

llama-server crash logs

srv  update_slots: all slots are idle
srv  log_server_r: request: POST /infill 127.0.0.1 200
slot launch_slot_: id  0 | task 1513 | processing task
slot update_slots: id  0 | task 1513 | new prompt, n_ctx_slot = 4096, n_keep = 0, n_prompt_tokens = 3077
slot update_slots: id  0 | task 1513 | kv cache rm [2606, end)
slot update_slots: id  0 | task 1513 | prompt processing progress, n_past = 3077, n_tokens = 471, progress = 0.153071
slot update_slots: id  0 | task 1513 | prompt done, n_past = 3077, n_tokens = 471
/home/mostlygeek/llama.cpp/src/llama-kv-cache-unified.cpp:1188: GGML_ASSERT(nf == nh && "KV defrag bug: nf != nh") failed
/path/to/llama-server/llama-server-latest(+0x75683b)[0x5a59595d683b]
/path/to/llama-server/llama-server-latest(+0x756dfc)[0x5a59595d6dfc]
/path/to/llama-server/llama-server-latest(+0x756f51)[0x5a59595d6f51]
/path/to/llama-server/llama-server-latest(+0x38b79d)[0x5a595920b79d]
/path/to/llama-server/llama-server-latest(+0x38bb64)[0x5a595920bb64]
/path/to/llama-server/llama-server-latest(+0x2c2bf0)[0x5a5959142bf0]
/path/to/llama-server/llama-server-latest(+0x2c7b3d)[0x5a5959147b3d]
/path/to/llama-server/llama-server-latest(+0x2c933f)[0x5a595914933f]
/path/to/llama-server/llama-server-latest(+0xfcde2)[0x5a5958f7cde2]
/path/to/llama-server/llama-server-latest(+0xc460c)[0x5a5958f4460c]
/path/to/llama-server/llama-server-latest(+0x5eee0)[0x5a5958edeee0]
/lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca)[0x737f6082a1ca]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b)[0x737f6082a28b]
/path/to/llama-server/llama-server-latest(+0x8ee45)[0x5a5958f0ee45]
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
