Deadlock when cancelling a streaming response #362

@KonstantinGeist

Describe the bug

Repro:

  1. model A is swapped in and starts streaming
  2. we cancel it midway
  3. we attempt to call model B in the same process group
  4. deadlock

The issue seems to be in ProcessGroup.ProxyRequest:

  1. it takes pg.Lock()
  2. this line panics when the client cancels the request:

		pg.processes[modelID].ProxyRequest(writer, request)

     The panic is raised by p.reverseProxy.ServeHTTP(w, r) inside it: once the response is streaming, the proxy has no other way to report the error, so it aborts the request with a panic.
  3. the stack unwinds on the panic and pg.Unlock() is never called
  4. subsequent calls to ProxyRequest deadlock because they try to lock a mutex that was never released
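The sequence above can be reproduced in isolation. The following is a minimal sketch, not llama-swap's code: `group`, `proxyRequest`, and `serve` are hypothetical stand-ins for ProcessGroup, its ProxyRequest method, and the HTTP server that recovers handler panics. It uses `sync.Mutex.TryLock` (Go 1.18+) only to observe the lock state without blocking.

```go
package main

import (
	"fmt"
	"sync"
)

// group is a hypothetical stand-in for ProcessGroup: a mutex guarding a proxy call.
type group struct {
	sync.Mutex
}

// proxyRequest mimics the pattern described above: lock, call, unlock without defer.
func (g *group) proxyRequest(handle func()) {
	g.Lock()
	handle() // if this panics, the Unlock below never runs
	g.Unlock()
}

// serve plays the role of the HTTP server, which recovers handler panics
// so that one aborted request does not crash the whole process.
func serve(g *group, handle func()) {
	defer func() { _ = recover() }()
	g.proxyRequest(handle)
}

// stillLockedAfterPanic reports whether the mutex is left locked
// after a call whose handler panicked.
func stillLockedAfterPanic() bool {
	g := &group{}
	serve(g, func() { panic("client cancelled") })
	if g.TryLock() {
		g.Unlock()
		return false
	}
	return true
}

func main() {
	fmt.Println("mutex still locked after panic:", stillLockedAfterPanic())
}
```

Any later caller that uses a plain `Lock()` on this mutex would block forever, which matches the hang seen when calling model B.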

The fix seems to be pretty simple. Use defer:

		if pg.lastUsedProcess != modelID {
			// ensure unlock even if ProxyRequest panics
			defer pg.Unlock() // <-- add here

			// is there something already running?
			if pg.lastUsedProcess != "" {
				pg.processes[pg.lastUsedProcess].Stop()
			}

			// wait for the request to the new model to be fully handled
			// and prevent race conditions see issue #277
			pg.processes[modelID].ProxyRequest(writer, request)
			pg.lastUsedProcess = modelID

			// short circuit and exit
			// pg.Unlock() <-- remove this
			return nil
		}
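To illustrate why defer helps, here is a small sketch under the same assumptions as the repro: `group`, `proxyRequest`, and `serve` are hypothetical names, not the project's real types. Deferred calls run while the stack unwinds from a panic, so the lock is released even when the proxied call aborts. Note that defer is function-scoped in Go, so in the snippet above the unlock runs when the function returns, not at the end of the `if` block; that is fine there because the block returns immediately.

```go
package main

import (
	"fmt"
	"sync"
)

// group is a hypothetical stand-in for ProcessGroup.
type group struct {
	sync.Mutex
	lastUsed string
}

// proxyRequest releases the lock with defer, so it is unlocked on a
// normal return and also while unwinding from a panic.
func (g *group) proxyRequest(modelID string, handle func()) {
	g.Lock()
	defer g.Unlock()
	handle()
	g.lastUsed = modelID
}

// serve recovers handler panics, as an HTTP server would, and reports
// whether the handler panicked.
func serve(g *group, modelID string, handle func()) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	g.proxyRequest(modelID, handle)
	return false
}

// runScenario cancels a request to model-a, then calls model-b.
func runScenario() (firstPanicked, secondPanicked bool, lastUsed string) {
	g := &group{}
	firstPanicked = serve(g, "model-a", func() { panic("client cancelled") })
	secondPanicked = serve(g, "model-b", func() {})
	return firstPanicked, secondPanicked, g.lastUsed
}

func main() {
	fmt.Println(runScenario()) // prints: true false model-b
}
```

The second call completes instead of blocking, because the first call's deferred Unlock ran during the panic.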

I suspect the problem was introduced by the recent switch to the standard library's reverse proxy (httputil.ReverseProxy), which panics when the request is cancelled while the response is being streamed. From Go's source code (net/http/httputil/reverseproxy.go):

		// Since we're streaming the response, if we run into an error all we can do
		// is abort the request. Issue 23643: ReverseProxy should use ErrAbortHandler
		// on read error while copying body.
		if !shouldPanicOnCopyError(req) {
			p.logf("suppressing panic for copyResponse error in test; copy error: %v", err)
			return
		}
		panic(http.ErrAbortHandler)
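A panic with http.ErrAbortHandler is recovered by the net/http server, which aborts the response without logging a stack trace and keeps serving other requests. That would be consistent with a lock being left held silently. The sketch below shows this server behaviour on its own, using a test server; the `/abort` and `/ok` routes and the `abortThenOK` helper are made up for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// abortThenOK sends one request whose handler panics with
// http.ErrAbortHandler, then a normal request to the same server.
// It returns the client-side error from the first request and the
// body of the second.
func abortThenOK() (error, string) {
	mux := http.NewServeMux()
	mux.HandleFunc("/abort", func(w http.ResponseWriter, r *http.Request) {
		panic(http.ErrAbortHandler) // same sentinel the reverse proxy uses
	})
	mux.HandleFunc("/ok", func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "ok")
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()

	// First request: the server recovers the panic and drops the connection.
	var abortErr error
	if resp, err := http.Get(srv.URL + "/abort"); err != nil {
		abortErr = err
	} else {
		resp.Body.Close()
	}

	// Second request: the server process is still alive and answers normally.
	resp, err := http.Get(srv.URL + "/ok")
	if err != nil {
		return abortErr, ""
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	return abortErr, string(body)
}

func main() {
	abortErr, body := abortThenOK()
	fmt.Println("aborted request failed:", abortErr != nil, "| next request body:", body)
}
```

Because the process survives the panic, any state the handler left behind, such as a mutex locked without a deferred unlock, stays that way for later requests.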

After the quick fix above, the issue disappeared on my server.

Expected behaviour
Llama-swap doesn't hang.

Operating system and version

Ubuntu 24.04.3 LTS

My Configuration

It's pretty large; the relevant bit is this:

groups:
  "main":
    swap: true
    exclusive: false
    members:
    - "qwen3-32b-q4-32k"
    - "qwen2.5-vl-32b-q4"

Upstream Logs

When the deadlock happens, nothing appears in the logs.

Labels

bug (Something isn't working)
