[Bug] Preview proxy returns 400 "bad request: 404 Not Found" after sandbox idle period
Labels: bug, proxy, preview
Description
The Daytona preview proxy (*.proxy.daytona.works) intermittently returns HTTP 400 errors to browsers after a tab has been idle for 5+ seconds. The error body is:
```json
{"statusCode": 400, "message": "bad request: 404 Not Found", "code": "BAD_REQUEST"}
```

Reproduction Steps
- Create a sandbox with a Vite dev server on port 5173
- Get a preview URL (signed or regular) and load it in a browser
- Leave the tab idle for 5–30+ seconds
- Return to the tab — page resources fail with 400 errors in the browser console:
```
GET /@vite/client 400 (Bad Request)
GET /@react-refresh 400 (Bad Request)
GET /src/App.tsx 400 (Bad Request)
```

- Manual page refresh immediately fixes it
Connection Architecture
```
Browser ──HTTP/2──> Daytona Cloud Proxy (*.proxy.daytona.works)
                         │
                         │ persistent TCP connection
                         ▼
                    Daytona Daemon inside sandbox (:2280)
                         │
                         │ HTTP/1.1 pooled connection
                         ▼
                    App server (e.g. Vite :5173)
```
Root Cause
Node.js's default HTTP server keepAliveTimeout is 5 seconds. After 5s of inactivity, Vite (or any Node.js-based dev server) closes the pooled connection on its side.
When the browser returns from idle:
- The Daytona daemon still holds a reference to the now-dead connection in its pool
- The daemon attempts to reuse the stale connection for the incoming request
- The backend (Vite) has already closed it → daemon receives a connection-reset / EOF error
- The proxy wraps this as `400 {"message": "bad request: 404 Not Found"}`
Evidence
- Vite response header confirms the 5-second timeout: `keep-alive: timeout=5`
- `ss -tnp` inside the sandbox shows the daemon (PID 1) maintains an HTTP/1.1 connection to the backend
- That connection disappears from `ss -tnp` after exactly 5–8 seconds of no traffic
- The `/health-coder` endpoint on the same Vite port returns 200 — the server IS running; the issue is the stale pooled connection, not the server being down
- The error is intermittent because it is time-dependent: any request within 5s of the last one succeeds
- Two load-balanced proxy IPs were observed (`100.52.152.155`, `35.175.80.172`), each maintaining its own persistent connection to the sandbox daemon — this explains why the failure is not 100% reproducible across reloads
Workaround (Client-Side)
Extend Node.js HTTP server keepAliveTimeout in the Vite config so the backend never closes the connection before the daemon's pool TTL expires.
vite.config.ts:

```ts
import { defineConfig, type Plugin } from 'vite'
import react from '@vitejs/plugin-react'

function extendKeepAlivePlugin(): Plugin {
  return {
    name: 'extend-keep-alive',
    configureServer(server) {
      const apply = () => {
        if (server.httpServer) {
          // Default Node.js keepAliveTimeout is 5s. The Daytona daemon pools
          // connections to the backend and may reuse one after Vite has already
          // closed it, resulting in 400 errors. Extending this timeout prevents
          // Vite from closing the connection before the daemon's pool TTL.
          server.httpServer.keepAliveTimeout = 120_000 // 2 minutes
          server.httpServer.headersTimeout = 121_000 // must be > keepAliveTimeout
        }
      }
      // httpServer may not be bound yet at configureServer time; hook both
      apply()
      server.httpServer?.on('listening', apply)
    },
  }
}

export default defineConfig({
  plugins: [react(), extendKeepAlivePlugin()],
})
```

Verified: with this fix applied, all requests succeed after 60–120s of idle. `ss -tnp` confirms the daemon reuses the same pooled connection (same source port) rather than hitting a dead one.
Requested Fix (Proxy / Daemon Side)
The daemon should handle stale connection reuse transparently so users do not need to configure their applications around proxy internals.
| Option | Description |
|---|---|
| A — Validate before reuse | Probe the pooled connection before sending a real request. If dead, open a new one and retry transparently. |
| B — Shorten pool TTL | Evict pooled connections from the daemon pool after < 5s (before the backend closes them), so the daemon always opens a fresh connection. |
| C — Retry on connection-reset | If the backend returns a connection-reset / EOF error, automatically retry on a new connection before surfacing an error to the client. |
Note: Returning `400` to the client for a backend connection-reset is semantically incorrect. A `502 Bad Gateway` would at minimum be more accurate; a silent retry is the correct behavior.
Discussion Points
- What is the pool TTL configured in the daemon for backend connections?
- Does the daemon implement stale connection detection (TCP keepalive probes or a health check before reuse)?
- Which component generates the `400 {"message": "bad request: 404 Not Found"}` response — the cloud proxy or the daemon?
- Is the pool TTL configurable per-sandbox or globally?
Environment
| Component | Version |
|---|---|
| Daytona daemon | v0.143.0-prod |
| Vite | 7.3.1 |
| Proxy | *.proxy.daytona.works (2 load-balanced IPs) |
| Sandbox runtime | Node.js via NVM |