|
9 | 9 | //! |
10 | 10 | //! DNS caching is configured with reasonable TTL to allow failover and load |
11 | 11 | //! balancer updates (#2177). |
| 12 | +//! |
| 13 | +//! # Timeout Configuration Guide |
| 14 | +//! |
| 15 | +//! This section documents the timeout hierarchy across the Cortex codebase. Use this |
| 16 | +//! as a reference when configuring timeouts for new features or debugging timeout issues. |
| 17 | +//! |
| 18 | +//! ## Timeout Hierarchy |
| 19 | +//! |
| 20 | +//! | Use Case | Timeout | Constant/Location | Rationale | |
| 21 | +//! |-----------------------------|---------|--------------------------------------------|-----------------------------------------| |
| 22 | +//! | Health checks | 5s | `HEALTH_CHECK_TIMEOUT` (this module) | Quick validation of service status | |
| 23 | +//! | Standard HTTP requests | 30s | `DEFAULT_TIMEOUT` (this module) | Normal API calls with reasonable margin | |
| 24 | +//! | Per-chunk read (streaming) | 30s | `read_timeout` (cortex-app-server/config) | Individual chunk timeout during stream | |
| 25 | +//! | Pool idle timeout | 60s | `POOL_IDLE_TIMEOUT` (this module) | Drop idle connections to re-resolve DNS |
| 26 | +//! | LLM Request (non-streaming) | 120s | `DEFAULT_REQUEST_TIMEOUT_SECS` (cortex-exec/runner) | Model inference takes time | |
| 27 | +//! | LLM Streaming total | 300s | `STREAMING_TIMEOUT` (this module) | Long-running streaming responses | |
| 28 | +//! | Server request lifecycle | 300s | `request_timeout` (cortex-app-server/config) | Full HTTP request/response cycle | |
| 29 | +//! | Entire exec session | 600s | `DEFAULT_TIMEOUT_SECS` (cortex-exec/runner) | Multi-turn conversation limit | |
| 30 | +//! | Graceful shutdown | 30s | `shutdown_timeout` (cortex-app-server/config) | Time for cleanup on shutdown | |
| 31 | +//! |
| 32 | +//! ## Module-Specific Timeouts |
| 33 | +//! |
| 34 | +//! ### cortex-common (this module) |
| 35 | +//! - `DEFAULT_TIMEOUT` (30s): Use for standard API calls. |
| 36 | +//! - `STREAMING_TIMEOUT` (300s): Use for LLM streaming endpoints. |
| 37 | +//! - `HEALTH_CHECK_TIMEOUT` (5s): Use for health/readiness checks. |
| 38 | +//! - `POOL_IDLE_TIMEOUT` (60s): Closes idle pooled connections so that new connections re-resolve DNS.
| 39 | +//! |
| 40 | +//! ### cortex-exec (runner.rs) |
| 41 | +//! - `DEFAULT_TIMEOUT_SECS` (600s): Maximum duration for entire exec session. |
| 42 | +//! - `DEFAULT_REQUEST_TIMEOUT_SECS` (120s): Single LLM request timeout. |
| 43 | +//! |
| 44 | +//! ### cortex-app-server (config.rs) |
| 45 | +//! - `request_timeout` (300s): Full request lifecycle timeout. |
| 46 | +//! - `read_timeout` (30s): Per-chunk timeout for streaming reads. |
| 47 | +//! - `shutdown_timeout` (30s): Graceful shutdown duration. |
| 48 | +//! |
| 49 | +//! ### cortex-engine (api_client.rs) |
| 50 | +//! - Re-exports constants from this module for consistency. |
| 51 | +//! |
| 52 | +//! ## Recommendations |
| 53 | +//! |
| 54 | +//! When adding new timeout configurations: |
| 55 | +//! 1. Use constants from this module when possible for consistency. |
| 56 | +//! 2. Document any new timeout constants with their rationale. |
| 57 | +//! 3. Respect the timeout hierarchy: inner timeouts should be shorter than the outer ones that enclose them (e.g. the 30s per-chunk read inside the 300s streaming total).
| 58 | +//! 4. For LLM operations, use longer timeouts (120s-300s) to accommodate model inference. |
| 59 | +//! 5. For health checks and quick validations, use short timeouts (5s-10s). |
12 | 60 |
|
13 | 61 | use reqwest::Client; |
14 | 62 | use std::time::Duration; |
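
Recommendation 3 in the doc comment (inner timeouts should be shorter than outer ones) could also be checked mechanically. The following is a minimal sketch, not code from this PR. It assumes the timeouts can be expressed as `std::time::Duration` values, and the constants below are local stand-ins that copy the values from the table; `first_violation` is a hypothetical helper. The check only flags an inner timeout that exceeds its outer one, so equal values (such as the two 300s entries in the table) pass.

```rust
use std::time::Duration;

// Local stand-ins with values copied from the table above. The real constants
// live in different crates and their exact types are an assumption here.
pub const READ_TIMEOUT: Duration = Duration::from_secs(30); // per-chunk read
pub const STREAMING_TIMEOUT: Duration = Duration::from_secs(300); // streaming total
pub const SESSION_TIMEOUT: Duration = Duration::from_secs(600); // entire exec session

/// Hypothetical helper: `layers` is ordered from innermost to outermost.
/// Returns the names of the first adjacent pair whose inner timeout is
/// longer than the outer timeout, or `None` if the ordering holds.
pub fn first_violation<'a>(layers: &[(&'a str, Duration)]) -> Option<(&'a str, &'a str)> {
    layers
        .windows(2)
        .find(|pair| pair[0].1 > pair[1].1)
        .map(|pair| (pair[0].0, pair[1].0))
}

/// One possible nesting, assumed for illustration only:
/// per-chunk read inside streaming total inside exec session.
pub fn streaming_layers() -> [(&'static str, Duration); 3] {
    [
        ("read_timeout", READ_TIMEOUT),
        ("STREAMING_TIMEOUT", STREAMING_TIMEOUT),
        ("session", SESSION_TIMEOUT),
    ]
}
```

A unit test could call `first_violation(&streaming_layers())` and assert that it returns `None`, so that a future change to one constant that inverts the hierarchy fails in CI rather than in production.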
|