Fix: Add per-deployment Bearer auth to block rogue evaluation requests#68
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughPer-evaluation bearer-token auth was added: the client generates a fresh token, injects it into Basilica deployments via EVAL_AUTH_TOKEN, sends it as Authorization headers on POST /evaluate and GET /eval-status/{job_id}, and the FastAPI service validates tokens and tracks background evaluation tasks. Changes
Sequence DiagramsequenceDiagram
participant Runner as Runner/Tester
participant Deployment as Basilica Deployment
participant Server as FastAPI Env
participant Queue as Background Job Queue
Runner->>Runner: Generate auth_token\n(secrets.token_urlsafe(32))
Runner->>Deployment: Deploy with env\nEVAL_AUTH_TOKEN=auth_token
Runner->>Runner: Create httpx.AsyncClient\nwith Authorization header
Runner->>Server: POST /evaluate\nAuthorization: Bearer {auth_token}
Server->>Server: _verify_auth(request)\nvalidate token
alt Auth Valid
Server->>Queue: Enqueue background job
Server-->>Runner: 200 {job_id}
else Auth Invalid
Server-->>Runner: 401 {detail: "unauthorized"}
end
loop Polling
Runner->>Server: GET /eval-status/{job_id}\nAuthorization: Bearer {auth_token}
Server->>Server: _verify_auth(request)
alt Auth Valid
Server-->>Runner: 200 {status}
else Auth Invalid
Server-->>Runner: 401 {detail: "unauthorized"}
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@environments/templar/env.py`:
- Around line 3322-3331: The created background task from asyncio.create_task in
the endpoint is not stored and may be garbage-collected; declare a module-level
set named _background_tasks: set[asyncio.Task] (e.g., near other globals) and
replace the create_task call with task =
asyncio.create_task(_evaluation_background(job_id, body)), add the task to
_background_tasks, and attach a done callback that removes the task from
_background_tasks when finished (so the task is kept alive until completion but
not leaked). Update references around _evaluation_background, _jobs_lock and the
create_task invocation to use this pattern.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 569351c7-569b-48a6-b28d-949a4d0c0bc8
📒 Files selected for processing (3)
arbos/tester.pyenvironments/templar/env.pysrc/crusades/affinetes/runner.py
8285333 to
20cf891
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
arbos/tester.py (1)
252-286:⚠️ Potential issue | 🟠 MajorTreat poll-time 401s as terminal failures.
environments/templar/env.pynow returns 401 for a bad bearer token, but this loop retries every non-200 except 404. If auth is rejected onGET /eval-status/{job_id}, the client will burn the whole deployment budget instead of failing fast.🛠️ Suggested fix
try: poll_resp = await poll_client.get(poll_url) consecutive_errors = 0 if poll_resp.status_code == 404: return EvalResult( success=False, error=f"Job {job_id} lost (404 on poll) — " "container may have restarted", ) + if poll_resp.status_code == 401: + return EvalResult( + success=False, + error="Polling unauthorized (401) — eval auth token was rejected", + ) if poll_resp.status_code != 200: logger.warning( f"Poll got {poll_resp.status_code}, retrying " f"in {poll_interval}s ({elapsed:.0f}s elapsed)"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@arbos/tester.py` around lines 252 - 286, The poll loop currently treats every non-200 (except 404) as transient; modify the loop so that when poll_resp.status_code == 401 you return an EvalResult(success=False, error=...) immediately instead of retrying; locate the async polling block using poll_client/poll_resp in tester.py (the while True loop that builds EvalResult) and add a branch for 401 that returns a clear error referencing job_id and poll_url (e.g., "auth rejected (401) on poll"), keeping existing handling for 404 and other status codes unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@environments/templar/env.py`:
- Around line 3316-3324: The handler evaluate currently accepts body:
EvaluateRequest so FastAPI parses the request body before your in-function auth
guard (_check_auth) runs; move the auth check out of the handler by either (A)
creating a dependency function check_auth (accepting Request) and changing
evaluate signature to use Depends(check_auth) so auth runs before body parsing,
or (B) register a global HTTP middleware that calls _check_auth (or a wrapper)
and returns the appropriate Response on failure; update references to
_check_auth/check_auth and keep EvaluateRequest and Request parameters unchanged
in evaluate after adding the dependency or middleware.
---
Outside diff comments:
In `@arbos/tester.py`:
- Around line 252-286: The poll loop currently treats every non-200 (except 404)
as transient; modify the loop so that when poll_resp.status_code == 401 you
return an EvalResult(success=False, error=...) immediately instead of retrying;
locate the async polling block using poll_client/poll_resp in tester.py (the
while True loop that builds EvalResult) and add a branch for 401 that returns a
clear error referencing job_id and poll_url (e.g., "auth rejected (401) on
poll"), keeping existing handling for 404 and other status codes unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 70957c74-cbf9-4eda-8db4-6c923ac41282
📒 Files selected for processing (3)
arbos/tester.pyenvironments/templar/env.pysrc/crusades/affinetes/runner.py
🚧 Files skipped from review as they are similar to previous changes (1)
- src/crusades/affinetes/runner.py
Each Basilica deployment now gets a unique auth token (secrets.token_urlsafe) passed via EVAL_AUTH_TOKEN env var. The eval container rejects any /evaluate or /eval-status request without the correct Authorization header (401). This prevents the OOM attack where a rogue miner pumps fake requests to our deployments. Health endpoint stays open for Basilica probes. Backward compatible — no token means open endpoints (local/dev mode).
20cf891 to
0337997
Compare
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/crusades/affinetes/runner.py (1)
858-898:⚠️ Potential issue | 🟠 MajorFail fast on unauthorized poll responses.
/eval-statusis now auth-gated, but this loop still treats 401/403 like transient noise. Because those responses also resetconsecutive_errors, a bad token will spin untilhttp_timeoutexpires instead of surfacing the auth problem immediately.💡 Suggested fix
async with httpx.AsyncClient(timeout=poll_timeout, headers=auth_headers) as poll_client: while True: ... try: poll_resp = await poll_client.get(poll_url) - consecutive_errors = 0 + if poll_resp.status_code in (401, 403): + return EvaluationResult.failure( + f"Basilica /eval-status auth failed: HTTP {poll_resp.status_code}: " + f"{poll_resp.text[:200]}", + task_id=task_id, + ) + + consecutive_errors = 0 if poll_resp.status_code == 404: ...🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/crusades/affinetes/runner.py` around lines 858 - 898, The polling loop in async with httpx.AsyncClient (around poll_client.get(poll_url)) incorrectly treats 401/403 as transient and resets consecutive_errors; update the logic in that block to detect poll_resp.status_code == 401 or 403 and immediately return an EvaluationResult.failure with a clear auth error (include job_id, poll_url, status code and task_id) instead of continuing or resetting consecutive_errors; keep existing handling for 404 and other non-200 codes unchanged and ensure consecutive_errors is not reset on auth errors.arbos/tester.py (1)
252-330:⚠️ Potential issue | 🟡 MinorDon't hide auth failures behind the polling timeout.
The tester now relies on the same bearer token for
/eval-status, but a permanent 401/403 still gets retried until the overall timeout. That makes auth regressions much harder to diagnose.💡 Suggested fix
async with httpx.AsyncClient(timeout=poll_timeout, headers=auth_headers) as poll_client: while True: ... try: poll_resp = await poll_client.get(poll_url) - consecutive_errors = 0 + if poll_resp.status_code in (401, 403): + return EvalResult( + success=False, + error=f"/eval-status auth failed: HTTP {poll_resp.status_code}: " + f"{poll_resp.text[:200]}", + ) + consecutive_errors = 0 if poll_resp.status_code == 404: ...🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@arbos/tester.py` around lines 252 - 330, The polling loop is hiding permanent auth failures (401/403) by retrying until the overall timeout; update the logic around poll_client.get / poll_resp.status_code to treat 401 and 403 as immediate hard errors: when poll_resp.status_code is 401 or 403 return an EvalResult(success=False, error=...) that includes the status code, job_id and response text/body to aid debugging instead of continuing; keep existing handling for 404 and other non-200 codes but ensure EvalResult, poll_client.get, poll_resp.status_code, and job_id are referenced in the new branch so the failure is surfaced immediately.
♻️ Duplicate comments (1)
environments/templar/env.py (1)
2974-2985:⚠️ Potential issue | 🟠 Major
Dependsstill lets unauthenticated/evaluaterequests parse the body.With
body: EvaluateRequeston Line 3316, FastAPI still reads/parses the request body before this dependency can reject the call. An unauthenticated caller can therefore keep burning JSON/Pydantic work with large payloads, which misses the goal of blocking rogue requests early. Move this check to HTTP middleware (or a customAPIRoute) so/evaluateis rejected before body handling.💡 Suggested direction
+@app.middleware("http") +async def _protect_eval_routes(request: Request, call_next): + if _AUTH_TOKEN and ( + request.url.path == "/evaluate" + or request.url.path.startswith("/eval-status/") + ): + auth = request.headers.get("authorization", "") + if auth != f"Bearer {_AUTH_TOKEN}": + return JSONResponse(status_code=401, content={"detail": "unauthorized"}) + return await call_next(request) + -@app.post("/evaluate", dependencies=[Depends(_verify_auth)]) +@app.post("/evaluate") async def evaluate(body: EvaluateRequest): ... -@app.get("/eval-status/{job_id}", dependencies=[Depends(_verify_auth)]) +@app.get("/eval-status/{job_id}") async def eval_status(job_id: str): ...In FastAPI, for a route like `@app.post("/evaluate", dependencies=[Depends(auth_dep)]) async def evaluate(body: EvaluateRequest): ...`, does FastAPI read/parse the request body before `auth_dep` runs? Please answer from FastAPI docs or source and include the order of body parsing vs dependency resolution.Also applies to: 3315-3316
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@environments/templar/env.py` around lines 2974 - 2985, The _verify_auth dependency runs too late (after FastAPI has already read/parsed the request body for routes like the /evaluate handler that accepts body: EvaluateRequest), so move the authentication check out of the dependency into a component that executes before body parsing — either an HTTP middleware (app.middleware("http")) that inspects the Authorization header and returns a 401 Response when it doesn't match _AUTH_TOKEN, or a custom APIRoute (override get_route_handler on your route class) that performs the same header check before awaiting request.body()/request.json(); replace usages of Depends(_verify_auth) on the evaluate endpoint with this early-check mechanism and keep the same rejection semantics (401, "unauthorized") and logging as in _verify_auth.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@arbos/tester.py`:
- Around line 252-330: The polling loop is hiding permanent auth failures
(401/403) by retrying until the overall timeout; update the logic around
poll_client.get / poll_resp.status_code to treat 401 and 403 as immediate hard
errors: when poll_resp.status_code is 401 or 403 return an
EvalResult(success=False, error=...) that includes the status code, job_id and
response text/body to aid debugging instead of continuing; keep existing
handling for 404 and other non-200 codes but ensure EvalResult, poll_client.get,
poll_resp.status_code, and job_id are referenced in the new branch so the
failure is surfaced immediately.
In `@src/crusades/affinetes/runner.py`:
- Around line 858-898: The polling loop in async with httpx.AsyncClient (around
poll_client.get(poll_url)) incorrectly treats 401/403 as transient and resets
consecutive_errors; update the logic in that block to detect
poll_resp.status_code == 401 or 403 and immediately return an
EvaluationResult.failure with a clear auth error (include job_id, poll_url,
status code and task_id) instead of continuing or resetting consecutive_errors;
keep existing handling for 404 and other non-200 codes unchanged and ensure
consecutive_errors is not reset on auth errors.
---
Duplicate comments:
In `@environments/templar/env.py`:
- Around line 2974-2985: The _verify_auth dependency runs too late (after
FastAPI has already read/parsed the request body for routes like the /evaluate
handler that accepts body: EvaluateRequest), so move the authentication check
out of the dependency into a component that executes before body parsing —
either an HTTP middleware (app.middleware("http")) that inspects the
Authorization header and returns a 401 Response when it doesn't match
_AUTH_TOKEN, or a custom APIRoute (override get_route_handler on your route
class) that performs the same header check before awaiting
request.body()/request.json(); replace usages of Depends(_verify_auth) on the
evaluate endpoint with this early-check mechanism and keep the same rejection
semantics (401, "unauthorized") and logging as in _verify_auth.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 84989510-0604-4bc6-acc9-b4df404da8e8
📒 Files selected for processing (3)
arbos/tester.pyenvironments/templar/env.pysrc/crusades/affinetes/runner.py
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
exploits/basilica_request_injection.md (1)
10-10: Tighten wording to avoid ambiguity in threat description.“Somewhat predictable” weakens the statement in a security writeup. Prefer a concrete phrasing (e.g., “predictable” or “enumerable”).
✍️ Suggested wording tweak
-2. Attacker finds the deployment URL (Basilica URLs are somewhat predictable) +2. Attacker discovers the deployment URL (Basilica URLs are predictable)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@exploits/basilica_request_injection.md` at line 10, Replace the ambiguous phrase "somewhat predictable" in the threat description with a concrete term to avoid weakening the claim; update the sentence in exploits/basilica_request_injection.md (the line containing "Basilica URLs are somewhat predictable") to use either "predictable" or "enumerable" (e.g., "Basilica URLs are predictable" or "Basilica URLs are enumerable") so the threat model is clear and unambiguous.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@exploits/basilica_request_injection.md`:
- Line 34: The sentence "Each deployment gets a unique token that exists only in
memory — never logged, never stored on disk." makes an absolute guarantee;
change it to a scoped or aspirational statement such as "Each deployment gets a
unique token that is kept only in memory in the current implementation; avoid
logging or storing it on disk" or "designed to be kept only in memory — do not
log or persist tokens." Update the line in
exploits/basilica_request_injection.md to remove the absolute "never" wording
and optionally add a short caveat noting this depends on current infra/logging
configuration.
- Around line 17-22: The fenced log snippet in
exploits/basilica_request_injection.md is missing a language tag (triggering
markdownlint MD040); edit the markdown block that contains the POST /evaluate
... torch.OutOfMemoryError lines and add a language identifier (e.g., use
```text) to the opening fence so the block becomes a labeled code block.
---
Nitpick comments:
In `@exploits/basilica_request_injection.md`:
- Line 10: Replace the ambiguous phrase "somewhat predictable" in the threat
description with a concrete term to avoid weakening the claim; update the
sentence in exploits/basilica_request_injection.md (the line containing
"Basilica URLs are somewhat predictable") to use either "predictable" or
"enumerable" (e.g., "Basilica URLs are predictable" or "Basilica URLs are
enumerable") so the threat model is clear and unambiguous.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6a76ed09-1e19-479e-8c1e-462ffd53c3cc
📒 Files selected for processing (2)
README.mdexploits/basilica_request_injection.md
✅ Files skipped from review due to trivial changes (1)
- README.md
There was a problem hiding this comment.
🧹 Nitpick comments (1)
exploits/basilica_request_injection.md (1)
10-10: Consider tightening wording in the attack description.“Somewhat predictable” weakens the statement; a more direct phrasing would read cleaner in incident documentation.
✏️ Suggested wording tweak
-2. Attacker finds the deployment URL (Basilica URLs are somewhat predictable) +2. Attacker discovers the deployment URL (Basilica URLs are predictable enough to enumerate)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@exploits/basilica_request_injection.md` at line 10, The sentence "Attacker finds the deployment URL (Basilica URLs are somewhat predictable)" is weakly worded—replace the parenthetical "somewhat predictable" with a more direct phrase such as "are predictable" (e.g., "Attacker finds the deployment URL (Basilica URLs are predictable)") to tighten the attack description; locate the exact sentence containing "somewhat predictable" and update it accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@exploits/basilica_request_injection.md`:
- Line 10: The sentence "Attacker finds the deployment URL (Basilica URLs are
somewhat predictable)" is weakly worded—replace the parenthetical "somewhat
predictable" with a more direct phrase such as "are predictable" (e.g.,
"Attacker finds the deployment URL (Basilica URLs are predictable)") to tighten
the attack description; locate the exact sentence containing "somewhat
predictable" and update it accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b47a4fd6-8824-4e97-ad26-e7ed22790d91
📒 Files selected for processing (4)
arbos/tester.pyexploits/basilica_request_injection.mdsrc/crusades/__init__.pysrc/crusades/affinetes/runner.py
✅ Files skipped from review due to trivial changes (1)
- src/crusades/init.py
🚧 Files skipped from review as they are similar to previous changes (1)
- arbos/tester.py
Hourly cron-friendly script that snapshots the SQLite DB safely (using Python's sqlite3.backup API) and uploads to Cloudflare R2 with timestamped names. Auto-rotates to keep last 48 backups.
Each Basilica deployment now gets a unique auth token (secrets.token_urlsafe) passed via EVAL_AUTH_TOKEN env var. The eval container rejects any /evaluate or /eval-status request without the correct Authorization header (401). This prevents the OOM attack where a rogue miner pumps fake requests to our deployments. Health endpoint stays open for Basilica probes. Backward compatible — no token means open endpoints (local/dev mode).
Summary by CodeRabbit
New Features
Stability
Documentation