letta_judge grader with agent_id validates but fails at runtime #156

@jasonlarkin


Problem Summary

The letta_judge grader with the agent_id parameter validates successfully but fails at runtime with:

FileNotFoundError: Agent file not found: /path/to/.venv/lib/python3.11/site-packages/letta_evals/graders/letta-evals-judge-agent.af

Expected Behavior

When using kind: letta_judge with agent_id in suite.yaml, the grader should use the specified Letta Cloud agent (identified by agent_id) to perform evaluations, leveraging Letta's built-in LLM access without requiring an OpenAI API key.

Actual Behavior

The suite configuration validates successfully:

$ letta-evals validate evals/letta/suite.yaml
✓ Suite 'casamigo-buyer-agent-eval' is valid

But at runtime, it fails because the code looks for a bundled agent file (letta-evals-judge-agent.af) that doesn't exist, instead of using the agent_id provided.

Configuration

graders:
  casamigo_rubric_grader:
    kind: letta_judge
    agent_id: agent-10d4286d-74a9-4fc6-82ff-a751bda72449
    prompt_path: rubric.md
    extractor: last_assistant

Environment

  • letta-evals version: 0.9.0
  • Python: 3.11
  • Platform: Linux (WSL)

Error Traceback

File "/mnt/c/Users/jason/Documents/casamigo/casamigo-letta/backend/.venv/lib/python3.11/site-packages/letta_evals/graders/agent_judge.py", line 119, in _validate_agent_file
    raise FileNotFoundError(f"Agent file not found: {self.agent_file}")
FileNotFoundError: Agent file not found: /mnt/c/Users/jason/Documents/casamigo/casamigo-letta/backend/.venv/lib/python3.11/site-packages/letta_evals/graders/letta-evals-judge-agent.af

Evidence

  1. Configuration validation passes: The suite.yaml with agent_id validates successfully, indicating the schema accepts this parameter:

    $ letta-evals validate evals/letta/suite.yaml
    ✓ Suite 'casamigo-buyer-agent-eval' is valid

  2. Documentation indicates support: Official documentation suggests letta_judge with agent_id is the correct approach to use Letta's built-in LLM access without requiring OpenAI API keys.

  3. Runtime implementation behavior: Despite validation passing, the runtime code path in AgentJudgeGrader._validate_agent_file() always looks for a bundled agent file (letta-evals-judge-agent.af) regardless of whether agent_id is provided. The code appears to:

    • Always call _validate_agent_file() during initialization
    • Look for a hardcoded file path (letta-evals-judge-agent.af)
    • Not check for or use the provided agent_id parameter

  4. Error traceback location: The error occurs in agent_judge.py line 119 in _validate_agent_file(), which is called during initialization, suggesting the file validation happens unconditionally.

Steps to Reproduce

  1. Create a Letta agent to act as the grader (e.g., with a submit_grade tool)
  2. Configure suite.yaml with kind: letta_judge and agent_id: <your-agent-id>
  3. Run letta-evals validate suite.yaml - ✅ validates successfully
  4. Run letta-evals run suite.yaml - ❌ fails with FileNotFoundError

Expected Fix

The AgentJudgeGrader should:

  1. When agent_id is provided, use that agent via the Letta Cloud API
  2. When agent_file is provided, use the local .af file
  3. Not look for a bundled letta-evals-judge-agent.af file when agent_id is specified
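
The three points above could be sketched roughly as follows. This is only an illustration of the branching I would expect, not the actual letta-evals code: resolve_judge_source, DEFAULT_AGENT_FILE, and the returned tuple shape are hypothetical names I made up, and I am assuming that passing both agent_id and agent_file should be rejected.

```python
from pathlib import Path
from typing import Optional, Union

# Hypothetical stand-in for the bundled judge file; the real package
# presumably resolves this relative to its own graders/ directory.
DEFAULT_AGENT_FILE = Path("letta-evals-judge-agent.af")


def resolve_judge_source(
    agent_id: Optional[str] = None,
    agent_file: Optional[Union[str, Path]] = None,
    default_file: Path = DEFAULT_AGENT_FILE,
) -> tuple[str, str]:
    """Decide which judge to use (hypothetical helper, not the library API).

    Returns ("agent_id", <id>) or ("agent_file", <path>).
    """
    if agent_id and agent_file:
        # Assumption: the two options are mutually exclusive.
        raise ValueError("Specify either agent_id or agent_file, not both")

    if agent_id:
        # An existing agent was requested: skip all file validation.
        return ("agent_id", agent_id)

    # Only fall back to a file (explicit or bundled) when no agent_id is given.
    path = Path(agent_file) if agent_file else default_file
    if not path.is_file():
        raise FileNotFoundError(f"Agent file not found: {path}")
    return ("agent_file", str(path))
```

With this shape, resolve_judge_source(agent_id="agent-10d4286d-74a9-4fc6-82ff-a751bda72449") returns without touching the filesystem, so a missing bundled .af file would only matter when no agent_id is configured.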

Related Context

The model_judge grader bypasses Letta's built-in GPT-4.1 access and requires a direct OpenAI API key, even though Letta provides model access. The letta_judge (agent-as-judge) approach is documented as the way to use Letta's built-in LLM access, but the runtime implementation appears to support only agent_file and not agent_id.
