Skip to content

fix(processing): ensure JSON decode errors are caught by retry; add regression tests for JSON mode (#1856)#1857

Merged
jxnl merged 2 commits intomainfrom
devin/1761016877-json-retry-fix
Oct 27, 2025
Merged

fix(processing): ensure JSON decode errors are caught by retry; add regression tests for JSON mode (#1856)#1857
jxnl merged 2 commits intomainfrom
devin/1761016877-json-retry-fix

Conversation

@devin-ai-integration
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot commented Oct 21, 2025

fix(processing): ensure JSON decode errors are caught by retry handler (#1856)

Summary

Fixed a bug where JSONDecodeError in JSON mode was being wrapped in ValueError, causing it to bypass the retry mechanism's validation error handler. This prevented handle_reask_kwargs from being called, so retries didn't receive error feedback to improve subsequent attempts.

Root cause: In _validate_model_from_json, line 95 was wrapping JSONDecodeError in ValueError. The retry handler only catches ValidationError, JSONDecodeError, and InstructorValidationError explicitly, so the wrapped error fell through to the generic Exception handler.

Fix: Changed line 95 from raise ValueError(f"Failed to parse JSON: {e}") from e to just raise, allowing JSONDecodeError to propagate directly to the retry handler.

Changes:

  • Modified instructor/processing/function_calls.py: Removed ValueError wrapping (1 line change)
  • Updated tests/test_json_extraction.py: Fixed existing test + added test for non-strict mode
  • Added tests/test_retry_json_mode.py: New regression tests for JSON/validation error retry behavior

Review & Testing Checklist for Human

⚠️ Risk Level: Medium - Touches critical retry mechanism but changes are minimal and well-tested

  • Test with real API calls: Create a scenario where an LLM returns invalid JSON in JSON mode with max_retries > 0. Verify that:
    • The retry mechanism now properly catches the error
    • handle_reask_kwargs is called between attempts
    • Error feedback is injected into subsequent retry messages
  • Verify no regressions: Check that existing code doesn't rely on ValueError being raised from _validate_model_from_json. Search codebase for catches of ValueError that might be affected.
  • Test across modes: Verify the fix works in both strict and non-strict validation modes (Pydantic raises different exceptions in each case)
  • CI checks: Ensure all provider integration tests pass, not just core tests

Test Plan Recommendation

import instructor
from pydantic import BaseModel
from openai import OpenAI

class User(BaseModel):
    name: str
    age: int

# Force invalid JSON response and verify retry mechanism works
client = instructor.from_openai(OpenAI(), mode=instructor.Mode.JSON)
try:
    # Use a prompt likely to produce invalid JSON or mock the response
    result = client.chat.completions.create(
        model="gpt-4o-mini",
        response_model=User,
        messages=[{"role": "user", "content": "Return invalid JSON"}],
        max_retries=2
    )
except instructor.core.exceptions.InstructorRetryException as e:
    # Verify failed_attempts contains JSONDecodeError or ValidationError
    print(f"Attempts: {e.n_attempts}, Failed: {len(e.failed_attempts)}")
    for attempt in e.failed_attempts:
        print(f"  Attempt {attempt.attempt_number}: {type(attempt.exception)}")

Notes

  • Tests pass locally (core tests and new regression tests)
  • The fix is minimal (1 line) but touches a critical code path
  • Pydantic's behavior differs between strict/non-strict mode, both are now handled correctly
  • Regression tests use mocks, so real-world validation is important

Session: Requested by Jason Liu (@jxnl) - https://app.devin.ai/sessions/9184a01bca9d4e7e97e3aade131cf0ea


Important

Fixes JSON decode error handling in retry mechanism by removing ValueError wrapping in function_calls.py and adds regression tests.

  • Behavior:
    • Fixes bug in _validate_model_from_json in function_calls.py by removing ValueError wrapping for JSONDecodeError, allowing it to be caught by the retry handler.
    • Ensures handle_reask_kwargs is called for retries, providing error feedback for subsequent attempts.
  • Tests:
    • Updates test_json_extraction.py to include tests for JSON decode errors in strict and non-strict modes.
    • Adds test_retry_json_mode.py with regression tests to verify retry behavior for JSONDecodeError and ValidationError.
  • Misc:
    • Minor logging changes in function_calls.py to improve error visibility.

This description was created by Ellipsis for dd09136. You can customize this summary. It will automatically update as commits are pushed.

@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions github-actions Bot added bug Something isn't working enhancement New feature or request python Pull requests that update python code labels Oct 21, 2025
Copy link
Copy Markdown
Contributor

@ellipsis-dev ellipsis-dev Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed everything up to 42801ce in 1 minute and 27 seconds. Click for details.
  • Reviewed 141 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. instructor/processing/function_calls.py:95
  • Draft comment:
    Re-raising the original JSONDecodeError (instead of wrapping it in ValueError) ensures it gets caught by the retry mechanism. Consider adding an inline comment referencing issue #1856 so future maintainers understand why this change was made.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 80% vs. threshold = 85% The comment makes a good point about documenting the rationale behind error handling changes. The change from wrapping to re-raising appears intentional and tied to retry behavior. Without documentation, future maintainers may not understand why this specific error handling approach was chosen and revert it. The comment assumes issue #1856 exists and is relevant, but we don't have access to verify that. The retry mechanism mentioned isn't visible in the code shown. Even without seeing issue #1856, documenting error handling decisions is a good practice, especially when changing from a more explicit approach (wrapping in ValueError) to a less explicit one (re-raising). Keep the comment as it promotes good documentation practices for error handling changes that may not be immediately obvious to future maintainers.
2. tests/test_retry_json_mode.py:23
  • Draft comment:
    The new regression tests correctly simulate retry behavior for both JSONDecodeError (in non-strict mode) and ValidationError (in strict mode). This effectively ensures that the retry handler processes these errors as intended.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 85% This comment is purely informative and does not provide any actionable feedback or suggestions for improvement. It simply praises the existing tests without suggesting any changes or identifying any issues.
3. tests/test_retry_json_mode.py:28
  • Draft comment:
    Typo: In the docstring, "handle_reask_kwargs" appears to be a misspelling. Did you mean "handle_retry_kwargs"?
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 80% vs. threshold = 85% The comment is about a real change in the diff since this is a new file. The suggestion seems reasonable as 'reask' vs 'retry' is inconsistent with the terminology used elsewhere in the file. However, without access to the actual codebase, I can't be 100% certain that 'handle_reask_kwargs' isn't a real function name somewhere. I don't have access to the full codebase to verify if 'handle_reask_kwargs' might be a valid function name in some other file. The comment could be wrong if this is actually the correct function name. While I can't be 100% certain, the consistent use of 'retry' throughout this file and the test name itself strongly suggests this is indeed a typo that should be fixed for clarity and consistency. Keep the comment as it appears to identify a genuine inconsistency in terminology that should be fixed for better code clarity and maintenance.

Workflow ID: wflow_mIBQoYgdlW1x5SVX

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages Bot commented Oct 21, 2025

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Preview URL Updated (UTC)
✅ Deployment successful!
View logs
instructor b4bc8a5 Commit Preview URL

Branch Preview URL
Oct 27 2025, 07:05 PM

Copy link
Copy Markdown
Contributor

@ellipsis-dev ellipsis-dev Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed dd09136 in 28 seconds. Click for details.
  • Reviewed 13 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. tests/test_retry_json_mode.py:10
  • Draft comment:
    Good cleanup: the unused 'patch' import was removed. This adheres to DRY principles; ensure that no test is relying on it.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 85% None

Workflow ID: wflow_imGJV6cqVDN0pOdw

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@jxnl jxnl enabled auto-merge (rebase) October 21, 2025 13:01
devin-ai-integration Bot and others added 2 commits October 27, 2025 14:58
…tches it; add regression tests for JSON/Validation errors in retry (closes #1856)

Co-Authored-By: Jason Liu <[email protected]>
@jxnl jxnl force-pushed the devin/1761016877-json-retry-fix branch from dd09136 to b4bc8a5 Compare October 27, 2025 18:58
@jxnl jxnl merged commit 3c01abc into main Oct 27, 2025
1 check failed
@jxnl jxnl deleted the devin/1761016877-json-retry-fix branch October 27, 2025 18:58
@jxnl jxnl added the status:pending-merge Related PR is pending merge label Oct 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working enhancement New feature or request python Pull requests that update python code status:pending-merge Related PR is pending merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant