Skip to content

fix: Gemini HARM_CATEGORY_JAILBREAK and Anthropic tool_result blocks#1867

Merged
jxnl merged 1 commit intomainfrom
fix/gemini-anthropic-critical-fixes
Oct 27, 2025
Merged

fix: Gemini HARM_CATEGORY_JAILBREAK and Anthropic tool_result blocks#1867
jxnl merged 1 commit intomainfrom
fix/gemini-anthropic-critical-fixes

Conversation

@jxnl
Copy link
Copy Markdown
Collaborator

@jxnl jxnl commented Oct 27, 2025

Summary

This PR combines critical fixes from PR #1863 and PR #1832.

Changes

1. Handle missing HARM_CATEGORY_JAILBREAK in google.genai

Problem: The HARM_CATEGORY_JAILBREAK category is only available in Vertex AI, not in the google.genai API.

Solution:

  • Use hasattr check for HARM_CATEGORY_JAILBREAK compatibility
  • Ensures code works across different google.genai versions
  • Fixes 400 errors when using Gemini API safety settings

Files: instructor/providers/gemini/utils.py, tests/llm/test_genai/test_utils.py

2. Ensure tool_result blocks for all Anthropic tool_use responses

Problem: When validation fails in ANTHROPIC_TOOLS mode, retry mechanism creates malformed messages.

Solution:

  • Simplified condition to always create tool_result block for any tool_use
  • Removes check for exception type and title matching
  • Fixes 400 error from Anthropic API

Files: instructor/providers/anthropic/utils.py

Testing

Run tests:

uv run pytest tests/llm/test_genai/test_utils.py -v

Credits

Co-authored-by: DaveOkpare DaveOkpare@users.noreply.github.com
Co-authored-by: Claude noreply@anthropic.com


Important

Fixes handling of HARM_CATEGORY_JAILBREAK in google.genai and ensures tool_result blocks for Anthropic tool_use responses.

  • Behavior:
    • Fix missing HARM_CATEGORY_JAILBREAK handling in update_genai_kwargs() in utils.py for google.genai.
    • Ensure tool_result blocks are created for all tool_use responses in reask_anthropic_tools() in utils.py.
  • Testing:
    • Updated tests in test_utils.py to verify HARM_CATEGORY_JAILBREAK exclusion and tool_result block creation.
  • Misc:
    • Simplified condition checks in reask_anthropic_tools() to remove exception type and title matching.

This description was created by Ellipsis for 80970be. You can customize this summary. It will automatically update as commits are pushed.

Combines critical fixes from PR #1863 and PR #1832:

1. Handle missing HARM_CATEGORY_JAILBREAK attribute in google.genai
   - Use hasattr check for compatibility across google.genai versions
   - HARM_CATEGORY_JAILBREAK is Vertex AI only, not in google.genai API
   - Fixes 400 errors with Gemini API safety settings

2. Ensure tool_result blocks for all Anthropic tool_use responses
   - Simplifies overly restrictive condition in reask_anthropic_tools
   - Always creates tool_result block when tool_use is present
   - Fixes 400 error during validation retries (#1829)

Co-authored-by: DaveOkpare <DaveOkpare@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Oct 27, 2025

Claude finished @jxnl's task —— View job


Comprehensive PR Review ✅

PR #1867: fix: Gemini HARM_CATEGORY_JAILBREAK and Anthropic tool_result blocks

Overall Assessment

This PR addresses two critical compatibility issues that were causing 400 errors in production. The fixes are well-implemented, focused, and include appropriate test coverage.


Fix 1: Gemini HARM_CATEGORY_JAILBREAK Handling

Problem Addressed: The HARM_CATEGORY_JAILBREAK enum is only available in Vertex AI, not in the standard google.genai API, causing 400 errors when safety settings were configured.

Solution Review:

  • Excellent approach: Uses hasattr() check for compatibility across different google.genai versions
  • Clean implementation: Creates an excluded_categories set and conditionally adds the problematic category
  • Backward compatible: Code works with both old and new versions of the SDK

Code Quality: ⭐⭐⭐⭐⭐

# Before: Hard-coded exclusion
if c != HarmCategory.HARM_CATEGORY_UNSPECIFIED

# After: Dynamic compatibility check  
excluded_categories = {HarmCategory.HARM_CATEGORY_UNSPECIFIED}
if hasattr(HarmCategory, 'HARM_CATEGORY_JAILBREAK'):
    excluded_categories.add(HarmCategory.HARM_CATEGORY_JAILBREAK)

Fix 2: Anthropic tool_result Blocks

Problem Addressed: Complex validation logic in reask_anthropic_tools() was preventing proper tool_result block creation, leading to malformed messages and 400 errors from Anthropic API.

Solution Review:

  • Simplified logic: Removed overly restrictive condition checking exception.title == content.name
  • Robust behavior: Now creates tool_result blocks for ANY tool_use, ensuring API compliance
  • Better error handling: Eliminates edge cases where validation errors weren't properly communicated

Code Quality: ⭐⭐⭐⭐⭐

# Before: Complex, fragile condition
if (content.type == "tool_use" 
    and isinstance(exception, (ValidationError, InstructorValidationError))
    and content.name == exception.title):

# After: Simple, reliable condition
if content.type == "tool_use":

Test Coverage Assessment

Strengths:

  • Comprehensive Gemini tests: Updated tests in test_utils.py properly mirror the production logic
  • Test pattern consistency: All three test functions use identical hasattr() checks
  • Edge case coverage: Tests verify behavior both with and without custom safety settings

Missing Tests:

  • No direct unit tests for the Anthropic reask_anthropic_tools() simplification
  • However, this is acceptable as the change is defensive and reduces complexity rather than adding new behavior

🔒 Security Assessment

No security concerns identified:

  • Both fixes are defensive in nature, improving compatibility
  • Changes don't expose new attack vectors
  • Safety settings logic maintains security boundaries
  • Anthropic tool handling remains within API specifications

Performance Impact

Positive performance characteristics:

  • Gemini fix: Minimal overhead from hasattr() check (one-time cost)
  • Anthropic fix: Actually improves performance by removing unnecessary condition checks
  • No additional API calls or computational complexity introduced

🏆 Code Quality & Best Practices

Excellent adherence to best practices:

  • DRY principle: Test logic properly duplicates production logic for verification
  • Error handling: Graceful degradation when SDK features aren't available
  • Readability: Clear, self-documenting code with helpful comments
  • Maintainability: Changes are isolated and don't affect other functionality

📝 Minor Suggestions

  1. Consider extracting constant: The excluded_categories logic could be extracted to a module-level constant for reusability, but current approach is fine for scope.

  2. Documentation: Both fixes include clear inline comments explaining the rationale.


Final Recommendation: APPROVE

This PR successfully addresses two critical production issues with:

  • ✅ Correct technical solutions
  • ✅ Appropriate test coverage
  • ✅ No security concerns
  • ✅ Positive performance impact
  • ✅ Clean, maintainable code
  • ✅ Backward compatibility

The fixes are well-scoped, defensive in nature, and directly address the root causes of the 400 errors. Ready for merge.


@github-actions github-actions Bot added bug Something isn't working python Pull requests that update python code size:M This PR changes 30-99 lines, ignoring generated files. labels Oct 27, 2025
Copy link
Copy Markdown
Contributor

@ellipsis-dev ellipsis-dev Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed everything up to 80970be in 1 minute and 7 seconds. Click for details.
  • Reviewed 76 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. instructor/providers/anthropic/utils.py:159
  • Draft comment:
    Simplified tool_use check now drops exception-type and title matching. Ensure this broader handling is intentional.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 85% The comment is asking the PR author to ensure that the broader handling of the tool_use check is intentional. This falls under the rule of not asking the author to confirm their intention or ensure behavior is intended. Therefore, this comment should be removed.
2. instructor/providers/gemini/utils.py:252
  • Draft comment:
    Using hasattr to conditionally add HARM_CATEGORY_JAILBREAK ensures compatibility. Consider centralizing this exclusion logic if it’s reused.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
3. tests/llm/test_genai/test_utils.py:38
  • Draft comment:
    Test logic duplicates the excluded_categories computation. Consider factoring into a helper for DRY, though tests remain clear.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 85% None

Workflow ID: wflow_pWtDYANCofCZqMcO

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@cloudflare-workers-and-pages
Copy link
Copy Markdown

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Preview URL Updated (UTC)
✅ Deployment successful!
View logs
instructor 80970be Commit Preview URL

Branch Preview URL
Oct 27 2025, 06:46 PM

@jxnl jxnl enabled auto-merge (squash) October 27, 2025 18:54
@jxnl jxnl merged commit 02f4b0b into main Oct 27, 2025
11 of 15 checks passed
@jxnl jxnl deleted the fix/gemini-anthropic-critical-fixes branch October 27, 2025 18:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working python Pull requests that update python code size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant