fix(openai): remove duplicate schema from messages in JSON_SCHEMA mode#1761
Merged
jxnl merged 6 commits into567-labs:mainfrom Sep 11, 2025
Merged
Conversation
Contributor
There was a problem hiding this comment.
Important
Looks good to me! 👍
Reviewed everything up to f3af7fb in 1 minute and 14 seconds. Click for details.
- Reviewed
44lines of code in1files - Skipped
0files when reviewing. - Skipped posting
2draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. instructor/providers/openai/utils.py:436
- Draft comment:
Good change: The 'if mode != Mode.JSON_SCHEMA' condition properly avoids adding duplicate schema information in JSON_SCHEMA mode. Consider adding an inline comment to clarify this design decision for future maintainers. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 85% The comment is about documenting a design decision in the code. While the suggestion is reasonable, the function namehandle_json_modesand the context make it fairly clear what's happening. The code is not complex enough to warrant additional inline documentation. The comment also starts with "Good change:" which is not actionable. The design decision could be non-obvious to new contributors. Documentation can help prevent future regressions. The function name and surrounding context provide sufficient clarity. The code change is straightforward and the reason for skipping JSON_SCHEMA mode is evident from the mode handling logic throughout the file. Delete the comment as it suggests adding documentation that isn't necessary given the clear context and straightforward code change.
2. instructor/providers/openai/utils.py:437
- Draft comment:
Note: The code assumes new_kwargs['messages'] is non-empty. It might be prudent to add a guard or document this requirement to avoid potential IndexError. - Reason this comment was not posted:
Confidence changes required:80%<= threshold85%None
Workflow ID: wflow_bjqLYxK1hjtcwoBW
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Contributor
Author
|
Hi @jxnl, just following up on this PR. It’s a small change that removes redundant schema duplication in JSON_SCHEMA mode, reducing token usage without affecting JSON or MD_JSON modes. Would appreciate it if you could take a look when you get a chance. |
Contributor
Author
|
Hi @jxnl, quick follow-up. I see auto-merge is enabled, but CI is blocking; logs show OIDC token errors in the claude-review job and missing provider API keys in provider tests. Seems unrelated to this diff, but happy to tweak if needed. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Removes redundant schema information from messages when using
JSON_SCHEMAmode.Why This Change?
JSON mode (
response_format: {"type": "json_object"}) - OpenAI docs require explicit JSON instruction in messages since no schema is provided in response_format.https://platform.openai.com/docs/guides/structured-outputs?api-mode=chat#json-mode
JSON_SCHEMA mode (
response_format: {"type": "json_schema", ...}) - Schema is already provided in response_format. Adding the same schema to messages creates redundancy, increases token consumption unnecessarily, and provides no additional value to the model.Changes
JSON_SCHEMAmode: No schema added to messages (schema already in response_format)JSONandMD_JSONmodes: Unchanged behavior (still add schema to messages as required)Important
Removes redundant schema from messages in
JSON_SCHEMAmode inhandle_json_modes()inutils.py, reducing token consumption.handle_json_modes()inutils.py,JSON_SCHEMAmode no longer adds schema to messages, as it's already inresponse_format.JSONandMD_JSONmodes remain unchanged, still adding schema to messages.JSON_SCHEMAmode by not duplicating schema in messages.This description was created by
for f3af7fb. You can customize this summary. It will automatically update as commits are pushed.