Conversation
Fixed two failing CI workflows:
- Scheduled Release: Added --all-extras flag to ensure ruff and dev dependencies are installed
- Weekly Tests: Replaced Poetry setup with uv (project no longer uses poetry.lock)

Both workflows now use uv sync --all-extras --dev for consistent dependency installation. Fixes workflow runs #19029300998 and #19004677288

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Added @pytest.mark.skipif decorators to test_summary_extraction and test_summary_extraction_async in tests/dsl/test_partial.py to skip these tests when OPENAI_API_KEY is not set. These tests were causing CI failures in the "Core Tests" job because they make real OpenAI API calls but weren't being filtered by the -k 'not openai' pattern (since they're in the dsl directory). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
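The skip guard described above follows the standard pytest pattern; a minimal sketch (test body elided, function name taken from the commit message):

```python
import os

import pytest


@pytest.mark.skipif(
    not os.environ.get("OPENAI_API_KEY"),
    reason="OPENAI_API_KEY not set; test makes real OpenAI API calls",
)
def test_summary_extraction():
    # Real test body elided; it would call the OpenAI API here.
    pass
```

Unlike the `-k 'not openai'` filter, the marker travels with the test itself, so it applies regardless of which directory the test lives in.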
Applied ruff formatting to imports. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Fixed multiple issues with async streaming:
1. partial.py: Changed from_streaming_response_async to properly yield from async generators instead of returning them directly
2. response.py: Return async generator directly instead of collecting into a list, which was breaking async iteration
3. test_process_response.py: Updated bedrock test to expect ValueError instead of NotImplementedError (matches actual implementation)

These fixes resolve the "name 'cast' is not defined" and "'async for' requires an object with __aiter__ method" errors in streaming tests. Fixes #236

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
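The distinction behind fix 1 is easy to reproduce in isolation. A minimal sketch (hypothetical names, not the actual instructor code): an `async def` that `return`s an async generator hands callers a coroutine, which has no `__aiter__`, while re-yielding the items makes the function itself an async generator.

```python
import asyncio
from collections.abc import AsyncGenerator


async def chunks() -> AsyncGenerator[int, None]:
    for i in range(3):
        yield i


# Broken pattern: calling broken() produces a coroutine, so
# `async for item in broken()` raises
# "'async for' requires an object with __aiter__ method".
async def broken():
    return chunks()


# Fixed pattern: re-yield each item, making this function itself
# an async generator that callers can iterate directly.
async def fixed() -> AsyncGenerator[int, None]:
    async for item in chunks():
        yield item


async def main() -> list[int]:
    return [item async for item in fixed()]


print(asyncio.run(main()))  # [0, 1, 2]
```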
Added try/except wrapper to test_user_extraction_async to properly skip tests when API keys are missing, matching the sync test behavior. This prevents test failures when providers are unavailable locally. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
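The skip-on-missing-credentials pattern described here can be sketched as follows; `make_provider_client` is a hypothetical stand-in for the real client construction, which raises when keys are absent:

```python
import pytest


def make_provider_client():
    # Hypothetical stand-in: the real code would construct a provider
    # client, raising if API keys are missing from the environment.
    raise RuntimeError("OPENAI_API_KEY not set")


def test_user_extraction():
    try:
        client = make_provider_client()
    except Exception as e:
        pytest.skip(f"Provider unavailable: {e}")
    # Real extraction assertions would run against `client` here.
```

`pytest.skip` raises a control-flow exception, so the test reports as skipped rather than failed when the provider cannot be reached.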
Applied ruff auto-fixes for deprecated typing imports and formatting. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Deploying with

| Status | Name | Latest Commit | Updated (UTC) |
|---|---|---|---|
| ❌ Deployment failed (View logs) | instructor | 2843267 | Nov 03 2025, 08:10 PM |
Important
Looks good to me! 👍
Reviewed everything up to c54ac2d in 1 minute and 33 seconds.
- Reviewed 177 lines of code in 7 files
- Skipped 0 files when reviewing
- Skipped posting 7 draft comments; view those below
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. .github/workflows/evals.yml:19
- Draft comment:
Migrating from poetry to uv looks good. Verify that caching and dependency management are fully covered with 'enable-cache: true'. - Reason this comment was not posted:
Comment did not seem useful. Confidence it is useful = 0% <= threshold 85%. This comment is asking the PR author to verify something, which is against the rules. It doesn't provide a specific suggestion or point out a specific issue with the code. Therefore, it should be removed.
2. .github/workflows/scheduled-release.yml:31
- Draft comment:
The scheduled-release workflow uses setup-uv@v3 while evals uses v4. Consider aligning versions for consistency. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. instructor/dsl/partial.py:181
- Draft comment:
The async streaming response handling now yields items via 'async for', ensuring proper async iteration. Confirm downstream consumers handle the async generator. - Reason this comment was not posted:
Comment did not seem useful. Confidence it is useful = 0% <= threshold 85%. This comment is asking the PR author to confirm that downstream consumers handle the async generator, which violates the rule against asking the author to confirm their intention or ensure behavior. It does not provide a specific suggestion or point out a specific issue.
4. instructor/processing/response.py:232
- Draft comment:
process_response_async now returns an async generator directly instead of collecting items into a list. Update documentation and downstream usage if necessary. - Reason this comment was not posted:
Comment did not seem useful. Confidence it is useful = 50% <= threshold 85%. The comment suggests updating documentation and downstream usage due to a change in the return type of a function. This aligns with the rule that if library code changes, documentation should be updated. However, it also suggests checking downstream usage, which is not allowed by the rules. The comment is partially useful because it highlights the need for documentation updates.
5. tests/dsl/test_partial.py:141
- Draft comment:
Adding @pytest.mark.skipif for tests needing OPENAI_API_KEY is good practice to avoid failures in environments without keys. - Reason this comment was not posted:
Comment did not seem useful. Confidence it is useful = 0% <= threshold 85%. This comment is purely informative and does not provide any actionable feedback or suggestions for improvement. It simply acknowledges a good practice without offering any specific guidance or questions.
6. tests/test_auto_client.py:56
- Draft comment:
Wrapping provider client calls in try/except blocks to skip tests when unavailable is pragmatic for CI environments, but ensure this behavior is intentional. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
7. tests/test_process_response.py:125
- Draft comment:
The expected error type in the Bedrock invalid content test was updated from NotImplementedError to ValueError, along with a new error message. Ensure this change is consistent with provider implementation. - Reason this comment was not posted:
Comment did not seem useful. Confidence it is useful = 0% <= threshold 85%. This comment is asking the author to ensure consistency with the provider implementation, which is against the rules. It doesn't provide a specific suggestion or ask for a specific test to be written. It also doesn't address any of the additional rules provided.
Workflow ID: wflow_aEAiBuvp269Fgn6S
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Important
Looks good to me! 👍
Reviewed 309ca1a in 2 minutes and 9 seconds.
- Reviewed 87 lines of code in 5 files
- Skipped 0 files when reviewing
- Skipped posting 4 draft comments; view those below
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. tests/llm/test_genai/test_utils.py:38
- Draft comment:
Repeated logic for constructing 'excluded_categories' is used in multiple tests. Consider refactoring this setup into a helper to reduce duplication. - Reason this comment was not posted:
Confidence changes required: 80% <= threshold 85%. None
2. tests/llm/test_genai/test_utils.py:141
- Draft comment:
The test 'test_update_genai_kwargs_preserves_original' uses a shallow copy of kwargs. If nested dictionaries are mutated by update_genai_kwargs, consider using deepcopy to fully protect original inputs. - Reason this comment was not posted:
Confidence changes required: 70% <= threshold 85%. None
3. tests/llm/test_genai/test_utils.py:165
- Draft comment:
Test for 'thinking_config' pass-through is clear. Consider adding an edge case where nested keys inside thinking_config are present, to ensure they are retained as expected. - Reason this comment was not posted:
Confidence changes required: 50% <= threshold 85%. None
4. tests/llm/test_genai/test_utils.py:112
- Draft comment:
In 'test_update_genai_kwargs_none_values', the test verifies that keys with None values are omitted. Adding an inline comment explaining why these keys should be absent would improve clarity for future maintainers. - Reason this comment was not posted:
Confidence changes required: 30% <= threshold 85%. None
Workflow ID: wflow_pdzqS73ONlGEyV5Y
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```diff
         and stream
     ):
-        # from_streaming_response_async returns an AsyncGenerator
-        # Collect all yielded values into a list
-        tasks = []
-        async for task in response_model.from_streaming_response_async(  # type: ignore[arg-type]
+        # Yield each item as it comes in
+        # Note: response type varies by mode (ChatCompletion, AsyncGenerator, etc.)
+        return response_model.from_streaming_response_async(  # type: ignore[return-value]
             cast(AsyncGenerator[Any, None], response),  # type: ignore[arg-type]
```
Async iterable streaming returns coroutine instead of async generator
The streaming branch now returns response_model.from_streaming_response_async(...) directly, but IterableBase.from_streaming_response_async still returns a coroutine (it only returns tasks_from_chunks_async; see instructor/dsl/iterable.py). When AsyncInstructor.create_iterable(..., stream=True) awaits the retry wrapper and then executes async for item in await self.create_fn(...), the awaited value is this coroutine rather than an async iterable, so the call will raise TypeError: 'coroutine' object is not async iterable and streaming iterables no longer work. Either await the coroutine before returning or convert IterableBase.from_streaming_response_async into an async generator, similar to the partial path.
Useful? React with 👍 / 👎.
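The coroutine-vs-generator mismatch Codex describes can be sketched in isolation (hypothetical names, not the actual `IterableBase` code). The coroutine-returning form must be awaited before iteration; the async-generator form can be iterated directly, which is the shape the partial path was converted to:

```python
import asyncio
from collections.abc import AsyncGenerator


async def tasks_from_chunks_async() -> AsyncGenerator[str, None]:
    # Stand-in for the chunk parser; yields extracted tasks.
    for task in ("a", "b"):
        yield task


# Coroutine-returning form (the shape described for IterableBase):
# callers must `await` the call before they can iterate the result.
async def from_streaming_coroutine():
    return tasks_from_chunks_async()


# Async-generator form (the fix applied to the partial path):
# callers can use `async for` on the call directly.
async def from_streaming_generator() -> AsyncGenerator[str, None]:
    async for task in tasks_from_chunks_async():
        yield task


async def main() -> tuple[list[str], list[str]]:
    via_await = [t async for t in await from_streaming_coroutine()]
    via_gen = [t async for t in from_streaming_generator()]
    return via_await, via_gen


print(asyncio.run(main()))  # (['a', 'b'], ['a', 'b'])
```

Mixing the two shapes is what produces `TypeError: 'coroutine' object is not async iterable`: `async for` over the un-awaited `from_streaming_coroutine()` call fails, because the coroutine itself has no `__aiter__`.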
Fixed async streaming issues:
1. iterable.py: Changed from_streaming_response_async to properly yield from async generators instead of returning them
2. test_gemini: Fixed test modes from GEMINI_JSON to GENAI_JSON for google-genai provider compatibility

This completes the async streaming fixes for both Partial and Iterable base classes, resolving all "async for requires __aiter__" errors.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Important
Looks good to me! 👍
Reviewed 5163860 in 1 minute and 38 seconds.
- Reviewed 41 lines of code in 2 files
- Skipped 0 files when reviewing
- Skipped posting 3 draft comments; view those below
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. instructor/dsl/iterable.py:56
- Draft comment:
Good refactor for async iteration; using 'async for' with yield improves streaming. Ensure similar patterns elsewhere are consistent. - Reason this comment was not posted:
Confidence changes required: 0% <= threshold 85%. None
2. tests/llm/test_gemini/test_multimodal_content.py:17
- Draft comment:
Updated mode from GEMINI_JSON to GENAI_JSON aligns with new API naming. - Reason this comment was not posted:
Confidence changes required: 0% <= threshold 85%. None
3. tests/llm/test_gemini/test_multimodal_content.py:38
- Draft comment:
Consistent mode update to GENAI_JSON for multi-message test. - Reason this comment was not posted:
Confidence changes required: 0% <= threshold 85%. None
Workflow ID: wflow_sqnOKQU1cKvwbbyV
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
1. test_gemini/test_multimodal_content.py: Changed GENAI_JSON (non-existent) to GENAI_TOOLS (correct mode for google-genai provider)
2. test_genai/test_decimal.py: Updated decimal validators to handle float/int inputs in addition to strings, since GenAI returns numeric values as floats

These fixes resolve AttributeError and flaky decimal validation issues.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
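A validator of the kind described in item 2 can be sketched like this, assuming pydantic v2; the `Receipt` model and field name are illustrative, not the actual test code:

```python
from decimal import Decimal

from pydantic import BaseModel, field_validator


class Receipt(BaseModel):
    total: Decimal

    # Accept the float/int values GenAI returns as well as strings,
    # normalizing everything to Decimal before field validation runs.
    @field_validator("total", mode="before")
    @classmethod
    def coerce_total(cls, v):
        if isinstance(v, (int, float)):
            # Going through str() avoids binary-float artifacts like
            # Decimal(12.5000000000000001...).
            return Decimal(str(v))
        return v


print(repr(Receipt(total=12.5).total))  # Decimal('12.5')
```

String inputs pass through untouched and are coerced by pydantic's own `Decimal` handling, so the validator only widens the accepted input types.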
Changed ty check to only check instructor/ directory instead of the entire project. This excludes docs/tutorials and docs/hooks which contain example code and Jupyter notebooks that aren't meant to be strictly type-checked. Reduces diagnostics from 6413 to 68 (all in actual codebase). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Important
Looks good to me! 👍
Reviewed c8aea5d in 1 minute and 26 seconds.
- Reviewed 10 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 1 draft comment; view it below
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. .github/workflows/pyright.yml:27
- Draft comment:
Changed type check command to 'uv run ty check instructor/'. Confirm that targeting only 'instructor/' is intentional and covers all necessary code. Consider adding comments if this exclusion of other directories is deliberate. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 85%. The comment is asking the PR author to "confirm" the intention and "consider adding comments" - both of these are explicitly prohibited by the rules. The rules state "Do NOT ask the PR author to confirm their intention, to explain, to double-check things, to ensure the behavior is intended" and "Do NOT comment unless there is clearly a code change required." This comment doesn't identify a clear bug or issue - it's purely asking for confirmation and suggesting documentation. The change appears deliberate (targeting a specific directory), and if the author made this change, they likely had a reason. There's no strong evidence of an actual problem here. Perhaps narrowing the type check scope could miss issues in other parts of the codebase like tests or examples, which could be a legitimate concern. The comment might be trying to prevent a real issue where type checking coverage is reduced. While reducing type check coverage could theoretically be an issue, the comment doesn't demonstrate that this is actually a problem - it just asks for confirmation. If there were other directories that clearly needed type checking, the comment should have stated that directly. The rules explicitly prohibit asking authors to confirm intentions or explain changes. This comment should be deleted because it violates the rule against asking the PR author to confirm their intention and doesn't identify a clear code issue. It's a verification request rather than pointing out a definite problem.
Workflow ID: wflow_wmMwkMnVUgYqgud5
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Fixed Field() usage to properly pass description parameter instead of
using it as a default value. This was causing the LLM to sometimes
return the description string "User's first name" instead of extracting
the actual name.
Also added max_retries=3 to test_parse_validated_user_sync to handle
occasional validation failures.
Changes:
- Field("User's first name") → Field(description="User's first name")
- Added max_retries=3 to validated user test
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
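The `Field()` bug described above is visible directly in the generated JSON schema. A minimal sketch (hypothetical `User` model, not the actual test code): the first positional argument of `Field` is the *default value*, so `Field("User's first name")` embeds the description string as a default the LLM can echo back, while `Field(description=...)` attaches it as documentation and keeps the field required.

```python
from pydantic import BaseModel, Field


class User(BaseModel):
    # Buggy form (commented out): sets a default value, not a description.
    # first_name: str = Field("User's first name")

    # Fixed form: the text becomes schema documentation instead.
    first_name: str = Field(description="User's first name")


schema = User.model_json_schema()
print(schema["properties"]["first_name"]["description"])  # User's first name
print("first_name" in schema["required"])  # True
```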
Important
Looks good to me! 👍
Reviewed c4a83ff in 5 minutes and 25 seconds.
- Reviewed 59 lines of code in 2 files
- Skipped 0 files when reviewing
- Skipped posting 3 draft comments; view those below
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. tests/llm/test_gemini/test_multimodal_content.py:17
- Draft comment:
Changed provider mode to GENAI_TOOLS. Ensure that tests and API docs reflect the new mode. Also consider correcting 'compatability' to 'compatibility' for clarity. - Reason this comment was not posted:
Confidence changes required: 33% <= threshold 85%. None
2. tests/llm/test_genai/test_decimal.py:17
- Draft comment:
Enhanced validators to handle str, float, and int. Consider extracting a common conversion helper and watch for potential float precision issues. - Reason this comment was not posted:
Confidence changes required: 33% <= threshold 85%. None
3. tests/llm/test_gemini/test_multimodal_content.py:15
- Draft comment:
Typographical error: 'test_audio_compatability_list' appears to contain a typo. Consider renaming it to 'test_audio_compatibility_list'. - Reason this comment was not posted:
Comment was on unchanged code.
Workflow ID: wflow_uJcletBHcPP1nYpE
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Important
Looks good to me! 👍
Reviewed 2d713e4 in 56 seconds.
- Reviewed 30 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 3 draft comments; view those below
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. tests/llm/test_cohere/test_json_schema.py:11
- Draft comment:
Good change: using Field(description=…) prevents misinterpretation as a default value. - Reason this comment was not posted:
Confidence changes required: 0% <= threshold 85%. None
2. tests/llm/test_cohere/test_json_schema.py:54
- Draft comment:
Updated Field usage for ValidatedUser improves schema documentation. - Reason this comment was not posted:
Confidence changes required: 0% <= threshold 85%. None
3. tests/llm/test_cohere/test_json_schema.py:80
- Draft comment:
Added max_retries=3 to test_parse_validated_user_sync to explicitly handle retries. - Reason this comment was not posted:
Confidence changes required: 0% <= threshold 85%. None
Workflow ID: wflow_PME0Qx3UjJQGeGJ3
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Fixed several type checking issues to reduce diagnostics:
1. batch/models.py: Import TypeAlias from typing_extensions for Python 3.9-3.11 compatibility
2. bedrock/client.py: Use BaseClient instead of boto3.client as type annotation
3. bedrock/utils.py: Add type ignore comments for requests.exceptions (type stub issue)
4. test_anthropic/test_reasoning.py: Make reasoning test more reliable with temperature=0 and better prompt

Reduces type diagnostics from 6413 to 61 (remaining are in xai provider).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
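The `TypeAlias` change in item 1 is the standard compatibility pattern: `typing.TypeAlias` only exists on Python 3.10+, while `typing_extensions` backports it to 3.9 and re-exports it on newer versions, so one import covers the whole supported range. A sketch with a hypothetical alias (not the actual `models.py` definition):

```python
# typing.TypeAlias was added in Python 3.10; typing_extensions backports
# it to 3.9 and re-exports the stdlib version where available.
from typing_extensions import TypeAlias

# Hypothetical alias for illustration only.
BatchId: TypeAlias = str
```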
Force-pushed from 156a4d8 to 2843267 (Compare)
Important
Looks good to me! 👍
Reviewed 156a4d8 in 2 minutes and 44 seconds.
- Reviewed 1260 lines of code in 11 files
- Skipped 0 files when reviewing
- Skipped posting 7 draft comments; view those below
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. reproduce_1829.py:90
- Draft comment:
Consider moving the import of 'anthropic' to the top-level. Currently, 'anthropic' is used in the function before it's imported in the main block, which may lead to a NameError in some contexts. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 85%. This is a reproduction script meant to be run directly (as evidenced by the `if __name__ == "__main__"` pattern and the docstring). The comment is technically correct - there is a potential NameError if the function is called outside the main block. However, reproduction scripts are typically meant to be run as standalone scripts, not imported as modules. The rules say "Do NOT comment unless there is clearly a code change required" and to avoid comments on things that are "obvious or unimportant." For a reproduction script, this import pattern is common and acceptable. The comment is also somewhat speculative ("may lead to a NameError in some contexts") rather than definitively identifying a problem in the intended use case. While the comment is technically correct about the potential NameError, reproduction scripts are typically run as standalone files where the `if __name__ == "__main__"` block executes. The author likely structured it this way intentionally for a reproduction script. This might be an overly pedantic comment for this type of file. Even though this is a reproduction script, having the import at the top level would be better practice and wouldn't hurt anything. However, the comment uses hedging language ("may lead to") and mentions "some contexts" which makes it speculative. For the intended use case (running as a script), this works fine and would be caught immediately if there were an issue. This comment is technically correct but is about a non-issue for a reproduction script that's meant to be run directly. The speculative nature ("may lead to... in some contexts") and the fact that this is a reproduction script (not production code) make this comment not particularly useful. It would be caught immediately if run incorrectly.
2. tests/llm/test_anthropic/test_reasoning.py:29
- Draft comment:
Add an assertion error message for clarity on failure and consider using a tolerance (e.g. math.isclose) for float comparisons. - Reason this comment was not posted:
Confidence changes required: 80% <= threshold 85%. None
3. SOLUTION_1303_1305.md:1
- Draft comment:
If this markdown file is intended as documentation, ensure that it is added to the mkdocs.yml configuration to keep documentation up-to-date. - Reason this comment was not posted:
Confidence changes required: 80% <= threshold 85%. None
4. examples/syptom-tracker/README.md:1
- Draft comment:
The folder is named 'syptom-tracker' but the content refers to 'Symptom Tracker'. Confirm whether this is a typo; if so, consider renaming the folder to 'symptom-tracker' for consistency. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 85%. This comment is asking the author to "confirm" something and "consider" renaming, which violates the rule "Do NOT ask the PR author to confirm their intention, to explain, to double-check things, to ensure the behavior is intended." The comment is phrased as a question/request for confirmation rather than a clear statement that a change is needed. However, there IS a clear typo in the folder name ('syptom' vs 'symptom'), which is objectively wrong and would require a file/folder rename to fix. This is a legitimate issue, but the comment violates the rules by asking for confirmation rather than stating the issue directly. Additionally, the rules state "Do NOT comment on anything that would be obviously caught by the build, such as variable renames, file renames" - though this is about suggesting a rename, not catching one that already happened. The folder name typo is a real issue that should be fixed, and it's not something that would be caught by the build since it's just a folder name. The comment identifies a legitimate problem even if it's phrased as a question. Maybe the phrasing issue isn't severe enough to warrant deletion if the underlying issue is valid. While the typo is real, the comment explicitly asks the author to "confirm" something, which directly violates the rules. The comment should either state definitively that the folder should be renamed or not be made at all. The tentative phrasing ("Confirm whether this is a typo; if so, consider...") makes it speculative and asks for confirmation, both of which are against the rules. The comment should be deleted because it asks the author to confirm their intention and uses tentative language ("consider") rather than making a clear, actionable statement. Even though there is a real typo, the comment violates the rule against asking for confirmation.
5. examples/syptom-tracker/extract_from_transcript.py:47
- Draft comment:
Typographical note: The model name 'gpt-4o' appears here. Is this a typo for 'gpt-4'? - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 85%. 'gpt-4o' is indeed a real OpenAI model (GPT-4 Omni), released by OpenAI. This is not a typo. The comment is incorrect because it assumes 'gpt-4o' is a mistake when it's actually a valid and intentional model choice. The consistent use of 'gpt-4o' across all three function calls in the file further suggests this is intentional. This comment should be deleted as it's factually incorrect and would mislead the PR author. Could 'gpt-4o' be a custom or internal model name that I'm not aware of? Perhaps in some contexts or older versions of the OpenAI API, this wasn't a valid model name, and the comment might have been correct at the time it was made. While it's theoretically possible this could be a custom model, 'gpt-4o' is a well-known OpenAI model (GPT-4 Omni) that has been publicly available. The consistent usage across the file and the fact that this is example code using the OpenAI API strongly indicates this is the legitimate model name. The comment is incorrect. This comment should be deleted. 'gpt-4o' is a legitimate OpenAI model name (GPT-4 Omni), not a typo. The comment is factually incorrect and would confuse the PR author by suggesting they fix something that isn't broken.
6. examples/syptom-tracker/extract_from_transcript.py:88
- Draft comment:
Typographical note: The model name 'gpt-4o' is used again. Please verify if it should be 'gpt-4'. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 85%. This comment is asking the author to "verify" if the model name is correct. This directly violates the rule "Do NOT ask the PR author to confirm their intention, to explain, to double-check things, to ensure the behavior is intended". Additionally, 'gpt-4o' is a legitimate model name (GPT-4 Omni) from OpenAI, so this isn't even identifying a real issue. The author consistently used 'gpt-4o' three times in the file, which suggests it's intentional. This comment should be deleted. Could 'gpt-4o' be a typo that the author made consistently? Perhaps the automated tool has knowledge that 'gpt-4' is more commonly used or appropriate for this use case? Even if 'gpt-4o' were a typo, the comment is phrased as "please verify", which explicitly asks the author to confirm their intention - a clear violation of the rules. Additionally, 'gpt-4o' is a real, valid model name, so there's no evidence this is actually wrong. The consistent usage across the file suggests intentionality. This comment should be deleted because it asks the author to verify/confirm their intention, which violates the review rules. Additionally, 'gpt-4o' is a valid OpenAI model name, so there's no evidence of an actual issue.
7. examples/syptom-tracker/extract_from_transcript.py:126
- Draft comment:
Typographical note: The model name 'gpt-4o' appears here as well. Consider confirming the correct model name. - Reason this comment was not posted:
Comment did not seem useful. Confidence it is useful = 0% <= threshold 85%. This comment is asking the PR author to confirm the correct model name, which violates the rule against asking for confirmation or verification. It does not provide a specific suggestion or improvement to the code itself.
Workflow ID: wflow_QKKyPvjucV7WVRrd
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Important
Looks good to me! 👍
Reviewed 2843267 in 4 minutes and 26 seconds.
- Reviewed 90 lines of code in 4 files
- Skipped 0 files when reviewing
- Skipped posting 4 draft comments; view those below
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. instructor/batch/models.py:9
- Draft comment:
Prefer importing TypeAlias from typing_extensions to support older Python versions. - Reason this comment was not posted:
Confidence changes required: 0% <= threshold 85%. None
2. instructor/providers/bedrock/client.py:14
- Draft comment:
Updating the client type hint to BaseClient improves generality and type safety over boto3.client. - Reason this comment was not posted:
Confidence changes required: 0% <= threshold 85%. None
3. instructor/providers/bedrock/utils.py:189
- Draft comment:
Type ignores added for requests exceptions; consider checking if updated stubs or explicit imports can remove these ignores in future. - Reason this comment was not posted:
Confidence changes required: 33% <= threshold 85%. None
4. tests/llm/test_anthropic/test_reasoning.py:29
- Draft comment:
Consider adding explicit error messages to the assertions to improve debugging when tests fail. - Reason this comment was not posted:
Confidence changes required: 33% <= threshold 85%. None
Workflow ID: wflow_hLgEMCHKl1KSw7nY
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Important
Migrate CI workflows to use `uv` for dependency management and testing, refactor code for async handling and error improvements, and update tests accordingly.

- `poetry` to `uv` for dependency management and test execution in `.github/workflows/evals.yml`, `pyright.yml`, and `scheduled-release.yml`.
- `uv` in `evals.yml` and `pyright.yml`.
- `typing_extensions.TypeAlias` in `models.py`.
- `from_streaming_response_async()` in `iterable.py` and `partial.py` to yield items directly.
- `multimodal.py` using magic bytes.
- `cast` and `AsyncGenerator` imports in `response.py`.
- `utils.py` and `client.py`.
- `requests` exceptions in `utils.py`.
- `pytest.mark.skipif` for tests requiring `OPENAI_API_KEY` in `test_partial.py`.
- `test_reasoning.py` to use deterministic results with `temperature=0`.
- `test_json_schema.py` and `test_reasoning.py`.
- `test_genai.py` to handle decimal parsing for `Receipt` and `Invoice` models.
- `update_genai_kwargs` in `test_utils.py`.
- `test_auto_client.py` for provider availability.

This description was created by
for 2843267. You can customize this summary. It will automatically update as commits are pushed.