feat(py): Add type safety to Python SDK #4309 #4310
Add TypeVar generics (InputT, OutputT, ChunkT) to the Action class for improved type inference in flows and tools.
- Action[InputT, OutputT, ChunkT] now properly types inputs/outputs
- FlowWrapper preserves callable signature for correct return types
- Uses typing_extensions for Python 3.10+ compatibility
- Adds CI type checking with pyright, mypy, and ty

This enables IDE autocomplete and type checking for:
- Flow return types: result = await my_flow() -> typed
- Tool return types: result = await my_tool() -> typed
- Streaming chunks: async for chunk in stream -> typed
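For reviewers who want to see the idea in isolation, here is a minimal sketch of a class that is generic over input, output, and chunk types. The `Action` below is a hypothetical, heavily simplified stand-in written for illustration; it is not the SDK's actual class, and `shout` is an invented example function.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import Generic, TypeVar

InputT = TypeVar('InputT')
OutputT = TypeVar('OutputT')
ChunkT = TypeVar('ChunkT')


class Action(Generic[InputT, OutputT, ChunkT]):
    """Hypothetical, simplified stand-in for a generic action wrapper."""

    def __init__(self, name: str, fn: Callable[[InputT], Awaitable[OutputT]]) -> None:
        self.name = name
        self._fn = fn

    async def __call__(self, value: InputT) -> OutputT:
        # The declared return type follows the wrapped function's return type.
        return await self._fn(value)


async def shout(text: str) -> str:
    """Invented example function: upper-cases its input."""
    return text.upper()


# ChunkT is unused in this sketch, so it is pinned to None here.
shout_action: Action[str, str, None] = Action('shout', shout)

# A type checker sees `result` as str rather than Any.
result = asyncio.run(shout_action('hello'))
```

Because the type parameters are bound from the wrapped callable, passing a non-`str` argument to `shout_action` would be flagged by a type checker before runtime.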
Improve public API by consolidating exports:
- genkit/__init__.py: Export Genkit, GenkitError, Message, Part, etc. Users can now: from genkit import Genkit, Message, Part
- genkit/ai/__init__.py: Add explicit __all__ with Genkit properly exported
- genkit/types/__init__.py:
  - Remove internal types (ActionRunContext, *Wrapper, Constrained)
  - Add ToolInterruptError for user error handling
  - Organize exports by category (Message, Document, Generation, etc.)

Aligns Python API surface with JS/Go patterns for better DX.
Update internal imports to use specific module paths instead of re-export modules, satisfying basedpyright's reportPrivateImportUsage:
- Channel, ensure_async: from genkit.aio.* internal modules
- find_free_port_sync: from genkit.web.manager._ports
- GenkitSpan, init_telemetry_server_exporter: from genkit.core.trace.*
- FormatDef, Formatter: from genkit.blocks.formats.types

No behavior change - purely import path updates for stricter type checking.
- Add @override decorator to 35 methods that override parent classes (formats, trace exporters, session stores, web adapters)
- Add _ = to ~50 function calls where return values are intentionally ignored (satisfies basedpyright reportUnusedCallResult)

This improves type safety by:
- Making method overrides explicit (catches typos and broken inheritance)
- Documenting intentionally ignored return values
This commit adds comprehensive type safety improvements:
1. Output[T] class for type-safe output configuration:
   - `response = await ai.generate(output=Output(schema=Recipe))`
   - `response.output` is now typed as `Recipe`
2. GenerateResponseWrapper[T] generic:
   - The response wrapper is now generic over the output type
   - Full end-to-end type inference from Output[T] to response.output
3. Fixed reportUnannotatedClassAttribute warnings (196 fixes):
   - Added type annotations to all class instance attributes
   - Fixed schema generator to produce ClassVar[ConfigDict] annotations
4. Fixed reportMissingTypeArgument warnings (59 fixes):
   - Added type arguments to Formatter, Channel, Callable, etc.
   - Added type arguments to PromptFunction, PromptMetadata
   - Added type arguments to RetrieverFn, IndexerFn, RerankerFn, etc.
5. Export improvements:
   - Exported GenerateResponseWrapper from genkit package
   - Users can now type hint with GenerateResponseWrapper[T]

Total warnings fixed: ~340 across 40+ files
1. Fix schema generator to use Field(default=None) instead of Field(None):
   - Pyright doesn't recognize Field(None) as providing a default value
   - Changed 70+ occurrences in auto-generated typing.py
   - Also handles Field(None, alias=...) pattern
2. Fix ParamSpec issues in tool decorator (_registry.py):
   - Added pyright: ignore comments for dynamic dispatch code
   - ParamSpec can't be statically verified with runtime arg inspection
3. Fix callable check in prompt.py:
   - Added callable(factory) guard before calling dynamic factory

Total reportCallIssue fixes: 39 → 0
1. tracing.py: Fixed actual bug where `span` could be unbound
   - Moved GenkitSpan creation before try block
   - Previously would crash if GenkitSpan() threw in except handler
2. _info.py: Fixed optional psutil import pattern
   - Changed from HAS_PSUTIL flag to `psutil = None` pattern
   - Pyright can now track the None check for type narrowing
3. typing.py: Fixed optional litestar/starlette imports
   - Changed from HAVE_* flags to `module = None` pattern
   - Pyright can now verify conditional type aliases

Total reportPossiblyUnboundVariable fixes: 38 → 0
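The two patterns in this commit can be sketched roughly as follows. `Span`, `run_traced`, `cpu_count_or_default`, and `recorded_errors` are hypothetical names made up for this illustration (the real code uses GenkitSpan and lives in tracing.py and _info.py); only `psutil.cpu_count()` is a real library call.

```python
from collections.abc import Callable
from typing import TypeVar

T = TypeVar('T')

# Pattern 1: optional import bound to None instead of a HAS_PSUTIL flag.
try:
    import psutil  # optional dependency
except ImportError:
    psutil = None  # a checker can narrow on `psutil is None`


def cpu_count_or_default(default: int = 1) -> int:
    """Return the logical CPU count when psutil is installed, else `default`."""
    if psutil is None:
        return default
    return psutil.cpu_count() or default


# Pattern 2: create the span before `try` so it is always bound in `except`.
recorded_errors: list[tuple[str, str]] = []


class Span:
    """Hypothetical stand-in for a tracing span."""

    def __init__(self, name: str) -> None:
        self.name = name

    def record_error(self, exc: BaseException) -> None:
        recorded_errors.append((self.name, type(exc).__name__))


def run_traced(name: str, fn: Callable[[], T]) -> T:
    span = Span(name)  # outside the try block: never unbound below
    try:
        return fn()
    except Exception as exc:
        span.record_error(exc)
        raise


try:
    run_traced('boom', lambda: 1 / 0)
except ZeroDivisionError:
    pass  # the error was recorded on the span and then re-raised
```

With a boolean flag, the checker cannot connect the flag to whether the module name is bound; comparing the module name itself against `None` gives it something to narrow on.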
Fixed 25 reportUnusedParameter warnings by prefixing unused parameters with `_` to indicate they are intentionally unused.

Files modified:
- _registry.py: kwargs in flow wrappers
- generate.py: preamble, raw_request, model, registry
- prompt.py: dir parameter
- retriever.py: ctx in wrapper functions
- _action.py: telemetry_labels, input_spec
- _util.py: chunk in noop callback
- flows.py: request in health_check
- reflection.py: encoding, request params, action_input
- testing.py: ctx in model_fn
- _ports.py: host parameter
- signals.py: frame in signal handler
Added super().__init__() calls (3 fixes):
- GenerationResponseError: pass message to Exception base
- ToolInterruptError: call Exception.__init__
- RedactedSpan: call ReadableSpan.__init__

Suppressed reportUnreachable for intentional code (13 fixes):
- Python 3.10 compatibility branches (sys.version_info < 3.11)
- Defensive null checks that type narrowing makes unreachable
- Exhaustive match/isinstance patterns with fallback branches
Fixed 43 warnings across 7 categories:

reportImplicitStringConcatenation (3):
- Added explicit '+' for multi-line f-string concatenation

reportInvalidCast (3):
- Used cast(object, x) as intermediary for MatchableAction casts

reportUnsupportedDunderAll (6):
- Converted __name__ to literal strings in __all__ exports

reportUnnecessaryIsInstance (6):
- Suppressed defensive runtime type checks

reportUnnecessaryComparison (6):
- Suppressed defensive null checks that type narrowing makes unnecessary

reportPrivateUsage (13):
- Suppressed internal access to _private members within SDK code

reportGeneralTypeIssues (6):
- Fixed dict unpacking with proper isinstance checks
- Suppressed complex TypeVar issues in FlowWrapper
Phase 3a: Create typed Logger protocol wrapper for structlog
- Added genkit.core.logging module with Logger protocol and get_logger()
- Updated 17 files to use typed logger instead of structlog.get_logger()
- Export Logger and get_logger from genkit.core
- Eliminates ~100 reportAny warnings from structlog's dynamic methods

Phase 3b: Add typed action lookup methods to Registry
- Added resolve_retriever(), resolve_embedder(), resolve_reranker(), resolve_model(), resolve_evaluator() methods with proper type casts
- Updated callers in _aio.py, generate.py, reranker.py to use typed lookups
- Eliminates ~10 reportAny warnings from dynamic registry lookups

Also includes:
- Design docs for Phase 3: phase3-typed-internals.md
- Implementation tasks: phase3-typed-internals-tasks.md
- Updated mock registry in embedding_test.py for new method

Total reduction: ~110 reportAny warnings eliminated
- Logger protocol: Use `object` for **kwargs and `None` return type instead of `Any` - eliminates 35+ warnings
- Loop utilities: Make run_async, iter_over_async, run_loop generic with TypeVar instead of Any - eliminates 11 warnings

These changes improve type safety while maintaining compatibility with structlog and asyncio patterns.
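A rough sketch of both ideas, under stated assumptions: the `Logger` protocol below only mirrors the shape described above (`object` for `**kwargs`, `None` return), the backend is the stdlib `logging` module rather than structlog so the example has no third-party dependency, and `run_blocking` is a hypothetical helper that is not the SDK's `run_async`.

```python
import asyncio
import logging
from collections.abc import Callable, Coroutine
from typing import Any, Protocol, TypeVar

T = TypeVar('T')


class Logger(Protocol):
    """Typed surface for a logger whose backend methods are dynamic."""

    def info(self, event: str, **kwargs: object) -> None: ...
    def error(self, event: str, **kwargs: object) -> None: ...


class _StdlibLogger:
    """Assumed backend for this sketch; the PR wraps structlog instead."""

    def __init__(self, name: str) -> None:
        self._log = logging.getLogger(name)

    def _emit(self, level: int, event: str, fields: dict[str, object]) -> None:
        suffix = ' '.join(f'{key}={value!r}' for key, value in fields.items())
        self._log.log(level, '%s %s', event, suffix)

    def info(self, event: str, **kwargs: object) -> None:
        self._emit(logging.INFO, event, kwargs)

    def error(self, event: str, **kwargs: object) -> None:
        self._emit(logging.ERROR, event, kwargs)


def get_logger(name: str) -> Logger:
    # Callers only ever see the Protocol, never the dynamic backend.
    return _StdlibLogger(name)


def run_blocking(fn: Callable[[], Coroutine[Any, Any, T]]) -> T:
    """Hypothetical generic helper: the result type follows the coroutine's."""
    return asyncio.run(fn())


async def answer() -> int:
    return 42


logger = get_logger('demo')
value = run_blocking(answer)  # inferred as int, not Any
logger.info('flow finished', result=value)
```

Typing `**kwargs` as `object` still accepts any value at the call site but stops `Any` from leaking out of the logger into surrounding code.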
- Use typed logger (get_logger) instead of structlog.get_logger
- Fix ActionRunContext to be Optional and add None checks
- Add type arguments to bare dict return types
- Prefix unused parameters with underscore
- Fix implicit string concatenation
- Add pyright ignore for Python version compatibility check

Reduces from 6 errors + 30 warnings to 0 errors + 14 warnings. Remaining warnings are from namespace package resolution for plugins.
Common fixes applied:
- Change `ctx: ActionRunContext = None` to `ctx: ActionRunContext | None = None`
- Add null checks before accessing ctx.is_streaming and ctx.send_chunk
- Add type arguments to bare `dict` and `list` return types
- Prefix unused parameters with underscore
- Fix relative imports in evaluator-demo
- Use typed logger (get_logger) in chat-demo
- Fix ActionRunContext import path in anthropic-hello

Reduces total errors across samples from 48+ to ~24. Remaining errors are complex type issues (method overrides, Streamlit types, etc.) that need deeper investigation.
Shows the Output[T] pattern for getting typed responses from ai.generate():
response = await ai.generate(
prompt='...',
output=Output(schema=Recipe), # The magic!
)
response.output # Typed as Recipe, not Any!
This enables full IDE autocomplete on response.output fields.
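How the schema type reaches `response.output` can be sketched without the SDK. Everything below is an assumption-laden simplification: `Output` and `GenerateResponseWrapper` are stripped-down lookalikes, `fake_generate` is a hypothetical stand-in for `ai.generate()`, and a dataclass replaces the Pydantic model to keep the example dependency-free.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar('T')


class Output(Generic[T]):
    """Simplified lookalike: holds the schema class and captures its type."""

    def __init__(self, schema: type[T]) -> None:
        self.schema = schema


class GenerateResponseWrapper(Generic[T]):
    """Simplified lookalike: `output` is typed as T."""

    def __init__(self, output: T) -> None:
        self.output = output


def fake_generate(raw: dict[str, object], output: Output[T]) -> GenerateResponseWrapper[T]:
    """Hypothetical stand-in for ai.generate(): builds T from raw model data."""
    return GenerateResponseWrapper(output.schema(**raw))


@dataclass
class Recipe:
    name: str
    ingredients: list[str]


response = fake_generate(
    {'name': 'Pizza', 'ingredients': ['dough', 'cheese']},
    Output(schema=Recipe),
)
# A type checker infers GenerateResponseWrapper[Recipe] here.
recipe_name = response.output.name
```

The key step is `schema: type[T]`: passing the class object binds `T`, and the same `T` appears in the return annotation.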
- Replace structlog.get_logger with genkit.core.logging.get_logger in all 18 samples for proper type hints
- Fix ctx null checks in compat-oai-hello
- Make pyrightconfig.json portable (relative venvPath)
- Add reportMissingTypeStubs: false to suppress harmless warnings
… instead
BREAKING CHANGE: The `output_schema` parameter has been removed from
`ai.generate()` and `ai.generate_stream()`. Use `output=Output(schema=YourSchema)`
instead, which provides full type inference on `response.output`.
Before:
response = await ai.generate(prompt='...', output_schema=Recipe)
result = cast(Recipe, response.output) # Manual cast needed
After:
response = await ai.generate(prompt='...', output=Output(schema=Recipe))
result = response.output # Typed as Recipe automatically!
This aligns with the JS SDK which uses `output: { schema: ... }`.
Updated all samples to use the new pattern.
Channel[T] is now Channel[T, R] where:
- T = type of items streamed through the channel
- R = type of the close future result

This fixes the type mismatch where streaming chunks (GenerateResponseChunkWrapper) and the final response (GenerateResponseWrapper) are different types.
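To make the two type parameters concrete, here is a minimal asyncio sketch. This `Channel` is a hypothetical toy written for this comment, not the implementation in aio/channel.py; its method names (`send`, `close`, `closed`) are assumptions.

```python
import asyncio
from collections.abc import AsyncIterator
from typing import Generic, TypeVar

T = TypeVar('T')  # type of streamed items
R = TypeVar('R')  # type of the final result

_DONE = object()  # sentinel marking the end of the stream


class Channel(Generic[T, R]):
    """Toy channel: streams T items, then resolves a future with an R."""

    def __init__(self) -> None:
        # Must be constructed while an event loop is running.
        self._queue: asyncio.Queue[object] = asyncio.Queue()
        self.closed: asyncio.Future[R] = asyncio.get_running_loop().create_future()

    def send(self, item: T) -> None:
        self._queue.put_nowait(item)

    def close(self, result: R) -> None:
        self.closed.set_result(result)
        self._queue.put_nowait(_DONE)

    async def __aiter__(self) -> AsyncIterator[T]:
        while True:
            item = await self._queue.get()
            if item is _DONE:
                return
            yield item  # type: ignore[misc]  # queue is typed loosely here


async def demo() -> tuple[list[str], int]:
    channel: Channel[str, int] = Channel()
    for word in ('a', 'b'):
        channel.send(word)
    channel.close(2)  # the final result has a different type than the chunks
    received = [chunk async for chunk in channel]
    total = await channel.closed
    return received, total


chunks, total = asyncio.run(demo())
```

With a single type parameter, `chunks` and `total` would have to share one type, which is the mismatch the commit describes.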
Key files fixed:
- aio/channel.py: Fixed unbound 'pending' variable in timeout handler, fixed set_exception type narrowing with walrus operator
- ai/_aio.py: Added pyright ignores for list invariance (Document vs DocumentData)
- blocks/generate.py: Added explicit type params to Action[Any, Any, Any]
- blocks/model.py: Fixed message override with ignore, added None check
- blocks/prompt.py: Fixed PromptMetadata dict typing, added ignores for dynamic Action attributes (_executable_prompt, _async_factory)
- core/action/_action.py: Fixed telemetry_labels parameter name, added Channel type params, fixed stream callback type
- core/flows.py: Fixed 'eerror' typo to 'aerror', added return type ignore
- session/chat.py: Suppressed import cycle warning (TYPE_CHECKING guarded)

Reduced errors from 50+ to 0 in the 10 key genkit files.
Changes:
- Remove unused import (EmbedResponse)
- Fix unused call result (task.cancel())
- Fix import locations (Action, ActionKind)
- Remove unnecessary casts and isinstance checks
- Add pyright config to suppress intentional Any usage:
  - reportExplicitAny, reportAny (intentional dynamic typing)
  - reportUnknown* (external library types)

Result: 0 errors, 0 warnings on the 8 key genkit files.
ExecutablePrompt is now generic: ExecutablePrompt[OutputT]
When defining a prompt with output=Output(schema=T), the returned
prompt is typed as ExecutablePrompt[T], and all calls return
GenerateResponseWrapper[T] with typed .output property.
Example:
```python
class Recipe(BaseModel):
    name: str
    ingredients: list[str]


recipe_prompt = ai.define_prompt(
    name='recipe',
    prompt='Create a recipe for {food}',
    output=Output(schema=Recipe),  # Type captured here
)

response = await recipe_prompt({'food': 'pizza'})
response.output.name  # ✓ Typed as str, autocomplete works!
```
Changes:
- Make ExecutablePrompt[OutputT] generic
- Make GenerateStreamResponse[OutputT] generic
- Add overloads to define_prompt() for type inference
- Add overloads to GenkitRegistry.define_prompt()
- Add typing tests for ExecutablePrompt
This matches the JS SDK pattern where the output type is captured
at prompt definition time.
Add comprehensive examples showing all Output fields with define_prompt:
- Basic usage with just schema
- Full usage with format, content_type, instructions, constrained
- Streaming example
- Type checking demo

Files:
- typing-manual-test/main.py: added full Output fields example
- typing-evaluation/src/typed_prompt_example.py: new comprehensive example
ExecutablePrompt is now ExecutablePrompt[InputT, OutputT], matching JS SDK.
When defining a prompt with both input=Input(schema=I) and output=Output(schema=O),
the returned prompt is typed as ExecutablePrompt[I, O]:
- Input is type-checked when calling the prompt
- Output is typed on response.output
Example:
```python
class RecipeInput(BaseModel):
    dish: str
    servings: int


class Recipe(BaseModel):
    name: str
    ingredients: list[str]


recipe_prompt = ai.define_prompt(
    name='recipe',
    prompt='Create a recipe for {dish} serving {servings}',
    input=Input(schema=RecipeInput),  # ← Input typed!
    output=Output(schema=Recipe),  # ← Output typed!
)

# Input type-checked:
response = await recipe_prompt(RecipeInput(dish='pizza', servings=4))
# response = await recipe_prompt({'dish': 'pizza'})  # ❌ Type error!
response.output.name  # ✓ Typed as str
```
Changes:
- Add Input[T] class in _aio.py
- Update ExecutablePrompt to Generic[InputT, OutputT]
- Update __call__ and stream to accept InputT
- Add 4 overloads to define_prompt for all input/output combinations
- Convert Pydantic models to dicts for template rendering
- Export Input from genkit.ai
- Update typed_prompt_example.py with full examples
JS/Python parity table:
| Feature | JS | Python |
|---------------------|-----|--------|
| Prompt typed input | ✅ | ✅ |
| Prompt typed output | ✅ | ✅ |
| generate output | ✅ | ✅ |
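The "4 overloads" item in the change list can be pictured with the sketch below. It is an illustration under assumptions, not the SDK's code: `Input`, `Output`, `ExecutablePrompt`, and this free-standing `define_prompt` are simplified lookalikes, the attribute names `input_schema` and `output_schema` are invented, and dataclasses stand in for Pydantic models.

```python
from dataclasses import dataclass
from typing import Any, Generic, Optional, TypeVar, overload

InputT = TypeVar('InputT')
OutputT = TypeVar('OutputT')


class Input(Generic[InputT]):
    def __init__(self, schema: type[InputT]) -> None:
        self.schema = schema


class Output(Generic[OutputT]):
    def __init__(self, schema: type[OutputT]) -> None:
        self.schema = schema


class ExecutablePrompt(Generic[InputT, OutputT]):
    """Simplified lookalike that only remembers its schemas."""

    def __init__(self, name: str, input_schema: Any, output_schema: Any) -> None:
        self.name = name
        self.input_schema = input_schema
        self.output_schema = output_schema


# One overload per input/output combination.
@overload
def define_prompt(name: str, *, input: Input[InputT], output: Output[OutputT]) -> ExecutablePrompt[InputT, OutputT]: ...
@overload
def define_prompt(name: str, *, input: Input[InputT], output: None = None) -> ExecutablePrompt[InputT, Any]: ...
@overload
def define_prompt(name: str, *, input: None = None, output: Output[OutputT]) -> ExecutablePrompt[Any, OutputT]: ...
@overload
def define_prompt(name: str, *, input: None = None, output: None = None) -> ExecutablePrompt[Any, Any]: ...


def define_prompt(
    name: str,
    *,
    input: Optional[Input[Any]] = None,
    output: Optional[Output[Any]] = None,
) -> ExecutablePrompt[Any, Any]:
    # Single runtime implementation shared by all four overloads.
    return ExecutablePrompt(
        name,
        input.schema if input is not None else None,
        output.schema if output is not None else None,
    )


@dataclass
class RecipeInput:
    dish: str
    servings: int


@dataclass
class Recipe:
    name: str
    ingredients: list[str]


# Inferred as ExecutablePrompt[RecipeInput, Recipe] by a type checker.
typed = define_prompt('recipe', input=Input(schema=RecipeInput), output=Output(schema=Recipe))
# Inferred as ExecutablePrompt[Any, Any].
untyped = define_prompt('plain')
```

Overloads are needed because a single signature with optional parameters cannot express "this type variable is bound only when the argument is present".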
- Add generate_stream() overloads for typed Output[T]
- Update docs and tests to use Output() instead of output_schema
- Add streaming output type verification test
- Fix evaluators plugin to use new Output API
- Remove internal design docs and test samples
Summary of Changes
This pull request significantly upgrades the Python SDK's type safety, providing a more robust and developer-friendly experience. By introducing generic …
Code Review
This is an excellent and extensive pull request that brings comprehensive type safety to the Python SDK. The introduction of generic types for core components like Action, ExecutablePrompt, and FlowWrapper, along with the new Input[T] and Output[T] classes, is a significant improvement for developer experience and code correctness. The changes are consistently applied across the codebase, including updates to samples and the addition of typing verification tests. The new typed logger and pyright configuration are also great additions. I have one minor suggestion for code clarity.
- Fix streaming tests to use .response property (ActionResponse change)
- Fix RedactedSpan by removing incorrect super().__init__() call
- Fix Channel TypeVar default for backward compatibility
- Fix Channel timeout to not cancel external close_future
- Add per-file-ignores for typing tests in pyproject.toml
- Fix missing docstring args in _action.py and _util.py
- Fix imports and formatting to pass lint checks
Consolidates version-specific imports (StrEnum, override) into a single
compatibility module to eliminate code duplication and improve maintainability.
Changes:
- Created genkit/core/_compat.py with centralized version checks
- Refactored StrEnum imports (was duplicated in 4 files)
- Refactored override decorator imports (was duplicated in 12 files)
- Updated schema generator to use _compat module
- All compatibility logic now in one place for easier updates
Summary
This PR adds full type safety to the Python SDK to match what we have in JS. The core goal: when you call `generate()`, `define_prompt()`, or use a flow, the return types should be known at dev time so your IDE can autocomplete and catch errors before runtime.
What was broken
Type information was getting lost at key boundaries:
What this PR does
Generic Action class
Made `Action` generic so types flow through:
Typed Output[T] for generate()
Typed generate_stream()
Same pattern works for streaming:
Typed prompts with Input[T] and Output[T]
Prompts (including dotprompt files) get full type safety:
Typed flows
Flows preserve types through the decorator:
Breaking change
`output_schema` removed from `generate()`. Use `output=Output(schema=...)` instead:
Other changes
- `Any` usage across codebase
- `@override` decorators where needed
Testing
- `tests/typing/` - all pass