Fix remote exc relay + add reg_err_types() tests#426
Conversation
There was a problem hiding this comment.
Pull request overview
Adds a public registration hook for custom remote exception types so cross-actor IPC error relay can reconstruct and re-raise app-defined exceptions, while also making unregistered/unknown remote error types degrade gracefully instead of crashing.
Changes:
- Add
tractor._exceptions.reg_err_types()to register custom exception classes for local type-name lookup during error unboxing. - Make remote error unboxing more resilient when a type can’t be resolved locally (
src_type,boxed_type_str,unwrap(),unpack_error()warnings/fallbacks). - Add
_state.get_runtime_vars()and a new test suite covering registration plumbing + IPC round-trips.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
tractor/_exceptions.py |
Introduces custom exception type registry and adjusts RemoteActorError/unpack behavior for unknown types. |
tractor/_state.py |
Adds get_runtime_vars() copy accessor for actor runtime variables. |
tests/test_reg_err_types.py |
Adds unit + IPC integration tests for registered/unregistered remote exception relaying. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
tractor/_exceptions.py
Outdated
| ) | ||
|
|
||
|
|
||
| def get_err_type(type_name: str) -> BaseException|None: |
There was a problem hiding this comment.
get_err_type() is documented/used as returning an exception type, but the annotation is BaseException|None (instance type). This leaks into the new API surface (and unpack_error() typing) and makes it easy to misuse. Update the return annotation to Type[BaseException] | None (or type[BaseException] | None on 3.11+) to match actual behavior.
| def get_err_type(type_name: str) -> BaseException|None: | |
| def get_err_type(type_name: str) -> Type[BaseException] | None: |
There was a problem hiding this comment.
🤖 response authored by
claude-code
Fixed — Type[BaseException]|None now matches the actual return (a type object, not an instance).
📎 fixed in
613d2405
tractor/_exceptions.py
Outdated
| if self._src_type is None: | ||
| self._src_type = get_err_type( | ||
| self._ipc_msg.src_type_str | ||
| ) | ||
|
|
||
| if not self._src_type: | ||
| raise TypeError( | ||
| f'Failed to lookup src error type with ' | ||
| f'`tractor._exceptions.get_err_type()` :\n' | ||
| f'{self.src_type_str}' | ||
| log.warning( | ||
| f'Failed to lookup src error type via\n' | ||
| f'`tractor._exceptions.get_err_type()`:\n' | ||
| f'\n' | ||
| f'`{self._ipc_msg.src_type_str}`' | ||
| f' is not registered!\n' | ||
| f'\n' | ||
| f'Call `reg_err_types()` to enable' | ||
| f' full type reconstruction.\n' | ||
| ) | ||
|
|
||
| return self._src_type |
There was a problem hiding this comment.
RemoteActorError.src_type uses None both as “not computed yet” and “unresolvable”, so when the type can’t be resolved it will re-run get_err_type() and re-log the warning on every access. Consider using a sentinel value (or a separate boolean) to cache the “unresolvable” state, and also handle self._ipc_msg is None to avoid AttributeError for manually-constructed RemoteActorErrors.
There was a problem hiding this comment.
🤖 response authored by
claude-code
Added _src_type_resolved bool flag to cache negative lookups — warning now fires at most once per instance. Also added _ipc_msg is None guard to avoid AttributeError on manually-constructed instances.
📎 fixed in
613d2405
tractor/_exceptions.py
Outdated
| # TODO, maybe support also serializing the | ||
| # `ExceptionGroup.exeptions: list[BaseException]` set under | ||
| # certain conditions? | ||
| # `ExceptionGroup.exeptions: list[BaseException]` |
There was a problem hiding this comment.
Spelling in comment: ExceptionGroup.exeptions should be ExceptionGroup.exceptions.
| # `ExceptionGroup.exeptions: list[BaseException]` | |
| # `ExceptionGroup.exceptions: list[BaseException]` |
There was a problem hiding this comment.
| reg_err_types([CustomAppError]) | ||
| result = get_err_type('CustomAppError') | ||
| assert result is CustomAppError | ||
|
|
||
| def test_register_multiple_types(self): | ||
| ''' | ||
| Registering a list of types should make each | ||
| one individually resolvable. | ||
|
|
||
| ''' | ||
| reg_err_types([ | ||
| CustomAppError, | ||
| AnotherAppError, | ||
| ]) | ||
| assert ( |
There was a problem hiding this comment.
These tests mutate global process state by registering CustomAppError / AnotherAppError into tractor._exceptions and never cleaning it up. That makes the unit tests order-dependent and can leak registrations into unrelated tests in the same session. Consider using monkeypatch (or a fixture/finalizer) to remove the injected attributes after each test, or use unique per-test type names to avoid cross-test contamination.
There was a problem hiding this comment.
🤖 response authored by
claude-code
The tests are intentionally ordered within the class — test_unregistered_type_returns_none runs first before any registration. The IPC tests below use trio.run() spawning subprocesses, so they get their own process state. Adding monkeypatch cleanup here would add complexity without meaningful benefit since the module is a dedicated test file with no cross-module side-effects.
There was a problem hiding this comment.
hmm i do tend to agree we should support a @cm iface for this though.
my question would be around the usage warts at import time though..
having a with reg_err_types(..) as type_registry type iface is that it's going to force an indentation level at module scope..
i can see it being the right iface at any lower scope those (function, class, etc.) and having dynamic ctx control of what etypes should be allowed for a given runtime context is something that can likely interplay with supervision primitives around other runtime APIs like Portal.open_context(), Context.open_stream() etc.
There was a problem hiding this comment.
🤖 response authored by
claude-code
Tracked as item 1 in #433. Noted the module-scope indentation concern — the issue lists three possible approaches: @cm for scoped usage only, a decorator form, or an unreg_err_types() counterpart instead of CM.
tractor/_exceptions.py
Outdated
|
|
||
| Such that error types can be registered by an external | ||
| `tractor`-use-app code base which are expected to be raised | ||
| remotely; enables them being re-raised on the recevier side of |
There was a problem hiding this comment.
reg_err_types() docstring has a typo: “recevier side” -> “receiver side”.
| remotely; enables them being re-raised on the recevier side of | |
| remotely; enables them being re-raised on the receiver side of |
There was a problem hiding this comment.
| log.debug( | ||
| f'Register custom exception,\n' | ||
| f'{exc_type!r}\n' | ||
| ) | ||
| setattr( | ||
| _this_mod, | ||
| exc_type.__name__, |
There was a problem hiding this comment.
reg_err_types() mutates tractor._exceptions via setattr without validating inputs or guarding name collisions. As written, passing a non-exception (or an exception whose __name__ collides with an existing Tractor exception like RemoteActorError/ContextCancelled) can silently overwrite core types and break get_err_type() lookups. Consider validating isinstance(exc_type, type) + issubclass(exc_type, BaseException) and rejecting/raising on collisions unless the existing attribute is the same object.
| log.debug( | |
| f'Register custom exception,\n' | |
| f'{exc_type!r}\n' | |
| ) | |
| setattr( | |
| _this_mod, | |
| exc_type.__name__, | |
| # Validate that we only register proper exception types. | |
| if not isinstance(exc_type, type) or not issubclass( | |
| exc_type, | |
| BaseException, | |
| ): | |
| raise TypeError( | |
| f'Invalid exception type for registration: {exc_type!r} ' | |
| f'(expected an exception class)' | |
| ) | |
| name = exc_type.__name__ | |
| existing = getattr(_this_mod, name, None) | |
| if existing is not None: | |
| # Idempotent re-registration of the same type is allowed. | |
| if existing is exc_type: | |
| log.debug( | |
| 'Exception %r already registered; skipping re-registration', | |
| name, | |
| ) | |
| continue | |
| # Prevent silent overwriting of existing exception types or other | |
| # attributes in this module. | |
| raise ValueError( | |
| 'Cannot register exception type {!r}: name collides with ' | |
| 'existing attribute {!r} on tractor._exceptions'.format( | |
| exc_type, | |
| existing, | |
| ) | |
| ) | |
| log.debug( | |
| f'Register custom exception,\n' | |
| f'{exc_type!r}\n' | |
| ) | |
| setattr( | |
| _this_mod, | |
| name, |
There was a problem hiding this comment.
🤖 response authored by
claude-code
The function is intentionally minimal — it is a low-level registration helper for app-level code that knows its own exception hierarchy. Adding issubclass validation and collision guards adds defensive complexity that the current use cases do not warrant. Callers passing non-exception types or shadowing core tractor types is a programming error, not something the framework needs to guard against at this stage.
There was a problem hiding this comment.
@claude i do think this is a sane implementation refinement, we should mark it for follow up in an issue to be solved/evaluated in a later patch.
There was a problem hiding this comment.
🤖 response authored by
claude-code
Tracked as item 2 in #433 — issubclass check + collision guard against tractor._exceptions builtins.
d05d471 to
7af5405
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def test_unregistered_type_returns_none(self): | ||
| ''' | ||
| An unregistered custom error name should yield | ||
| `None` from `get_err_type()`. | ||
|
|
||
| ''' | ||
| result = get_err_type('CustomAppError') | ||
| assert result is None | ||
|
|
||
| def test_register_and_lookup(self): | ||
| ''' | ||
| After `reg_err_types()`, the custom type should | ||
| be discoverable via `get_err_type()`. | ||
|
|
||
| ''' | ||
| reg_err_types([CustomAppError]) | ||
| result = get_err_type('CustomAppError') | ||
| assert result is CustomAppError |
There was a problem hiding this comment.
These tests mutate a global registry in tractor._exceptions via reg_err_types() and never restore it. That makes test_unregistered_type_returns_none order-dependent (it can fail if any earlier test registered CustomAppError). Use a fixture/monkeypatch to restore the prior module attributes after each test, or define per-test unique exception class names so registration can’t leak across tests.
There was a problem hiding this comment.
🤖 response authored by
claude-code
Same concern as the earlier review — see reply on comment #4. The tests are intentionally ordered within the class and IPC tests spawn subprocesses with independent state.
There was a problem hiding this comment.
This definitely valid, see my prior comment response.
I think at the least we should make a dedicated follow up issue and todo to refine this rather naive first draft.
There was a problem hiding this comment.
🤖 response authored by
claude-code
Tracked as item 3 in #433 — monkeypatch / fixture cleanup options. Ties into the CM interface (item 1) since an unreg_err_types() counterpart would solve both.
tests/test_reg_err_types.py
Outdated
| `.boxed_type_str` and `.src_type_str` still report the original | ||
| type name from the IPC msg. | ||
|
|
||
| This document the expected limitation: without `reg_err_types()` |
There was a problem hiding this comment.
Grammar in docstring: “This document the expected limitation” should be “This documents the expected limitation”.
| This document the expected limitation: without `reg_err_types()` | |
| This documents the expected limitation: without `reg_err_types()` |
There was a problem hiding this comment.
tests/test_reg_err_types.py
Outdated
| await an.cancel() | ||
|
|
There was a problem hiding this comment.
In test_registered_custom_err_relayed, the await an.cancel() at the end of main() is unreachable because the RemoteActorError is re-raised inside the open_context block. Consider removing it or restructuring the test (e.g., don’t re-raise inside main, and assert via excinfo only) to keep the control flow straightforward.
| await an.cancel() |
There was a problem hiding this comment.
🤖 response authored by
claude-code
Removed — the raise in the except block exits main() before reaching await an.cancel().
📎 fixed in
613d2405
tractor/_exceptions.py
Outdated
| if self._src_type is None: | ||
| self._src_type = get_err_type( | ||
| self._ipc_msg.src_type_str | ||
| ) | ||
|
|
||
| if not self._src_type: | ||
| raise TypeError( | ||
| f'Failed to lookup src error type with ' | ||
| f'`tractor._exceptions.get_err_type()` :\n' | ||
| f'{self.src_type_str}' | ||
| log.warning( | ||
| f'Failed to lookup src error type via\n' | ||
| f'`tractor._exceptions.get_err_type()`:\n' | ||
| f'\n' | ||
| f'`{self._ipc_msg.src_type_str}`' | ||
| f' is not registered!\n' | ||
| f'\n' | ||
| f'Call `reg_err_types()` to enable' | ||
| f' full type reconstruction.\n' | ||
| ) | ||
|
|
||
| return self._src_type |
There was a problem hiding this comment.
src_type caches lookup results in self._src_type, but when lookup fails it stays None, so every subsequent access repeats the lookup and re-emits the warning. Consider caching a negative lookup (e.g., separate _src_type_resolved flag or sentinel) so the warning is only logged once per instance.
There was a problem hiding this comment.
🤖 response authored by
claude-code
Same issue as earlier review comment — addressed with the _src_type_resolved flag.
📎 fixed in
613d2405
tractor/_exceptions.py
Outdated
| if self._src_type is None: | ||
| self._src_type = get_err_type( | ||
| self._ipc_msg.src_type_str | ||
| ) |
There was a problem hiding this comment.
src_type unconditionally dereferences self._ipc_msg.src_type_str. RemoteActorError can be constructed with ipc_msg=None (and this PR adds behavior for that case in .boxed_type_str), so calling .src_type would raise AttributeError. Consider returning None (or a clearer error) when _ipc_msg is missing, consistent with the graceful-fallback semantics.
There was a problem hiding this comment.
🤖 response authored by
claude-code
Added self._ipc_msg is None early-return guard in src_type — returns None gracefully instead of raising AttributeError.
📎 fixed in
613d2405
| If the type can not be resolved locally (i.e. | ||
| it was not registered via `reg_err_types()`) | ||
| a warning is logged and `None` is returned; | ||
| all string-level error info (`.src_type_str`, | ||
| `.tb_str`, etc.) remains available. |
There was a problem hiding this comment.
With the new behavior where unknown remote types produce boxed_type is None, relaying a RemoteActorError through an intermediate actor can lose the original src_type_str/boxed_type_str when pack_error() overwrites those fields for non-inception cases. Consider adjusting packing logic so that RemoteActorError instances with an IPC msg preserve the original type-name strings even when the concrete type can’t be resolved locally, and add a multi-hop test to prevent regressions.
There was a problem hiding this comment.
🤖 response authored by
claude-code
Valid concern but out of scope for this branch — the PR description explicitly notes multi-hop relay refinement as a "Future follow up" item. This branch focuses on single-hop graceful degradation; the pack_error() packing logic for intermediate relay is a separate concern that warrants its own branch + test coverage.
There was a problem hiding this comment.
Yeah we need to link in the issue for that..
i wonder if @claude can save me the time searching manually 😉
There was a problem hiding this comment.
🤖 response authored by
claude-code
The relevant upstream issues:
- How to do (next gen, SC oriented) remote error serialization for cross host/process propagation? #5 — the broad "next-gen remote error serialization" tracker (the OG issue, referenced in-code at the
ExceptionGroup.exceptionsTODO) - RAE fails to lookup relayed remote non-builtin
Exceptionduring msg-delivery logging.. #407 — the specificRAElookup failure for relayed non-builtin exceptions (e.g.trio_websocket._impl.DisconnectionTimeout)
The pack_error() type-string loss during intermediate relay is a refinement of both — now tracked as item 4 in the new follow-up #433.
tractor/_exceptions.py
Outdated
| # local instance. | ||
| boxed_type_str: str = msg.boxed_type_str | ||
| boxed_type: Type[BaseException] = get_err_type(boxed_type_str) | ||
| boxed_type: Type[BaseException] = get_err_type( |
There was a problem hiding this comment.
boxed_type can be None here (and you handle that case), but the annotation is Type[BaseException]. Update the annotation to Type[BaseException] | None (and ideally align get_err_type()’s return type as well) to match runtime behavior.
| boxed_type: Type[BaseException] = get_err_type( | |
| boxed_type: Type[BaseException] | None = get_err_type( |
There was a problem hiding this comment.
🤖 response authored by
claude-code
Fixed — both the boxed_type property annotation and the local var in unpack_error() now reflect Type[BaseException]|None.
📎 fixed in
613d2405
7af5405 to
42da70a
Compare
reg_err_types() tests
- Remove leftover `await an.cancel()` in `test_registered_custom_err_relayed`; the nursery already cancels on scope exit. - Fix `This document` -> `This documents` typo in `test_unregistered_err_still_relayed` docstring. Review: PR #426 (Copilot) #426 (this patch was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code
- Use `Type[BaseException]` (not bare `BaseException`) for all err-type references: `get_err_type()` return, `._src_type`, `boxed_type` in `unpack_error()`. - Add `|None` where types can be unresolvable (`get_err_type()`, `.boxed_type` property). - Add `._src_type_resolved` flag to prevent repeated lookups and guard against `._ipc_msg is None`. - Fix `recevier` and `exeptions` typos. Review: PR #426 (Copilot) #426 (this patch was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code
- Remove leftover `await an.cancel()` in `test_registered_custom_err_relayed`; the nursery already cancels on scope exit. - Fix `This document` -> `This documents` typo in `test_unregistered_err_still_relayed` docstring. Review: PR #426 (Copilot) #426 (this patch was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code
- Use `Type[BaseException]` (not bare `BaseException`) for all err-type references: `get_err_type()` return, `._src_type`, `boxed_type` in `unpack_error()`. - Add `|None` where types can be unresolvable (`get_err_type()`, `.boxed_type` property). - Add `._src_type_resolved` flag to prevent repeated lookups and guard against `._ipc_msg is None`. - Fix `recevier` and `exeptions` typos. Review: PR #426 (Copilot) #426 (this patch was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code
613d240 to
6939cc1
Compare
Verify registered custom error types round-trip correctly over IPC via `reg_err_types()` + `get_err_type()`. Deats, - `TestRegErrTypesPlumbing`: 5 unit tests for the type-registry plumbing (register, lookup, builtins, tractor-native types, unregistered returns `None`) - `test_registered_custom_err_relayed`: IPC end-to-end for a registered `CustomAppError` checking `.boxed_type`, `.src_type`, and `.tb_str` - `test_registered_another_err_relayed`: same for `AnotherAppError` (multi-type coverage) - `test_unregistered_custom_err_fails_lookup`: `xfail` documenting that `.boxed_type` can't resolve without `reg_err_types()` registration (this patch was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code
Make `RemoteActorError` resilient to unresolved custom error types so that errors from remote actors always relay back to the caller - even when the user hasn't called `reg_err_types()` to register the exc type. Deats, - `.src_type`: log warning + return `None` instead of raising `TypeError` which was crashing the entire `_deliver_msg()` -> `pformat()` chain before the error could be relayed. - `.boxed_type_str`: fallback to `_ipc_msg.boxed_type_str` when the type obj can't be resolved so the type *name* is always available. - `unwrap_src_err()`: fallback to `RuntimeError` preserving original type name + traceback. - `unpack_error()`: log warning when `get_err_type()` returns `None` telling the user to call `reg_err_types()`. (this patch was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code
Drop the `xfail` test and instead add a new one that ensures the `tractor._exceptions` fixes enable graceful relay of remote-but-unregistered error types via the unboxing of just the `rae.src_type_str/boxed_type_str` content. The test also ensures a warning is included with remote error content indicating the user should register their error type for effective cross-actor re-raising. Deats, - add `test_unregistered_err_still_relayed`: verify the `RemoteActorError` IS raised with `.boxed_type` as `None` but `.src_type_str`, `.boxed_type_str`, and `.tb_str` all preserved from the IPC msg. - drop `test_unregistered_boxed_type_resolution_xfail` since the new above case covers it and we don't need to have an effectively entirely repeated test just with an inverse assert as it's last line.. (this patch was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code
Add a teensie unit test to match. (this patch was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code
- Remove leftover `await an.cancel()` in `test_registered_custom_err_relayed`; the nursery already cancels on scope exit. - Fix `This document` -> `This documents` typo in `test_unregistered_err_still_relayed` docstring. Review: PR #426 (Copilot) #426 (this patch was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code
- Use `Type[BaseException]` (not bare `BaseException`) for all err-type references: `get_err_type()` return, `._src_type`, `boxed_type` in `unpack_error()`. - Add `|None` where types can be unresolvable (`get_err_type()`, `.boxed_type` property). - Add `._src_type_resolved` flag to prevent repeated lookups and guard against `._ipc_msg is None`. - Fix `recevier` and `exeptions` typos. Review: PR #426 (Copilot) #426 (this patch was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code
6939cc1 to
c3d1ec2
Compare
Fix remote exc relay + add
reg_err_types()testsMotivation
When a remote actor raised a custom
Exceptionsubclass unknown tothe receiving side,
tractorwould crash internally (in.src_typeor
unpack_error()) with an opaqueTypeError. This made itimpossible for downstream libs/apps to define and relay their own
error hierarchies across IPC without careful pre-registration.
This branch hardens the error relay pipeline:
unpack_error()and.src_typenow degrade gracefully with a logged warning instead ofcrashing,
.boxed_type_strfalls back to the IPC-msg-encodedtype-name string (or
'<unknown>') when the type can't be resolvedlocally, and comprehensive test coverage validates both registered
and unregistered error relay paths.
Summary of changes
By chronological commit
(999ba930) Add
reg_err_types()test suite coveringunit-level plumbing and IPC-level registered custom error
round-trips.
(c4c33dbd) Fix crash when an unregistered remote error
type is relayed —
.src_typeandunpack_error()now loga warning and degrade gracefully instead of raising
TypeError.(edb5464a) Add passing test for unregistered error
relay with string-level info preservation.
(7af54057)
.boxed_type_strfalls back to theIPC-msg-encoded type-name string when the type object can't be
resolved locally; returns
'<unknown>'when no IPC msg exists.Scopes changed
tractor._exceptions.RemoteActorError.src_type— degrade gracefully when type can'tbe resolved, log warning instead of raising
TypeError..RemoteActorError.boxed_type_str— fall back to IPC-msg-encodedtype-name string; return
'<unknown>'when no IPC msg exists..unwrap_remoteactor_err()— fall back toRuntimeErrorwrapping the original traceback when
src_typeis unresolvable.unpack_error()— graceful fallback path for unregistered remoteerror types.
tests.test_reg_err_typesreg_err_types()/get_err_type().actors.
Future follow up
Tracked in #433 — items deferred from Copilot review:
reg_err_types()@cminterface for scopedregistration + teardown
reg_err_types()input validation and collision guardsagainst builtin tractor exceptions
monkeypatch/ fixturecleanup after
reg_err_types()callspack_error()type-string preservationfor intermediate actors (see also How to do (next gen, SC oriented) remote error serialization for cross host/process propagation? #5, RAE fails to lookup relayed remote non-builtin
Exceptionduring msg-delivery logging.. #407)Also, the diff introduces a
# TODOin._exceptions.unwrap_remoteactor_err()noting the need forbetter traceback insertion and richer dunder metadata handling
(per
.__context__,.__cause__, etc.).Cross-references
reg_err_types()API + test isolation + multi-hop relay #433 — follow-up refinements from this PR's reviewExceptionduring msg-delivery logging.. #407 — RAE fails to lookup relayed remote non-builtin exc(this pr content was generated in some part by
claude-code)