diff --git a/funding/nlnet_ngi0_PASTE_READY.txt b/funding/nlnet_ngi0_PASTE_READY.txt new file mode 100644 index 000000000..40ff90694 --- /dev/null +++ b/funding/nlnet_ngi0_PASTE_READY.txt @@ -0,0 +1,228 @@ +============================================================ +NLnet NGI Zero Commons Fund — PASTE-READY SUBMISSION +============================================================ +Tab through the form at https://nlnet.nl/propose/ and paste +each block into the corresponding field. +============================================================ + + +==== Your name ==== + +Tyler Goodlet + + +==== Email address ==== + +goodboy_foss@protonmail.com + + +==== Phone number ==== + +[TODO: FILL IN BEFORE PASTING] + + +==== Organisation ==== + +Independent but with a surrounding (dependent project) community, https://pikers.dev + + +==== Country ==== + +Global but with affiliated community meetups often in CDMX, Mexico. + + +==== Proposal name ==== + +tractor: distributed structured concurrency + + +==== Website / wiki ==== + +https://github.com/goodboy/tractor + + +==== Abstract (1200 char limit!) ==== + +tractor is a distributed structured concurrency (SC) multi-processing runtime for Python built on trio. It applies SC throughout a distributed process tree, implementing a "Single program system" (SPS) akin to the EVM. Actors interact via a "supervision-control protocol" (SCP) enforced by typed IPC messaging, ensuring no child can outlive or zombie its parent despite memory-domain separation. + +Capabilities: infinitely nestable actor nurseries, bi-directional streaming with reliable teardown, modular IPC (TCP, UDS, QUIC via msgspec), multi-process debugging (pdbp), and cross-framework (asyncio, Qt) guest-mode hosting. + +This grant funds 7 milestones toward a stable 1.0: +1. Typed messaging protocols - msgspec.Struct dialog specs +2. Documentation - tutorials, D2 diagrams, usage guides +3. Erlang-style supervision APIs - composable restart strategies +4. 
Next-gen discovery + multiaddr addressing +5. Encrypted transports - TLS, wireguard, QUIC +6. High-perf IPC - eventfd + shm ring buffers, TIPC +7. Sub-interpreter spawning backend (PEP 734) + +Secondary: stabilized API, cross-language SCP potential, beta PyPI release. + + +==== Have you been involved with projects or organisations relevant to this project before? ==== + +I am the creator and primary maintainer of tractor. The project's inception came out of a grander project, piker (https://github.com/pikers/piker), a ground-up FOSS computational trading stack. The project originally grew from the practical needs of building real-time financial data processing systems requiring extremely robust multi-core, multi-language (including Python) execution environments with grokable failure semantics. + +I am a longtime active contributor and participant in the trio and surrounding structured concurrency community and have contributed to SC-shaping discussions around semantics and interfaces generally in Python land. tractor itself maintains a very lightly trafficked Matrix channel, but my fellow contributors and I engage more frequently with the broader async ecosystem and our surrounding dependee projects. + +Prior to tractor, I worked extensively with Python's multiprocessing, asyncio and dependent projects, as well as various actor-like systems/protocols in other langs (elixir/erlang, akka, 0mq), all of which informed (and continue to inform) the design decisions - particularly the rejection of proxy objects, mailbox abstractions, shared-memory (vs. msg-passing), and the lack of required supervision primitives (and arguably the required IPC augmentation for such semantics) in favor of this burgeoning concept of "SC-native process supervision". + + +==== Requested Amount (EUR) ==== + +50000 + + +==== Explain what the requested budget will be used for ==== + +Work is performed by the core maintainer (EUR 50/hr) and vetted contributors delegated specific sub-tasks (EUR 35/hr). 
Hour estimates below use a blended average. The budget breaks down across 7 work packages matching the milestones above: + +WP1: Typed messaging and dialog protocols (EUR 10,000) +- Define msgspec.Struct-based message schemas for all IPC primitives (RPC calls, streaming, context dialogs) +- Implement runtime type validation at IPC boundaries with Address types in the builtin codec (#410) +- Build capability-based dialog protocol negotiation between actors via typed "dialog specs" (#196, #311) +- Write comprehensive test coverage and migration guide +- Refs: #36, #196, #311, #410 + +WP2: Documentation and tutorials (EUR 6,000) +- Write Sphinx-based user guide covering core APIs, patterns, and deployment +- Create tutorial series (single-host, multi-host, asyncio integration) with D2-generated architecture diagrams +- Update and expand existing examples to demonstrate new typed protocols and supervisors +- Publish beta release to PyPI with changelog and migration notes + +WP3: Erlang-style supervision strategies (EUR 7,000) +- Design and implement composable supervisor context managers supporting one-for-one, one-for-all, and rest-for-one restart strategies +- Integrate with existing ActorNursery and error propagation machinery including Python 3.11+ exception groups +- Add configurable restart limits, backoff policies, and supervision tree introspection +- Test under chaos-engineering fault injection scenarios +- Ref: #22 + +WP4: Discovery system + multi-addressing (EUR 7,000) +- Replace naive Registrar with a pluggable discovery sub-system supporting multiple backends +- Implement multiaddr-based addressing for actor endpoints enabling protocol-agnostic service location (#216, #367) +- Integrate typed Address structs into the IPC codec (#410) +- Add registrar/daemon fixture hardening for CI (#424) +- Refs: #216, #367, #410, #424, #429 + +WP5: Encrypted transport backends (EUR 7,000) +- Add TLS encryption alts for TCP-based inter-host actor links (#136, #353). 
+- Investigate and prototype composition with tunnel protocols (wireguard, SSH, QUIC) for zero-config encryption (#382). +- Extend transport-matrix CI to cover encrypted paths (#420). +- Audit and fix edge cases in remote exception relay and cancellation under encrypted channel types. +- Refs: #136, #353, #382, #420 + +WP6: High-performance IPC transports (EUR 7,000) +- Harden existing eventfd + shared-memory ring buffer channels for local-host zero-copy IPC + * initial core-dev WIP patch: https://pikers.dev/goodboy/tractor/pulls/10 + * formalizing extension repo: https://github.com/guilledk/hotbaud +- Achieve macOS parity for shared-memory key handling (#423). +- Investigate TIPC as a kernel-native multi-host transport (#378). +- Benchmark and optimize against baseline TCP/UDS paths to quantify throughput gains. +- Refs: #339, #378, #423 + +WP7: Sub-interpreter spawning backend (EUR 6,000) +- Implement a trio-compatible spawning backend using CPython 3.13+ sub-interpreters (PEP 554/734) as an alternative to full process isolation for local-host actors. +- Design the isolation boundary: determine which tractor IPC primitives can leverage shared-memory within a single OS process vs. requiring the existing msg-passing path. +- Benchmark latency and throughput vs. subprocess spawning to quantify the overhead reduction. +- Ref: #379 + + +==== Does the project have other funding sources, both past and present? ==== + +No. tractor has been entirely self-funded namely through downstream dependent projects which themselves are similar WIPs in the FOSS computational (financial) trading space. It is developed as a completely volunteer and "as needed" open-source effort. There are no corporate sponsors, institutional backers, or prior grants; the only "funding" has come via piece-wise, proxy contract engineering work on said dependent projects. This would be the project's first official external funding. 
+ + +==== Compare your own project with existing or historical efforts ==== + +vs. Python multiprocessing / concurrent.futures: These stdlib modules provide process pools but no structured lifecycle management. A crashed worker can leave the pool in an undefined state; there is no supervision tree, no cross-process cancellation propagation, and no supervised streaming IPC. tractor enforces that every child process is bound to its parent's lifetime via SC nurseries. + +vs. Celery / Dramatiq / other task queues: Task queues require external brokers (Redis, RabbitMQ), operate on a fire-and-forget model, and provide no structured error propagation. tractor eliminates the broker dependency, provides bidirectional streaming, and guarantees that remote errors propagate to the calling scope. It can be thought of as the 0mq of runtimes to the AMQP; it takes sophistication to a lower level, allowing for easily building any distributed architecture you can imagine. + +vs. Ray / Dask: These target data-parallel and ML workloads with cluster schedulers and likewise do not adhere to SC. They use proxy objects and shared-memory abstractions in ways that break SC guarantees. tractor targets general distributed programming and can easily accomplish similar feats, arguably with less code and surrounding tooling. + +vs. Erlang/OTP (BEAM): tractor is directly inspired by OTP's supervision trees but implements them in Python using trio's rigorous SC primitives (nurseries/task-groups, cancel-scopes, thread-isolation) rather than a custom VM. We aim to bring OTP-grade reliability to the Python ecosystem without requiring developers to leave their existing language and toolchain, and further to use the same primitives to eventually do the same in other langs. + +vs. trio-parallel: trio-parallel solves a narrower problem: running sync functions in worker processes. 
tractor provides the full actor runtime - nested process trees, bidirectional streaming, remote debugging, and distributed deployment, etc. + +vs. Actor frameworks (Pykka, Thespian): These implement actor patterns atop threads or asyncio but do not enforce structured concurrency. Actors can outlive their creators, errors can be silently dropped, and there is no systematic cancellation. tractor is SC from the ground up. + + +==== What significant technical challenges do you expect to solve during the project? ==== + +1. Type-safe IPC without performance regression: Introducing msgspec.Struct-based typed message validation at every IPC boundary must not degrade throughput. The challenge is designing a schema layer that enables zero-copy deserialization while providing meaningful runtime type checking. Further, encoding Address types (#410) into the builtin codec requires careful interaction with msgspec's extension type system to avoid per-message allocation overhead. + +2. Composable supervision under SC constraints: Erlang's OTP supervisors rely on process linking and message-based monitoring. Translating these patterns into trio's task-nursery and cancellation-scope model - where parent scopes must outlive children - requires novel composition of context managers and careful interaction with Python 3.11+ exception groups. The ActorNursery must support restart strategies without violating the SC invariant that a crashed child's resources are fully reclaimed before any restart attempt. + +3. Discovery without a single point of failure: The current Registrar is a single root-actor service; replacing it with a distributed or pluggable discovery backend (potentially via multiaddr endpoint negotiation, #216, #367) must not introduce split-brain or stale-entry races. Achieving this while keeping the bootstrap path simple (a new actor needs some way to find its first peer) is an open design problem, particularly across heterogeneous transports (TCP vs. UDS vs. QUIC). 
+ +4. TLS in a dynamic actor topology: Actors spawn and connect dynamically. Implementing mutual TLS authentication without a centralized certificate authority, while supporting both long-lived daemons and ephemeral workers, requires a lightweight trust model compatible with ad-hoc process tree formation. Composition with tunnel protocols (wireguard, SSH) adds another axis: the transport layer must remain pluggable so encryption can be provided externally rather than baked into every channel. + +5. Zero-copy shared-memory IPC across platforms: The existing eventfd + SharedMemory ring buffer implementation (tractor.ipc._ringbuf) is Linux-specific. Achieving macOS parity (#423) requires dealing with platform differences in POSIX shared-memory key formats and synchronization primitives. Beyond portability, ensuring the ring buffer remains safe under actor cancellation (partial writes, interrupted reads) without leaking shared-memory segments is non-trivial. + +6. Cross-platform debugging under encrypted transports: The multi-process pdbp debugger currently relies on unencrypted IPC for TTY lock coordination. Adding naive TLS must not break the debugging experience, requiring careful layering of debug-control messages and/or requiring embedded tunnelling. + +7. Sub-interpreter isolation boundaries: CPython 3.13+ sub-interpreters (#379) share an OS process but provide semi-isolated VMs. Determining which tractor IPC primitives can safely operate within a shared address space - and which require the existing process-boundary msg-passing path - is uncharted territory. The spawning backend must present the same ActorNursery interface regardless of whether the child is a subprocess or a sub-interpreter, preserving SC semantics while exploiting reduced spawn overhead. + +8. API stabilization without breaking SC invariants: Moving from alpha to beta means committing to a public API surface. The challenge is identifying which internal interfaces can be safely frozen vs. 
which need further iteration, while ensuring that any API changes preserve the runtime's SC guarantees. Documenting this surface clearly (WP2) is itself a forcing function for resolving ambiguities in the current internal API. + + +==== Describe the ecosystem of the project, and how you will engage with relevant actors and promote outcomes ==== + +tractor operates within the Python async/concurrency ecosystem, primarily adjacent to the trio community: + +Upstream dependencies: +- trio/anyio (structured concurrency runtimes) - we track upstream development closely and participate in design discussions. +- msgspec (high-performance serialization) - our typed messaging work will provide real-world feedback to the msgspec maintainer. +- pdbp (debugger REPL) - we actively collaborate on fixes. + +User communities: +- Python developers building distributed systems who need stronger guarantees than multiprocessing or alts provide. +- The trio user community seeking SC + parallelism. +- Scientific computing users wanting robust process supervision without Dask/Ray's deployment complexity. +- FOSS computational traders, via the aforementioned piker. +- AI model users, for whom SC helps ensure reliable (and graceful) "kill signals" in supporting runtimes, in an effort to avoid a real-world "skynet". + +Engagement plan: +- Maintain an active Matrix channel (#tractor:matrix.org) for user support and contributor onboarding. +- Publish milestone blog posts on the trio Discourse forum. +- Present at Python (and distributed-compute) conferences (PyCon, EuroPython) if accepted. +- Contribute learnings about distributed SC back to the trio project's design discussions. +- Engage with the broader SC community (Kotlin coroutines, Swift structured concurrency, Java Loom) to cross-pollinate ideas. +- All code, documentation, and design documents will be released under AGPL-3.0-or-later on GitHub, with mirrors on sourcehut and self-hosted Gitea. 
+ + +==== Call topic ==== + +NGI Zero Commons Fund + + +==== Use of generative AI ==== + +I have used generative AI in writing this proposal + + +==== Which model did you use? What did you use it for? ==== + +Model: Claude Opus 4.6 (Anthropic), via the claude-code CLI tool. + +Usage: The AI was used to generate a first draft of all form field responses based on the project's existing documentation (README, pyproject.toml, git history, issue tracker). The NLnet submission form was fetched and parsed to identify all required fields and their guidance text. All draft responses were then reviewed, edited, and refined by the project maintainer before submission. + +The unedited AI output and prompts are available in the project repository under funding/ on the ngi0_submission branch: + +https://github.com/goodboy/tractor/tree/ngi0_submission/funding + +Prompt logs per revision: +- initial draft: funding/nlnet_ngi0_commons_draft.prompt.md +- milestone expansion: funding/nlnet_ngi0_commons_draft_rework.prompt.md +- WP7/proofread pass: funding/nlnet_ngi0_commons_draft_pass3.prompt.md +- final paste-ready pass: funding/nlnet_ngi0_commons_draft_final.prompt.md + + +============================================================ +END OF FORM FIELDS +============================================================ +Don't forget: +- [x] Check the privacy acknowledgment checkbox +- [ ] Optionally check "Send me a copy of this application" +- [ ] FILL IN PHONE NUMBER above before pasting +============================================================ diff --git a/funding/nlnet_ngi0_commons_draft.md b/funding/nlnet_ngi0_commons_draft.md new file mode 100644 index 000000000..721c0d085 --- /dev/null +++ b/funding/nlnet_ngi0_commons_draft.md @@ -0,0 +1,449 @@ +# NLnet NGI Zero Commons Fund - `tractor` Grant Application Draft + +> **Call:** NGI Zero Commons Fund (12th call) +> **Deadline:** April 1, 2026 at 12:00 CEST +> **Budget range:** EUR 5,000 - 50,000 +> **Status:** FIRST DRAFT - needs 
review and refinement + +--- + +## Contact Information + +> **Your name** + +Tyler Goodlet + +--- + +> **Email address** + +goodboy_foss@protonmail.com + +--- + +> **Phone number** + +`[TODO: fill in]` + +--- + +> **Organisation** + +Independent but with a surrounding (dependent project) community, https://pikers.dev + +--- + +> **Country** + +Global but with affiliated community meetups often in CDMX, Mexico. + +--- + +## General Project Information + +> **Proposal name** + +tractor: distributed structured concurrency + +--- + +> **Website / wiki** + +https://github.com/goodboy/tractor + +--- + +> **Abstract: Can you explain the whole project and its expected +> outcome(s). Be short and to the point; focus on what and how, not +> why.** + +`tractor` is a distributed structured concurrency (SC) +multi-processing runtime for Python built on `trio`. It applies SC +throughout a distributed process tree, implementing a "Single +program system" (SPS) akin to the EVM. Actors interact via a +"supervision-control protocol" (SCP) enforced by typed IPC +messaging, ensuring no child can outlive or zombie its parent +despite memory-domain separation. + +Capabilities: infinitely nestable actor nurseries, bi-directional +streaming with reliable teardown, modular IPC (TCP, UDS, QUIC via +`msgspec`), multi-process debugging (`pdbp`), and cross-framework +(`asyncio`, Qt) guest-mode hosting. + +This grant funds 7 milestones toward a stable 1.0: +1. Typed messaging protocols - `msgspec.Struct` dialog specs +2. Documentation - tutorials, D2 diagrams, usage guides +3. Erlang-style supervision APIs - composable restart strategies +4. Next-gen discovery + multiaddr addressing +5. Encrypted transports - TLS, wireguard, QUIC +6. High-perf IPC - `eventfd` + shm ring buffers, TIPC +7. Sub-interpreter spawning backend (PEP 734) + +Secondary: stabilized API, cross-language SCP potential, beta +PyPI release. 
+ + + +--- + +> **Have you been involved with projects or organisations relevant to +> this project before? And if so, can you tell us a bit about that?** + +I am the creator and primary maintainer of `tractor`. The project's +inception came out of a grander project, `piker`, a ground-up FOSS +computational trading stack. The project originally grew from the +practical needs of building real-time financial data processing systems +requiring extremely robust multi-core, multi-language (including +Python) execution environments with grokable failure semantics. + +I am a longtime active contributor and participant in the `trio` and +surrounding structured concurrency community and have contributed to +SC-shaping discussions around semantics and interfaces generally in +Python land. `tractor` itself maintains a very lightly trafficked +Matrix channel, but my fellow contributors and I engage more +frequently with the broader async ecosystem and our surrounding +dependee projects. + +Prior to `tractor`, I worked extensively with Python's +`multiprocessing`, `asyncio` and dependent projects, as well as various +actor-like systems/protocols in other langs (elixir/erlang, akka, +0mq), all of which informed (and continue to inform) the design +decisions - particularly the rejection of proxy objects, mailbox +abstractions, shared-memory (vs. msg-passing), and the lack of required +supervision primitives (and arguably the required IPC augmentation +for such semantics) in favor of this burgeoning concept of "SC-native +process supervision". + +--- + +## Requested Support + +> **Requested Amount (in euros)** + +EUR 50,000 + +--- + +> **Explain what the requested budget will be used for. Provide a +> breakdown of main tasks and effort with explicit rates. Full budget +> may be attached.** + +Work is performed by the core maintainer (EUR 50/hr) and vetted +contributors delegated specific sub-tasks (EUR 35/hr). Hour +estimates below use a blended average. 
The budget breaks down +across 7 work packages matching the milestones above: + +**WP1: Typed messaging and dialog protocols (EUR 10,000)** +- Define `msgspec.Struct`-based message schemas for all IPC + primitives (RPC calls, streaming, context dialogs) +- Implement runtime type validation at IPC boundaries with + `Address` types in the builtin codec (#410) +- Build capability-based dialog protocol negotiation between + actors via typed "dialog specs" (#196, #311) +- Write comprehensive test coverage and migration guide +- Refs: #36, #196, #311, #410 + +**WP2: Documentation and tutorials (EUR 6,000)** +- Write Sphinx-based user guide covering core APIs, patterns, + and deployment +- Create tutorial series (single-host, multi-host, asyncio + integration) with D2-generated architecture diagrams +- Update and expand existing examples to demonstrate new typed + protocols and supervisors +- Publish beta release to PyPI with changelog and migration + notes + +**WP3: Erlang-style supervision strategies (EUR 7,000)** +- Design and implement composable supervisor context managers + supporting one-for-one, one-for-all, and rest-for-one + restart strategies +- Integrate with existing `ActorNursery` and error propagation + machinery including Python 3.11+ exception groups +- Add configurable restart limits, backoff policies, and + supervision tree introspection +- Test under chaos-engineering fault injection scenarios +- Ref: #22 + +**WP4: Discovery system + multi-addressing (EUR 7,000)** +- Replace naive `Registrar` with a pluggable discovery + sub-system supporting multiple backends +- Implement `multiaddr`-based addressing for actor endpoints + enabling protocol-agnostic service location (#216, #367) +- Integrate typed `Address` structs into the IPC codec (#410) +- Add registrar/daemon fixture hardening for CI (#424) +- Refs: #216, #367, #410, #424, #429 + +**WP5: Encrypted transport backends (EUR 7,000)** +- Add TLS encryption alts for TCP-based inter-host actor links + 
(#136, #353). +- Investigate and prototype composition with tunnel protocols + (wireguard, SSH, QUIC) for zero-config encryption (#382). +- Extend transport-matrix CI to cover encrypted paths (#420). +- Audit and fix edge cases in remote exception relay and + cancellation under encrypted channel types. +- Refs: #136, #353, #382, #420 + +**WP6: High-performance IPC transports (EUR 7,000)** +- Harden existing `eventfd` + shared-memory ring buffer channels for + local-host zero-copy IPC, + * initial core-dev WIP patch: https://pikers.dev/goodboy/tractor/pulls/10 + * formalizing extension repo: https://github.com/guilledk/hotbaud +- Achieve macOS parity for shared-memory key handling (#423). +- Investigate TIPC as a kernel-native multi-host transport (#378). +- Benchmark and optimize against baseline TCP/UDS paths to + quantify throughput gains. +- Refs: #339, #378, #423 + +**WP7: Sub-interpreter spawning backend (EUR 6,000)** +- Implement a `trio`-compatible spawning backend using CPython + 3.13+ sub-interpreters (PEP 554/734) as an alternative to + full process isolation for local-host actors. +- Design the isolation boundary: determine which `tractor` + IPC primitives can leverage shared-memory within a single + OS process vs. requiring the existing msg-passing path. +- Benchmark latency and throughput vs. subprocess spawning to + quantify the overhead reduction. +- Ref: #379 + +--- + +> **Does the project have other funding sources, both past and +> present? (if so, please describe)** + +No. `tractor` has been entirely self-funded namely through downstream +dependent projects which themselves are similar WIPs in the FOSS +computational (financial) trading space. It is developed as +a completely volunteer and "as needed" open-source effort. There are +no corporate sponsors, institutional backers, or prior grants; the +only "funding" has come via piece-wise, proxy contract engineering +work on said dependent projects. 
This would be the project's first +official external funding. + +--- + +> **Compare your own project with existing or historical efforts.** + +**vs. Python `multiprocessing` / `concurrent.futures`**: These stdlib +modules provide process pools but no structured lifecycle management. +A crashed worker can leave the pool in an undefined state; there is no +supervision tree, no cross-process cancellation propagation, and no +supervised streaming IPC. `tractor` enforces that every child process +is bound to its parent's lifetime via SC nurseries. + +**vs. Celery / Dramatiq / other task queues**: Task queues require +external brokers (Redis, RabbitMQ), operate on a fire-and-forget +model, and provide no structured error propagation. `tractor` +eliminates the broker dependency, provides bidirectional streaming, +and guarantees that remote errors propagate to the calling scope. It +can be thought of as the 0mq of runtimes to the AMQP; it takes +sophistication to a lower level, allowing for easily building any +distributed architecture you can imagine. + +**vs. Ray / Dask**: These target data-parallel and ML workloads with +cluster schedulers and likewise do not adhere to SC. They use +proxy objects and shared-memory abstractions in ways that break SC +guarantees. `tractor` targets general distributed programming and can +easily accomplish similar feats, arguably with less code and +surrounding tooling. + +**vs. Erlang/OTP (BEAM)**: `tractor` is directly inspired by OTP's +supervision trees but implements them in Python using `trio`'s +rigorous SC primitives (nurseries/task-groups, cancel-scopes, +thread-isolation) rather than a custom VM. We aim to bring OTP-grade +reliability to the Python ecosystem without requiring developers to +leave their existing language and toolchain, and further to use the same +primitives to eventually do the same in other langs. + +**vs. `trio-parallel`**: `trio-parallel` solves a narrower problem: +running sync functions in worker processes. 
`tractor` provides the +full actor runtime - nested process trees, bidirectional streaming, +remote debugging, and distributed deployment, etc. + +**vs. Actor frameworks (Pykka, Thespian)**: These implement actor +patterns atop threads or `asyncio` but do not enforce structured +concurrency. Actors can outlive their creators, errors can be +silently dropped, and there is no systematic cancellation. `tractor` +is SC from the ground up. + +--- + +> **What significant technical challenges do you expect to solve +> during the project?** + +1. **Type-safe IPC without performance regression**: Introducing + `msgspec.Struct`-based typed message validation at every IPC + boundary must not degrade throughput. The challenge is designing + a schema layer that enables zero-copy deserialization while + providing meaningful runtime type checking. Further, encoding + `Address` types (#410) into the builtin codec requires careful + interaction with `msgspec`'s extension type system to avoid + per-message allocation overhead. + +2. **Composable supervision under SC constraints**: Erlang's OTP + supervisors rely on process linking and message-based monitoring. + Translating these patterns into `trio`'s task-nursery and + cancellation-scope model - where parent scopes *must* outlive + children - requires novel composition of context managers and + careful interaction with Python 3.11+ exception groups. The + `ActorNursery` must support restart strategies without violating + the SC invariant that a crashed child's resources are fully + reclaimed before any restart attempt. + +3. **Discovery without a single point of failure**: The current + `Registrar` is a single root-actor service; replacing it with + a distributed or pluggable discovery backend (potentially via + `multiaddr` endpoint negotiation, #216, #367) must not introduce + split-brain or stale-entry races. 
Achieving this while keeping + the bootstrap path simple (a new actor needs *some* way to find + its first peer) is an open design problem, particularly across + heterogeneous transports (TCP vs. UDS vs. QUIC). + +4. **TLS in a dynamic actor topology**: Actors spawn and connect + dynamically. Implementing mutual TLS authentication without a + centralized certificate authority, while supporting both + long-lived daemons and ephemeral workers, requires a lightweight + trust model compatible with ad-hoc process tree formation. + Composition with tunnel protocols (wireguard, SSH) adds another + axis: the transport layer must remain pluggable so encryption + can be provided externally rather than baked into every channel. + +5. **Zero-copy shared-memory IPC across platforms**: The existing + `eventfd` + `SharedMemory` ring buffer implementation + (`tractor.ipc._ringbuf`) is Linux-specific. Achieving macOS + parity (#423) requires dealing with platform differences in + POSIX shared-memory key formats and synchronization primitives. + Beyond portability, ensuring the ring buffer remains safe under + actor cancellation (partial writes, interrupted reads) without + leaking shared-memory segments is non-trivial. + +6. **Cross-platform debugging under encrypted transports**: The + multi-process `pdbp` debugger currently relies on unencrypted + IPC for TTY lock coordination. Adding naive TLS must not break + the debugging experience, requiring careful layering of + debug-control messages and/or requiring embedded tunnelling. + +7. **Sub-interpreter isolation boundaries**: CPython 3.13+ + sub-interpreters (#379) share an OS process but provide + semi-isolated VMs. Determining which `tractor` IPC primitives + can safely operate within a shared address space - and which + require the existing process-boundary msg-passing path - is + uncharted territory. 
The spawning backend must present the + same `ActorNursery` interface regardless of whether the child + is a subprocess or a sub-interpreter, preserving SC semantics + while exploiting reduced spawn overhead. + +8. **API stabilization without breaking SC invariants**: Moving + from alpha to beta means committing to a public API surface. + The challenge is identifying which internal interfaces can be + safely frozen vs. which need further iteration, while ensuring + that any API changes preserve the runtime's SC guarantees. + Documenting this surface clearly (WP2) is itself a forcing + function for resolving ambiguities in the current internal API. + +--- + +> **Describe the ecosystem of the project, and how you will engage +> with relevant actors and promote outcomes.** + +`tractor` operates within the Python async/concurrency ecosystem, +primarily adjacent to the `trio` community: + +**Upstream dependencies:** +- `trio`/`anyio` (structured concurrency runtimes) - we track + upstream development closely and participate in design discussions. +- `msgspec` (high-performance serialization) - our typed messaging + work will provide real-world feedback to the `msgspec` maintainer. +- `pdbp` (debugger REPL) - we actively collaborate on fixes. + +**User communities:** +- Python developers building distributed systems who need stronger + guarantees than `multiprocessing` or alts provide. +- The `trio` user community seeking SC + parallelism. +- Scientific computing users wanting robust process supervision + without Dask/Ray's deployment complexity. +- FOSS computational traders, via the aforementioned `piker`. +- AI model users, for whom SC helps ensure reliable (and + graceful) "kill signals" in supporting runtimes, in an effort to + avoid a real-world "skynet". + +**Engagement plan:** +- Maintain an active Matrix channel (`#tractor:matrix.org`) for user + support and contributor onboarding. +- Publish milestone blog posts on the `trio` Discourse forum. 
+- Present at Python (and distributed-compute) conferences (PyCon, + EuroPython) if accepted. +- Contribute learnings about distributed SC back to the `trio` + project's design discussions. +- Engage with the broader SC community (Kotlin coroutines, Swift + structured concurrency, Java Loom) to cross-pollinate ideas. +- All code, documentation, and design documents released under + AGPL-3.0-or-later on GitHub, with mirrors on sourcehut and + self-hosted Gitea. + +--- + +## Thematic Call Selection + +> **Call topic** + +NGI Zero Commons Fund + +--- + +## Generative AI Disclosure + +> **Use of generative AI** + +I have used generative AI in writing this proposal. + +--- + +> **Which model did you use? What did you use it for?** + +Model: Claude Opus 4.6 (Anthropic), via the `claude-code` CLI tool. + +Usage: The AI was used to generate a first draft of all form field +responses based on the project's existing documentation (README, +pyproject.toml, git history, issue tracker). The NLnet submission form +was fetched and parsed to identify all required fields and their +guidance text. All draft responses were then reviewed, edited, and +refined by the project maintainer before submission. + +The unedited AI output and prompts are available in the project +repository under `funding/` on the `ngi0_submission` branch: + +https://github.com/goodboy/tractor/tree/ngi0_submission/funding + +Prompt logs per revision: +- initial draft: funding/nlnet_ngi0_commons_draft.prompt.md +- milestone expansion: funding/nlnet_ngi0_commons_draft_rework.prompt.md +- WP7/proofread pass: funding/nlnet_ngi0_commons_draft_pass3.prompt.md +- final paste-ready pass: funding/nlnet_ngi0_commons_draft_final.prompt.md + +--- + +## Notes for Review + +### Before submitting, address these TODOs: + +- [ ] Fill in phone number +- [x] Fill in organization (or confirm "Independent") +- [x] Fill in country +- [x] Review and refine the **Abstract** - is the scope right? Too + ambitious? 
Trim or expand milestones as needed. +- [x] Validate the **budget breakdown** - are the hourly rate and + hour estimates reasonable? Adjust WP allocations. +- [x] Review **requested amount** - EUR 50,000 is the max; consider + whether a smaller, more focused ask is strategically better. +- [x] Decide which **issues/PRs** to highlight most prominently +- [ ] Consider whether to attach a **roadmap PDF** with more detail +- [x] Review the **comparison section** - add/remove competitors as + appropriate +- [x] Refine the **ecosystem** section with specific community + contacts or partnerships +- [x] Save prompt logs for AI disclosure attachment +- [x] Proofread everything for accuracy and tone +- [x] ensure all field char-limits met (none enforced). +- [ ] **Submit before April 1, 2026 12:00 CEST** diff --git a/funding/nlnet_ngi0_commons_draft.prompt.md b/funding/nlnet_ngi0_commons_draft.prompt.md new file mode 100644 index 000000000..f7b2c940a --- /dev/null +++ b/funding/nlnet_ngi0_commons_draft.prompt.md @@ -0,0 +1,46 @@ +# Prompt input for NLnet grant draft generation + +## User prompt (verbatim) + +> read the grant submission form here, +> +> https://nlnet.nl/propose/ +> +> see if you can take each field and generate a first-draft set of +> responses for `tractor`. ensure you write up a plan in md which +> quotes each form field's header and then produce a submission +> (matching the respective character limits if you can detect them) +> for each field. write it all up for me to review and refine in a +> single md document under a new ./funding/ subdir. + +## Model & tool + +- **Model:** Claude Opus 4.6 (Anthropic) +- **Tool:** `claude-code` CLI (https://github.com/anthropics/claude-code) +- **Date:** 2026-03-31 + +## What the AI did + +1. Fetched and parsed the NLnet proposal form at + `https://nlnet.nl/propose/` to extract all field headers, + descriptions, and guidance text. +2. 
Fetched the NGI Zero Commons Fund page at + `https://nlnet.nl/commonsfund/` and the themes overview at + `https://nlnet.nl/themes/` to identify the best-fit funding call. +3. Read the project's `docs/README.rst`, `pyproject.toml`, and + explored the codebase structure, git history (~4,700 commits), + issue tracker references, and test suite to gather context. +4. Generated a first-draft response for every form field, structured + as quoted field headers followed by draft content. +5. Wrote the output to `funding/nlnet_ngi0_commons_draft.md`. + +## Sources consulted (by the AI) + +- `https://nlnet.nl/propose/` - submission form fields +- `https://nlnet.nl/commonsfund/` - fund details, budget range, + deadline +- `https://nlnet.nl/themes/` - available funding calls +- `docs/README.rst` - project description, features, roadmap +- `pyproject.toml` - metadata, dependencies, license +- Git log and codebase exploration - commit counts, contributors, + architecture diff --git a/funding/nlnet_ngi0_commons_draft_final.prompt.md b/funding/nlnet_ngi0_commons_draft_final.prompt.md new file mode 100644 index 000000000..ca1842196 --- /dev/null +++ b/funding/nlnet_ngi0_commons_draft_final.prompt.md @@ -0,0 +1,56 @@ +# Prompt input for NLnet grant final paste-ready pass + +## User prompt (verbatim) + +> ok for remaining todos, i only really care about the final +> submit, do you think you can make the web request and then +> provide me the final filled-in web form page to click the +> button? +> +> [after AI explained it can't POST forms] +> +> yes to both please. also make sure you fill in any useful +> links for the AI prompt tracking. + +## Model & tool + +- **Model:** Claude Opus 4.6 (Anthropic) +- **Tool:** `claude-code` CLI (https://github.com/anthropics/claude-code) +- **Date:** 2026-04-01 + +## What the AI did + +1. Read the latest committed draft to pick up all user + edits (branch link, checked-off TODOs, etc.) + +2. 
Updated AI disclosure section with specific prompt + file links per revision: + - `funding/nlnet_ngi0_commons_draft.prompt.md` + - `funding/nlnet_ngi0_commons_draft_rework.prompt.md` + - `funding/nlnet_ngi0_commons_draft_pass3.prompt.md` + - `funding/nlnet_ngi0_commons_draft_final.prompt.md` + And pointed the repo URL to the `/funding` subdir: + `https://github.com/goodboy/tractor/tree/ngi0_submission/funding` + +3. Generated `funding/nlnet_ngi0_PASTE_READY.txt`: + a flat plain-text file with one clearly delimited + section per form field, no markdown, ready to + copy-paste sequentially into the NLnet web form. + - Expanded all `#NNN` shorthand issue refs into + full `https://github.com/goodboy/tractor/issues/NNN` + URLs in the abstract (form fields are plain text, + not markdown-rendered). + - Added `https://github.com/pikers/piker` link in + the prior-involvement section. + - Included end-of-file checklist reminder for phone + number and privacy checkbox. + +4. Created this prompt log file for AI disclosure. 
+ +## Sources consulted + +- `funding/nlnet_ngi0_commons_draft.md` (latest revision) +- `git log --oneline` and `git branch --show-current` + to confirm branch name `ngi0_submission` +- `funding/*.prompt.md` glob to enumerate existing + prompt logs diff --git a/funding/nlnet_ngi0_commons_draft_pass3.prompt.md b/funding/nlnet_ngi0_commons_draft_pass3.prompt.md new file mode 100644 index 000000000..a87972180 --- /dev/null +++ b/funding/nlnet_ngi0_commons_draft_pass3.prompt.md @@ -0,0 +1,77 @@ +# Prompt input for NLnet grant draft pass 3 + +## User prompt (verbatim) + +> some notes, +> +> - you missed linking in the multiaddress pr #429 to the WP4 +> - you missed linking in the TIPC issue in WP6 +> - actually you missed a buncha issue links XD +> +> ok i've committed a new revision, do a pass on it to learn +> what you missed, check the todo bullets and based on those +> try to complete as many of them as you can before the next +> commit (including saving this prompt and your output). + +## Model & tool + +- **Model:** Claude Opus 4.6 (Anthropic) +- **Tool:** `claude-code` CLI (https://github.com/anthropics/claude-code) +- **Date:** 2026-04-01 + +## What the user had already added (that the AI missed) + +Issue/PR refs the user manually added between passes: +- Milestone 4 (discovery): #429 (multiaddr PR using + upstream `py-multiaddr`) +- Milestone 5 (encrypted transports): #136, #353, #382 + (TLS/tunnel protocol issues) +- Milestone 6 (high-perf IPC): #339 (`eventfd` + shm), + #378 (TIPC kernel transport) +- New milestone 7: #379 (sub-interpreter spawning backend) +- WP6: links to pikers.dev WIP patch and `hotbaud` repo +- Ecosystem: AI model users / "skynet" safety angle +- Supervision milestone: link to GH label filter URL +- AI disclosure: `` placeholder + +## What the AI did in this pass + +1. 
**Proofread and fixed typos/grammar:** + - "improved a stabilized" -> "a stabilized" + - "it take sophistication" -> "it takes" + - "arguably with with" -> double-word removal + - "encrypted channels types" -> "encrypted channel types" + +2. **Budget redistribution for 7 WPs** (was 6): + - WP1: EUR 11k -> 10k (typed messaging) + - WP3: EUR 9k -> 8k (supervision) + - WP5: EUR 8k -> 7k (encrypted transports) + - WP6: EUR 8k -> 7k (high-perf IPC) + - WP7: EUR 6k NEW (sub-interpreter spawning, #379) + - Total: still EUR 50k + +3. **Added issue refs to WP5** body text: + #136, #353, #382 inline + Refs line + +4. **Added WP7** for sub-interpreter spawning backend + with PEP 554/734 references and #379 + +5. **Extended technical challenges** from 7 to 8: + - New #7: sub-interpreter isolation boundaries + - Renumbered API stabilization to #8 + +6. **Checked char limits**: NLnet form has no enforced + field-level character limits (confirmed via form HTML) + +7. **Updated TODO checklist**: marked off proofread, + prompt logs, and char-limits items + +## Sources consulted + +- NLnet form HTML at `https://nlnet.nl/propose/` + (char limit check) +- Codebase grep for #136, #339, #353, #378, #379, + #382, #429 issue references +- `tractor/ipc/_ringbuf.py` - eventfd/shm details +- `tractor/discovery/_multiaddr.py` - multiaddr/wg +- PEP 554/734 context for sub-interpreter support diff --git a/funding/nlnet_ngi0_commons_draft_rework.prompt.md b/funding/nlnet_ngi0_commons_draft_rework.prompt.md new file mode 100644 index 000000000..b113ffd95 --- /dev/null +++ b/funding/nlnet_ngi0_commons_draft_rework.prompt.md @@ -0,0 +1,65 @@ +# Prompt input for NLnet grant draft rework + +## User prompt (verbatim) + +> ok now i want you to fillout and/or rework a couple sections to +> match the new content, +> +> - abstract: i expanded to 6 goals, try to scan the repo content +> including all issues and PRs to add links (or issue #s is +> shorter) for all the milestones mentioned. 
+> +> - given the above expansion of scope^ adjust the budget section +> including optional reduced fee-per-hr for sub-collaborator +> delegation. +> +> - given the expanded abstract extend the "technical challenges" +> section, again from any info you can glean from the repo's +> surrounding info base to further expand the hurdle set. + +## Model & tool + +- **Model:** Claude Opus 4.6 (Anthropic) +- **Tool:** `claude-code` CLI (https://github.com/anthropics/claude-code) +- **Date:** 2026-04-01 + +## What the AI did + +1. Launched an exploration agent to scan the full codebase for + issue/PR references relevant to each of the 6 milestones: + - grep'd for `#NNN` patterns across source, tests, docs, and + `gh/*.md` planning files + - cross-referenced `docs/README.rst` TODO section, inline + `XXX`/`TODO` comments, and `pyproject.toml` notes + - catalogued findings per-milestone + +2. **Abstract**: added issue refs to milestones 1, 4, 5, 6: + - typed messaging: added #410 (`Address` types in msgspec) + - discovery + addressing: added #216, #367, #410, #424 + - encrypted transports: added #420 (transport-matrix CI) + - high-perf IPC: added #423 (macOS shm key parity) + +3. **Budget**: redistributed EUR 50k across 6 WPs (was 4), + introduced dual hourly rate (EUR 50/hr maintainer, + EUR 35/hr vetted contributors), and aligned each WP to + its corresponding abstract milestone with per-WP issue refs. + +4. 
**Technical challenges**: expanded from 5 to 7 items, + adding: + - discovery split-brain / bootstrap problem (#216, #367) + - cross-platform shared-memory ring buffer safety (#423) + - tightened existing items with issue-specific detail + (#410 codec overhead, exception groups in supervision) + - linked docs (WP2) as a forcing function for API + stabilization + +## Sources consulted (by the AI) + +- Full codebase grep for `#[0-9]+` issue references +- `tractor/ipc/_ringbuf.py`, `tractor/ipc/_linux.py` - + eventfd/shm implementation details +- `tractor/discovery/` subpackage - registrar, multiaddr +- `tractor/msg/types.py`, `tractor/msg/_ops.py` - msg codec +- `gh/*.md` - per-issue/PR planning documents +- `docs/README.rst` - TODO/roadmap section +- `tests/test_ringbuf.py` - shm test coverage