forked from kubernetes-sigs/gateway-api-inference-extension
v1 epp #1
Closed
This reverts commit c01cbc8.
capri-xiyue
commented
Jul 15, 2025
robscott
approved these changes
Jul 15, 2025
robscott
left a comment
Thanks @capri-xiyue!
Looks good to me! I don't think we have all the fun automation in your fork so feel free to merge whenev
Or is this just for review?
Owner
Author
This is just for review. The merge will happen in kubernetes-sigs#1118
…ters guide (kubernetes-sigs#1143)

* add an "Implementing a Compatible Data Plane" section to the implementers guide.
* minor cleanup
* update bullet list formatting.
* minor cleanup.
This commit introduces the core operational layer of the Flow Registry's sharded architecture. It deliberately focuses on implementing the `registryShard` (the concurrent, high-performance data plane for a single worker) as opposed to the top-level `FlowRegistry` administrative control plane, which will be built upon this foundation.

The `registryShard` provides the concrete implementation of the `contracts.RegistryShard` port, giving each `FlowController` worker a safe, partitioned view of the system's state. This design is fundamental to achieving scalability by minimizing cross-worker contention on the hot path.

The key components are:

- **`registry.Config`**: The master configuration blueprint for the entire `FlowRegistry`. It is validated once and then partitioned, with each shard receiving its own slice of the configuration, notably for capacity limits.
- **`registry.registryShard`**: The operational heart of this commit. It manages the lifecycle of queues and policies within a single shard, providing the read-oriented access needed by a `FlowController` worker. It ensures concurrency safety through a combination of mutexes for structural changes and lock-free atomics for statistics.
- **`registry.managedQueue`**: A stateful decorator that wraps a raw `framework.SafeQueue`. Its two primary responsibilities are to enable the sharded model by providing atomic, upwardly-reconciled statistics, and to enforce lifecycle state (active vs. draining), which is essential for the graceful draining of flows during future administrative updates.
- **Contracts and Errors**: New sentinel errors are added to the `contracts` package to create a clear, stable API boundary between the registry and its consumers.

This work establishes the robust, scalable, and concurrent foundation upon which the top-level `FlowRegistry` administrative interface will be built.
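The concurrency split described in this commit message (a mutex for structural changes, lock-free atomics for statistics, and a decorator that tracks lifecycle state) can be illustrated with a minimal sketch. It is not the code from the commit: `safeQueue`, `sliceQueue`, `shardStats`, `shard`, `queueFor`, `markDraining`, and `errDraining` are hypothetical names invented for this example, and the real `framework.SafeQueue` interface and `contracts` sentinel errors are not shown in this conversation.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// errDraining is a hypothetical sentinel error, not one from the contracts package.
var errDraining = errors.New("queue is draining")

// safeQueue is an assumed stand-in for framework.SafeQueue.
type safeQueue interface {
	Add(item string, byteSize uint64)
	Len() int
}

// sliceQueue is a trivial mutex-guarded queue used only for this sketch.
type sliceQueue struct {
	mu    sync.Mutex
	items []string
}

func (q *sliceQueue) Add(item string, _ uint64) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.items = append(q.items, item)
}

func (q *sliceQueue) Len() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.items)
}

// shardStats holds shard-level totals, updated without taking a lock.
type shardStats struct {
	totalLen      atomic.Int64
	totalByteSize atomic.Int64
}

// managedQueue decorates a safeQueue with its own statistics and a lifecycle
// flag, and propagates every change up to the owning shard's totals.
type managedQueue struct {
	queue    safeQueue
	draining atomic.Bool
	length   atomic.Int64
	byteSize atomic.Int64
	parent   *shardStats
}

// Add rejects new items once the queue is draining. The check and the insert
// are not one atomic step; this sketch does not attempt to close that window.
func (mq *managedQueue) Add(item string, byteSize uint64) error {
	if mq.draining.Load() {
		return errDraining
	}
	mq.queue.Add(item, byteSize)
	mq.length.Add(1)
	mq.byteSize.Add(int64(byteSize))
	mq.parent.totalLen.Add(1)
	mq.parent.totalByteSize.Add(int64(byteSize))
	return nil
}

func (mq *managedQueue) markDraining() { mq.draining.Store(true) }

// shard guards its queue map with a mutex (structural changes only).
type shard struct {
	mu     sync.RWMutex
	queues map[string]*managedQueue
	stats  shardStats
}

func newShard() *shard { return &shard{queues: map[string]*managedQueue{}} }

// queueFor returns the queue for a flow, creating it under the write lock.
func (s *shard) queueFor(flowID string) *managedQueue {
	s.mu.RLock()
	mq, ok := s.queues[flowID]
	s.mu.RUnlock()
	if ok {
		return mq
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	if existing, found := s.queues[flowID]; found {
		return existing // another goroutine created it first
	}
	mq = &managedQueue{queue: &sliceQueue{}, parent: &s.stats}
	s.queues[flowID] = mq
	return mq
}

func main() {
	s := newShard()
	q := s.queueFor("flow-a")
	_ = q.Add("req-1", 128)
	_ = q.Add("req-2", 64)
	q.markDraining()
	err := q.Add("req-3", 32)
	// Prints: 2 192 true
	fmt.Println(s.stats.totalLen.Load(), s.stats.totalByteSize.Load(), errors.Is(err, errDraining))
}
```

In this sketch the hot path (`Add`) touches only atomics, while the map of queues is locked only when a flow's queue is looked up or created, which is one plausible reading of "mutexes for structural changes and lock-free atomics for statistics".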
This commit improves the foundational `types` package by consolidating documentation and adding a safer default enum value in preparation for the new `FlowController` implementation. It also removes concepts related to displacement, which are out of scope for the initial release.

Key changes:

- The package-level `README.md` has been removed. Its content is now in a comprehensive GoDoc package comment in `doc.go`. This addresses reviewer feedback to co-locate documentation with the code it describes, reducing maintenance burden and preventing doc drift.
- The package documentation has been rewritten to tell a clearer architectural narrative, explaining the request lifecycle through the lens of the `EnqueueAndWait` model.
- The `ErrDisplaced` error and `QueueOutcomeEvictedDisplaced` outcome have been removed to align the types with the GA feature set.
- A new `QueueOutcomeNotYetFinalized` enum value has been added. This serves as a safer, explicit zero-value for the `QueueOutcome` type, which is used by the new `ShardProcessor` to represent the initial state of a request before its lifecycle is complete.
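The value of an explicit zero-value enum member can be shown with a small sketch. Only `QueueOutcome` and `QueueOutcomeNotYetFinalized` are names taken from the commit message; the other outcome values, the `request` type, and the `finalize` helper are assumptions made up for illustration and may not match the real `types` package.

```go
package main

import "fmt"

// QueueOutcome mirrors the type named in the commit message.
type QueueOutcome int

const (
	// QueueOutcomeNotYetFinalized is the zero value, so a freshly allocated
	// request is never mistaken for one with a terminal outcome.
	QueueOutcomeNotYetFinalized QueueOutcome = iota
	// The two values below are illustrative placeholders for terminal outcomes.
	QueueOutcomeDispatched
	QueueOutcomeRejected
)

func (o QueueOutcome) String() string {
	switch o {
	case QueueOutcomeNotYetFinalized:
		return "NotYetFinalized"
	case QueueOutcomeDispatched:
		return "Dispatched"
	case QueueOutcomeRejected:
		return "Rejected"
	default:
		return "Unknown"
	}
}

// request is a hypothetical item tracked by a processor.
type request struct {
	id      string
	outcome QueueOutcome
}

// finalize records a terminal outcome exactly once and reports whether it did.
func (r *request) finalize(o QueueOutcome) bool {
	if r.outcome != QueueOutcomeNotYetFinalized || o == QueueOutcomeNotYetFinalized {
		return false
	}
	r.outcome = o
	return true
}

func main() {
	r := &request{id: "req-1"}
	fmt.Println(r.outcome)                                     // NotYetFinalized
	fmt.Println(r.finalize(QueueOutcomeDispatched), r.outcome) // true Dispatched
	fmt.Println(r.finalize(QueueOutcomeRejected), r.outcome)   // false Dispatched
}
```

Without a dedicated zero value, an uninitialized `QueueOutcome` field would silently equal whichever real outcome happened to be declared first.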
…ernetes-sigs#1198) Signed-off-by: Nir Rozenbaum <[email protected]>
…es-sigs#1116)

* rename x-k8s to apix and add v1 InferencePool to api/v1
* updated docker file
* fixed naming collision
* added fake approved annotation
* try to fix unit tests
* fixed typo
* added annotation
* fixed annotation
* Update api/v1/inferencepool_types.go
  Co-authored-by: Rob Scott <[email protected]>
* Apply suggestions from code review
  Co-authored-by: Rob Scott <[email protected]>
* use v1 type in v1alpha2
* clean unused dependency
* fixed code generator error
* revert back
* merge bob's change
* updated code-gen
* remove deep copy needs
* use codegen
* revert to workable version
* re-run generate
* revert to workable version
* change to use v1alpha2 back
* resolve merge conflicts
* fixed typo
* fixed typo
* fixed pipeline
* fixed boilerplate
  Signed-off-by: Xiyue Yu <[email protected]>
* updated missing dependency
* fixed change
* fixed format
* fixed import issue
* fixed format

---------

Signed-off-by: Xiyue Yu <[email protected]>
Co-authored-by: Rob Scott <[email protected]>
This is a mirror of kubernetes-sigs#1118