-
Notifications
You must be signed in to change notification settings - Fork 46
Initial NEMO implementation #528
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
PR image build and manifest generation completed successfully! 📦 PR image: 📦 LMES driver image: 📦 LMES job image: 📦 Guardrails orchestrator image: 🗂️ CI manifests |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
1 similar comment
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
ae68cd9 to
b6cb3c0
Compare
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughThis PR introduces a new NemoGuardrails Custom Resource Definition (CRD) and its complete controller implementation to the trustyai-service-operator. It adds API types, reconciliation logic, RBAC configuration, Kubernetes manifest templates, and utilities for CA bundle handling. The PR also refactors existing OAuth authentication checks into a centralized utility function across the guardrailsorchestrator controller. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 9
🧹 Nitpick comments (9)
controllers/nemo_guardrails/constants.go (1)
1-7: NemoGuardrails constants are coherent with config and manager wiring
ServiceName = "NEMO_GUARDRAILS"matches the--enable-servicesflag, and the image keys align with the params files. As an optional polish, you might add a short GoDoc comment onServiceNamefor lint friendliness.controllers/nemo_guardrails/templates/kube-rbac-proxy-config.tmpl.yaml (1)
1-34: LGTM! Secure kube-rbac-proxy configuration.The ConfigMap template correctly configures the RBAC proxy with FIPS 140-2 approved cipher suites and TLS 1.2 minimum version. The authorization resource attributes and upstream configuration are properly templated.
Consider adding a trailing newline at the end of the file to follow conventional practice.
config/samples/trustyai_v1alpha1_nemoguardrail.yaml (1)
8-9: Complete the sample specification.The sample manifest contains only a placeholder TODO comment. A complete example would help users understand how to configure NemoGuardrail resources.
Would you like me to generate a sample spec based on the NemoGuardrails CRD schema and the PR objectives? This could include example ConfigMap references for nemo-config and other relevant fields.
config/base/params.env (1)
17-17: Consider pinning the image to a specific version.The
nemo-guardrails-imageuses the:latesttag. While this may be acceptable for development, consider pinning to a specific version for production deployments to ensure reproducibility and avoid unexpected changes.Note: This pattern is consistent with other images in this file (e.g.,
trustyai-service:latest,guardrails-orchestrator:latest), so this may be intentional for the base configuration with overrides expected in environment-specific overlays.controllers/nemo_guardrails/nemoguardrail_controller_test.go (1)
19-72: Implement the commented-out controller tests.The entire test suite for the NemoGuardrails controller is commented out, leaving the controller without test coverage. Consider implementing these tests to ensure the reconciliation logic works correctly.
Do you want me to help generate a basic test implementation for the NemoGuardrails controller reconciliation, or would you prefer to open an issue to track this task?
controllers/nemo_guardrails/suite_test.go (1)
64-65: Consider externalizing the Kubernetes version.The Kubernetes version "1.29.0" is hardcoded here. If this needs to be updated in the future, it could be missed. Consider defining this as a constant at the package or module level for easier maintenance.
Apply this diff to define the version as a constant:
+const testK8sVersion = "1.29.0" + var _ = BeforeSuite(func() { logf.SetLogger(zap.New(zap.WriteTo(GinkgoWriter), zap.UseDevMode(true))) By("bootstrapping test environment") testEnv = &envtest.Environment{ CRDDirectoryPaths: []string{filepath.Join("..", "..", "config", "crd", "bases")}, ErrorIfCRDPathMissing: true, - BinaryAssetsDirectory: filepath.Join("..", "..", "bin", "k8s", - fmt.Sprintf("1.29.0-%s-%s", runtime.GOOS, runtime.GOARCH)), + BinaryAssetsDirectory: filepath.Join("..", "..", "bin", "k8s", + fmt.Sprintf("%s-%s-%s", testK8sVersion, runtime.GOOS, runtime.GOARCH)), }controllers/nemo_guardrails/nemoguardrail_controller.go (1)
158-158: Consider making the requeue interval configurable.The reconciliation loop always requeues after 30 seconds. This could create unnecessary load if the controller is managing many NemoGuardrails instances. Consider making this interval configurable or only requeuing when necessary (e.g., when resources are not ready).
Consider this approach:
_, updateErr := r.reconcileStatuses(ctx, nemoGuardrails) if updateErr != nil { return ctrl.Result{}, updateErr } - return ctrl.Result{Requeue: true, RequeueAfter: 30 * time.Second}, nil + // Only requeue if resources are not ready + if nemoGuardrails.Status.Phase != utils.PhaseReady { + return ctrl.Result{Requeue: true, RequeueAfter: 30 * time.Second}, nil + } + return ctrl.Result{}, nilcontrollers/nemo_guardrails/ca.go (1)
147-156: Consider using consistent retry strategy.The
updateCAStatusWithRetryfunction usesretry.DefaultBackoff, while theupdateStatusfunction instatus.gousesretry.DefaultRetry. Consider using the same retry strategy for consistency across status updates.Apply this diff for consistency:
func (r *NemoGuardrailsReconciler) updateCAStatusWithRetry(ctx context.Context, namespacedName types.NamespacedName, caStatus *nemoguardrailsv1alpha1.CAStatus) error { - return retry.RetryOnConflict(retry.DefaultBackoff, func() error { + return retry.RetryOnConflict(retry.DefaultRetry, func() error { instance := &nemoguardrailsv1alpha1.NemoGuardrails{}config/crd/bases/trustyai.opendatahub.io_nemoguardrails.yaml (1)
180-202: Consider markingnameas required for NemoConfig items.The
namefield is used to create a directory at/app/config/$Name. An empty or missing name could result in invalid mount paths or undefined behavior. Consider addingnameto the required fields for NemoConfig.name: description: Name sets the id of this particular config within the NeMo Guardrails server. This will create a directory called /app/config/$Name type: string + required: + - name type: object
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (45)
PROJECT(1 hunks)api/common/ca_bundle.go(1 hunks)api/nemo_guardrails/v1alpha1/groupversion_info.go(1 hunks)api/nemo_guardrails/v1alpha1/nemoguardrails_types.go(1 hunks)api/nemo_guardrails/v1alpha1/zz_generated.deepcopy.go(1 hunks)cmd/main.go(2 hunks)config/base/params.env(1 hunks)config/crd/bases/trustyai.opendatahub.io_nemoguardrails.yaml(1 hunks)config/crd/kustomization.yaml(1 hunks)config/manager/manager.yaml(1 hunks)config/overlays/odh/params.env(1 hunks)config/overlays/rhoai/params.env(1 hunks)config/rbac/kustomization.yaml(1 hunks)config/rbac/nemoguardrail_editor_role.yaml(1 hunks)config/rbac/nemoguardrail_viewer_role.yaml(1 hunks)config/rbac/role.yaml(3 hunks)config/samples/kustomization.yaml(1 hunks)config/samples/trustyai_v1alpha1_nemoguardrail.yaml(1 hunks)controllers/constants/version.go(1 hunks)controllers/controllers.go(1 hunks)controllers/gorch/config_generation.go(2 hunks)controllers/gorch/constants.go(0 hunks)controllers/gorch/deployment.go(1 hunks)controllers/gorch/guardrailsorchestrator_controller.go(2 hunks)controllers/gorch/oauth.go(0 hunks)controllers/gorch/route.go(1 hunks)controllers/gorch/service.go(1 hunks)controllers/nemo_guardrails.go(1 hunks)controllers/nemo_guardrails/ca.go(1 hunks)controllers/nemo_guardrails/constants.go(1 hunks)controllers/nemo_guardrails/deployment.go(1 hunks)controllers/nemo_guardrails/nemoguardrail_controller.go(1 hunks)controllers/nemo_guardrails/nemoguardrail_controller_test.go(1 hunks)controllers/nemo_guardrails/status.go(1 hunks)controllers/nemo_guardrails/suite_test.go(1 hunks)controllers/nemo_guardrails/templates/ca-bundle-configmap.tmpl.yaml(1 hunks)controllers/nemo_guardrails/templates/deployment.tmpl.yaml(1 hunks)controllers/nemo_guardrails/templates/kube-rbac-proxy-config.tmpl.yaml(1 hunks)controllers/nemo_guardrails/templates/parser.go(1 hunks)controllers/nemo_guardrails/templates/route.tmpl.yaml(1 hunks)controllers/nemo_guardrails/templates/service.tmpl.yaml(1 hunks)controllers/utils/auth.go(1 hunks)controllers/utils/ca.go(1 hunks)controllers/utils/configmap.go(0 hunks)controllers/utils/logging.go(1 hunks)
💤 Files with no reviewable changes (3)
- controllers/gorch/constants.go
- controllers/utils/configmap.go
- controllers/gorch/oauth.go
🧰 Additional context used
🧬 Code graph analysis (13)
controllers/gorch/config_generation.go (1)
controllers/utils/auth.go (1)
RequiresAuth(10-13)
controllers/gorch/deployment.go (1)
controllers/utils/auth.go (1)
RequiresAuth(10-13)
controllers/gorch/route.go (1)
controllers/utils/auth.go (1)
RequiresAuth(10-13)
controllers/gorch/guardrailsorchestrator_controller.go (1)
controllers/utils/auth.go (1)
RequiresAuth(10-13)
controllers/nemo_guardrails/suite_test.go (2)
api/nemo_guardrails/v1alpha1/groupversion_info.go (1)
AddToScheme(35-35)controllers/lmes/config.go (1)
Options(37-48)
controllers/nemo_guardrails.go (4)
controllers/nemo_guardrails/constants.go (1)
ServiceName(4-4)controllers/gorch/constants.go (1)
ServiceName(11-11)controllers/gorch/guardrailsorchestrator_controller.go (1)
ControllerSetUp(73-80)controllers/nemo_guardrails/nemoguardrail_controller.go (1)
ControllerSetUp(169-176)
controllers/nemo_guardrails/deployment.go (8)
api/nemo_guardrails/v1alpha1/nemoguardrails_types.go (1)
NemoGuardrails(78-84)controllers/gorch/deployment.go (1)
ContainerImages(19-23)controllers/nemo_guardrails/nemoguardrail_controller.go (1)
NemoGuardrailsReconciler(39-44)controllers/utils/configmap.go (2)
GetImageFromConfigMap(80-90)MountConfigMapToDeployment(93-105)controllers/constants/version.go (1)
ConfigMap(5-5)controllers/utils/logging.go (2)
LogErrorRetrieving(17-19)LogErrorControllerReference(42-44)controllers/utils/auth.go (1)
RequiresAuth(10-13)controllers/nemo_guardrails/templates/parser.go (1)
ParseResource(13-15)
api/nemo_guardrails/v1alpha1/nemoguardrails_types.go (1)
api/nemo_guardrails/v1alpha1/groupversion_info.go (1)
SchemeBuilder(32-32)
controllers/utils/auth.go (1)
controllers/constants/version.go (1)
AuthAnnotationKey(6-6)
controllers/nemo_guardrails/templates/parser.go (1)
controllers/utils/parser.go (1)
ParseResourceFromFS(31-45)
controllers/gorch/service.go (1)
controllers/utils/auth.go (1)
RequiresAuth(10-13)
controllers/nemo_guardrails/ca.go (2)
controllers/utils/ca.go (3)
CABundleInitContainerConfig(55-70)CABundleSourceVolume(46-53)CreateCABundleInitContainer(74-143)controllers/utils/configmap.go (2)
GetConfigMapByName(52-62)MountConfigMapToDeployment(93-105)
api/nemo_guardrails/v1alpha1/zz_generated.deepcopy.go (1)
api/nemo_guardrails/v1alpha1/nemoguardrails_types.go (6)
CAStatus(51-58)NemoConfig(29-36)NemoGuardrailStatus(61-72)NemoGuardrails(78-84)NemoGuardrailsList(89-93)NemoGuardrailsSpec(39-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Sourcery review
- GitHub Check: build
- GitHub Check: deploy
- GitHub Check: build-and-push-ci
🔇 Additional comments (40)
controllers/controllers.go (1)
24-25: Import reordering is non-functionalReordered
slices/stringsimports keep behavior unchanged and match their existing usage in this file.config/manager/manager.yaml (1)
36-36: NEMO_GUARDRAILS service flag wiring looks consistentAdding
NEMO_GUARDRAILSto--enable-servicesaligns withServiceName = "NEMO_GUARDRAILS"and will enable the NemoGuardrails controller as long as it is registered viaregisterServiceunder that name.config/overlays/odh/params.env (1)
18-18: NemoGuardrails image param key matches controller constants
nemo-guardrails-imagealigns withnemoGuardrailsImageKeyand mirrors existing image param conventions; no issues.controllers/constants/version.go (1)
4-6: Auth annotation constant addition is soundIntroducing
AuthAnnotationKey = "security.opendatahub.io/enable-auth"centralizes the annotation name and should reduce string drift; just ensure all auth-related code uses this constant rather than hard-coding the key.controllers/utils/logging.go (1)
36-39: Reconciliation error logging helper is consistent
LogErrorReconcilingfollows the same pattern as the other logging helpers and produces clear, consistent error messages for reconcile failures.config/overlays/rhoai/params.env (1)
18-19: RHOAI overlay image params look alignedBoth the updated
garak-provider-imageand newnemo-guardrails-imageentries match the naming and image conventions used elsewhere (and the NemoGuardrails image key matches controller expectations).PROJECT (1)
54-62: Repository access unavailable for verificationThe verification script could not execute due to a repository clone failure. Without access to the codebase, I cannot confirm whether the NemoGuardrails API path declared in PROJECT (
github.com/trustyai-explainability/trustyai-service-operator/api/nemo/v1alpha1) matches the actual Go package location where the types are defined.The original concern remains valid and requires manual verification: ensure the
pathin PROJECT exactly matches where the NemoGuardrails type definitions actually live in the repository (checking for potential discrepancies likeapi/nemo_guardrails/v1alpha1vs.api/nemo/v1alpha1).api/common/ca_bundle.go (2)
3-18: LGTM! Well-designed CA bundle configuration type.The CABundleConfig struct is properly defined with appropriate validation constraints. The kubebuilder markers ensure CRD validation will enforce safe key names and reasonable limits.
20-38: DeepCopy implementation is correct.The DeepCopy methods properly handle the ConfigMapKeys slice, including nil checks. The implementation follows the standard pattern for deep copying structs with slices.
controllers/utils/ca.go (4)
13-22: LGTM! Proper validation constraints.The validation constants and regex correctly enforce Kubernetes DNS subdomain name constraints for ConfigMap keys.
24-44: LGTM! Comprehensive validation with defense in depth.The validation function properly checks for empty keys, length limits, character constraints, and path traversal attempts. The additional check for ".." and "/" provides defense in depth even though the regex should prevent these.
46-70: LGTM! Clear configuration types.The CABundleSourceVolume and CABundleInitContainerConfig types are well-structured with descriptive field names.
72-134: LGTM! Safe shell script construction with proper validation.The function correctly:
- Validates ConfigMap keys before use (line 86)
- Quotes file paths using
%qto prevent injection (line 98)- Embeds the file list directly in the script rather than passing as arguments
- Uses
set -efor proper error handling- Gracefully handles missing files with warnings to stderr
The approach of embedding validated and quoted paths in the script eliminates shell injection risks.
controllers/gorch/service.go (1)
17-17: LGTM! Consistent auth utility usage.The change correctly uses
utils.RequiresAuth()for determining auth proxy requirements, consistent with the refactoring applied across other controller files.controllers/nemo_guardrails.go (1)
23-25: LGTM! Standard controller registration.The init function correctly registers the NemoGuardrails controller using the same pattern as other controllers in the codebase (e.g., GORCH controller).
controllers/gorch/deployment.go (1)
72-77: LGTM! Consistent auth utility usage in deployment creation.The change correctly uses
utils.RequiresAuth()to determine whether to configure the Kube-RBAC proxy, maintaining consistency with the authentication logic refactoring across the codebase.controllers/gorch/config_generation.go (1)
13-13: I'm unable to verify the claims in this review comment due to persistent infrastructure limitations preventing access to the repository. Without being able to inspect the codebase, I cannot confirm whether:
- All
requiresOAuthreferences have been removedutils.RequiresAuthis used consistently across the codebase
Manual verification needed: Confirm old
requiresOAuthfunction is fully removed andutils.RequiresAuthis consistently applied.The refactoring to centralize authentication logic via
utils.RequiresAuth()is sound, but the specific verification steps suggested in the review cannot be automated. A maintainer should confirm that:
- No remaining
requiresOAuthreferences exist in the codebase- All authentication checks now use
utils.RequiresAuthconsistently- The changes at lines 349-351 align with the pattern at line 13
config/crd/kustomization.yaml (2)
5-5: LGTM!The addition of the NemoGuardrails CRD resource aligns with the new controller implementation.
10-10: Based on my search results, I can provide context but cannot definitively verify from the codebase due to access limitations. Here's what I found:Key Findings:
- CA injection purpose: CA injection patches are only needed when a CRD has webhooks (conversion, validation, or mutation webhooks) that require TLS certificate verification.
- NemoGuardrails integration pattern: Based on available documentation, NemoGuardrails typically integrates via HTTP/REST endpoints rather than Kubernetes webhooks at the CRD level.
- Integration style: TrustyAI's GuardrailsOrchestrator integrates with NemoGuardrails through HTTP calls to exposed service endpoints, not through CRD webhook mechanisms.
What I cannot verify without codebase access:
- Whether the NemoGuardrails CRD actually defines any validation, mutation, or conversion webhooks
- Whether this patch and corresponding patch file (
patches/cainjection_in_nemoguardrails.yaml) were ever created or are planned- Whether there's existing documentation about future webhook support
Verify if CA injection is needed for NemoGuardrails.
The commented patch suggests CA injection was considered for the NemoGuardrails CRD. CA injection is only necessary if the CRD uses webhooks (conversion, validation, or mutation). Verify whether:
- The NemoGuardrails CRD defines any webhook configurations
- If webhooks exist, ensure either the patch is uncommented and
patches/cainjection_in_nemoguardrails.yamlexists, or document this as planned future work- If no webhooks are used, remove the commented patch line and the corresponding comment
controllers/gorch/route.go (1)
17-19: LGTM!The refactoring to use the centralized
utils.RequiresAuthutility improves code maintainability and consistency across the codebase.config/rbac/kustomization.yaml (1)
11-12: LGTM!The addition of editor and viewer roles for NemoGuardrails follows the established pattern for other CRDs in the project.
config/rbac/role.yaml (3)
224-224: LGTM!The addition of
nemoguardrailsresources with full lifecycle permissions is appropriate for the manager role.
239-239: LGTM!Adding finalizers permissions allows the controller to properly manage resource cleanup.
248-248: LGTM!Status subresource permissions enable the controller to update the NemoGuardrails resource status.
controllers/utils/auth.go (1)
10-13: LGTM!The centralized authentication check logic is clean and appropriate. Using case-insensitive comparison for the annotation value is a good defensive practice.
config/samples/kustomization.yaml (1)
1-4: Unable to verify file existence due to repository access failure.The repository could not be cloned to perform the file existence check. Without access to the codebase, I cannot definitively verify whether
trustyai_v1alpha1_nemoguardrail.yamlexists in theconfig/samples/directory or identify any naming variations or related files.cmd/main.go (1)
26-26: LGTM!The NemoGuardrails v1alpha1 scheme registration follows the established pattern used for other API groups in this operator.
Also applies to: 70-70
controllers/nemo_guardrails/templates/ca-bundle-configmap.tmpl.yaml (1)
1-10: LGTM!The CA bundle ConfigMap template correctly uses OpenShift's service CA injection annotation to automatically populate the bundle.
config/rbac/nemoguardrail_viewer_role.yaml (1)
1-23: LGTM!The viewer role correctly grants read-only permissions (get, list, watch) for NemoGuardrails resources and status subresource access.
config/rbac/nemoguardrail_editor_role.yaml (1)
1-27: LGTM!The editor role correctly grants full management permissions on NemoGuardrails resources while restricting status updates to read-only, which is the standard pattern for status subresources managed by controllers.
controllers/nemo_guardrails/templates/service.tmpl.yaml (1)
1-32: LGTM!The Service template correctly configures conditional ports based on auth proxy usage and uses proper Kubernetes API enum values.
controllers/gorch/guardrailsorchestrator_controller.go (1)
183-183: LGTM!The refactoring to use the centralized
utils.RequiresAuth()helper improves maintainability by eliminating duplicate auth-checking logic across controllers.Also applies to: 260-260
controllers/nemo_guardrails/templates/parser.go (1)
1-15: LGTM!Clean template parser wrapper that properly embeds YAML templates and delegates to the utility function. The generic type parameter provides type safety for resource parsing.
api/nemo_guardrails/v1alpha1/groupversion_info.go (1)
17-36: LGTM!Standard Kubernetes API group-version wiring with proper kubebuilder annotations. The GroupVersion, SchemeBuilder, and AddToScheme are correctly configured for the v1alpha1 API group.
controllers/nemo_guardrails/deployment.go (1)
73-112: LGTM on the default config selection logic.The default config selection logic properly handles multiple scenarios:
- Uses the first config as a fallback default
- Allows explicit default marking via the
Defaultfield- Warns when multiple configs are marked as default
- Logs informative messages for debugging
This is well-designed and user-friendly.
api/nemo_guardrails/v1alpha1/nemoguardrails_types.go (1)
28-36: LGTM on the NemoConfig structure.The NemoConfig type is well-designed with clear documentation for each field. The Default flag provides good flexibility for managing multiple configurations.
controllers/nemo_guardrails/ca.go (2)
37-99: LGTM on CA bundle loading logic.The
LoadCAConfigsfunction properly:
- Attempts to load multiple CA sources (ODH, OpenShift, user-specified)
- Gracefully handles missing ConfigMaps without failing the entire operation
- Records detailed error information in the CAStatus for observability
- Returns the configuration needed for the init container
This is a well-structured approach to CA bundle management.
101-144: LGTM on CA deployment integration.The
AddCAToDeploymentfunction correctly:
- Mounts CA ConfigMaps as volumes
- Creates and attaches the init container
- Sets up the transfer volume (EmptyDir) with proper read-only mounting
- Configures the SSL_CERT_FILE environment variable
The implementation follows secure practices and proper Kubernetes patterns.
api/nemo_guardrails/v1alpha1/zz_generated.deepcopy.go (1)
1-27: LGTM - Auto-generated deepcopy implementations are correct.This file is generated by
controller-genand correctly implementsDeepCopy,DeepCopyInto, andDeepCopyObjectfor all NemoGuardrails API types. The implementations properly handle slices, pointers, and nested structs, matching the type definitions innemoguardrails_types.go.config/crd/bases/trustyai.opendatahub.io_nemoguardrails.yaml (1)
1-17: LGTM - CRD structure and naming conventions are correct.The CRD correctly defines the NemoGuardrails resource with:
- Proper API group and version registration
- Namespaced scope appropriate for tenant isolation
- Subresource status enabled for status updates
- Schema that aligns with the Go type definitions
| // INSERT ADDITIONAL SPEC FIELDS - desired state of cluster | ||
| // Important: Run "make" to regenerate code after modifying this file | ||
|
|
||
| // NemoConfig should be the names of the configmaps containing NeMO server configuration files. All files in NemoConfigs will be mounted to /app/config/rails-custom-action |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix inconsistent comment about mount path.
The comment on line 43 states that files will be mounted to /app/config/rails-custom-action, but the code in deployment.go (line 93) actually mounts them to /app/config/$Name where $Name is the NemoConfig name. This inconsistency could confuse users.
Apply this diff:
- // NemoConfig should be the names of the configmaps containing NeMO server configuration files. All files in NemoConfigs will be mounted to /app/config/rails-custom-action
+ // NemoConfig should be the names of the configmaps containing NeMO server configuration files. All files from each NemoConfig will be mounted to /app/config/<config-name>
NemoConfigs []NemoConfig `json:"nemoConfigs"`📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // NemoConfig should be the names of the configmaps containing NeMO server configuration files. All files in NemoConfigs will be mounted to /app/config/rails-custom-action | |
| // NemoConfig should be the names of the configmaps containing NeMO server configuration files. All files from each NemoConfig will be mounted to /app/config/<config-name> | |
| NemoConfigs []NemoConfig `json:"nemoConfigs"` |
🤖 Prompt for AI Agents
In api/nemo_guardrails/v1alpha1/nemoguardrails_types.go around line 43, the
field comment incorrectly states files are mounted to
/app/config/rails-custom-action but deployment.go mounts each NemoConfig to
/app/config/$Name; update the comment to accurately reflect the actual mount
path (e.g., "All files in NemoConfigs will be mounted to
/app/config/<NemoConfigName>") and ensure wording clarifies that each configmap
is mounted under /app/config/{Name} rather than a fixed rails-custom-action
directory.
| openshiftServingConfigMap, err := utils.GetConfigMapByName(ctx, r.Client, nemoGuardrails.Name+"-ca-bundle", nemoGuardrails.Namespace) | ||
| if err != nil { | ||
| logger.Error(err, "Could not find or load ODH trusted CA bundle configmap, so will not be mounted.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix copy-paste error in log message.
Line 68 logs an error message saying "ODH trusted CA bundle" when it should say "OpenShift serving CA bundle" based on the context (line 66).
Apply this diff:
openshiftServingConfigMap, err := utils.GetConfigMapByName(ctx, r.Client, nemoGuardrails.Name+"-ca-bundle", nemoGuardrails.Namespace)
if err != nil {
- logger.Error(err, "Could not find or load ODH trusted CA bundle configmap, so will not be mounted.")
+ logger.Error(err, "Could not find or load OpenShift serving CA bundle configmap, so will not be mounted.")
caStatus.OpenshiftServingCAFound = false📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| openshiftServingConfigMap, err := utils.GetConfigMapByName(ctx, r.Client, nemoGuardrails.Name+"-ca-bundle", nemoGuardrails.Namespace) | |
| if err != nil { | |
| logger.Error(err, "Could not find or load ODH trusted CA bundle configmap, so will not be mounted.") | |
| openshiftServingConfigMap, err := utils.GetConfigMapByName(ctx, r.Client, nemoGuardrails.Name+"-ca-bundle", nemoGuardrails.Namespace) | |
| if err != nil { | |
| logger.Error(err, "Could not find or load OpenShift serving CA bundle configmap, so will not be mounted.") | |
| caStatus.OpenshiftServingCAFound = false |
🤖 Prompt for AI Agents
In controllers/nemo_guardrails/ca.go around lines 66 to 68, the error log
message incorrectly references "ODH trusted CA bundle"; update the logger.Error
call to reference "OpenShift serving CA bundle" instead so the message reads
something like "Could not find or load OpenShift serving CA bundle configmap, so
will not be mounted." Keep the existing error variable and context parameters
unchanged.
| deployment, err = templateParser.ParseResource[*appsv1.Deployment](deploymentTemplateFilename, deploymentConfig, reflect.TypeOf(&appsv1.Deployment{})) | ||
| if err != nil { | ||
| utils.LogErrorParsing(ctx, err, "deployment template", nemoGuardrails.Name, nemoGuardrails.Namespace) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return immediately after logging the parsing error.
If template parsing fails at line 60-62, the error is logged but execution continues. This will likely cause a nil pointer dereference at line 64 when trying to set the controller reference on a nil deployment.
Apply this diff:
deployment, err = templateParser.ParseResource[*appsv1.Deployment](deploymentTemplateFilename, deploymentConfig, reflect.TypeOf(&appsv1.Deployment{}))
if err != nil {
utils.LogErrorParsing(ctx, err, "deployment template", nemoGuardrails.Name, nemoGuardrails.Namespace)
+ return nil, nil, err
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| deployment, err = templateParser.ParseResource[*appsv1.Deployment](deploymentTemplateFilename, deploymentConfig, reflect.TypeOf(&appsv1.Deployment{})) | |
| if err != nil { | |
| utils.LogErrorParsing(ctx, err, "deployment template", nemoGuardrails.Name, nemoGuardrails.Namespace) | |
| } | |
| deployment, err = templateParser.ParseResource[*appsv1.Deployment](deploymentTemplateFilename, deploymentConfig, reflect.TypeOf(&appsv1.Deployment{})) | |
| if err != nil { | |
| utils.LogErrorParsing(ctx, err, "deployment template", nemoGuardrails.Name, nemoGuardrails.Namespace) | |
| return nil, nil, err | |
| } |
🤖 Prompt for AI Agents
In controllers/nemo_guardrails/deployment.go around lines 60-63, after
templateParser.ParseResource returns an error you currently only log it and
continue; immediately return from the function when err != nil to avoid a nil
deployment being used later (which causes a nil pointer deref). Change the error
branch to return the error (or a wrapped error/appropriate reconcile result per
the function signature) right after calling utils.LogErrorParsing so execution
stops and the nil deployment is never dereferenced.
|
|
||
| // ====== Fetch instance of NemoGuardrails CR ====================================================================== | ||
| nemoGuardrails := &nemoguardrailsv1alpha1.NemoGuardrails{} | ||
| err := r.Get(context.TODO(), req.NamespacedName, nemoGuardrails) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the context parameter instead of context.TODO().
Line 61 uses context.TODO() instead of the provided ctx parameter. This could lead to issues with cancellation, tracing, and timeouts not being properly propagated.
Apply this diff:
- err := r.Get(context.TODO(), req.NamespacedName, nemoGuardrails)
+ err := r.Get(ctx, req.NamespacedName, nemoGuardrails)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| err := r.Get(context.TODO(), req.NamespacedName, nemoGuardrails) | |
| err := r.Get(ctx, req.NamespacedName, nemoGuardrails) |
🤖 Prompt for AI Agents
In controllers/nemo_guardrails/nemoguardrail_controller.go around line 61, the
code calls r.Get with context.TODO(), which ignores the provided ctx parameter
and prevents cancellation/tracing/timeouts from being propagated; replace
context.TODO() with the ctx parameter passed into the function so the request
uses the caller's context, ensuring proper cancellation and telemetry
propagation.
| func (r *NemoGuardrailsReconciler) reconcileStatuses(ctx context.Context, nemoGuardrails *nemoguardrailsv1alpha1.NemoGuardrails) (ctrl.Result, error) { | ||
| deploymentReady, _ := utils.CheckDeploymentReady(ctx, r.Client, nemoGuardrails.Name, nemoGuardrails.Namespace) | ||
| routeReady, _ := utils.CheckRouteReady(ctx, r.Client, nemoGuardrails.Name, nemoGuardrails.Namespace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle errors from readiness checks.
The errors returned by CheckDeploymentReady and CheckRouteReady are ignored. If these checks fail due to an error (not just unready status), the controller should handle this appropriately rather than treating it as "not ready."
Apply this diff to handle errors:
func (r *NemoGuardrailsReconciler) reconcileStatuses(ctx context.Context, nemoGuardrails *nemoguardrailsv1alpha1.NemoGuardrails) (ctrl.Result, error) {
- deploymentReady, _ := utils.CheckDeploymentReady(ctx, r.Client, nemoGuardrails.Name, nemoGuardrails.Namespace)
- routeReady, _ := utils.CheckRouteReady(ctx, r.Client, nemoGuardrails.Name, nemoGuardrails.Namespace)
+ deploymentReady, deploymentErr := utils.CheckDeploymentReady(ctx, r.Client, nemoGuardrails.Name, nemoGuardrails.Namespace)
+ if deploymentErr != nil {
+ log.FromContext(ctx).Error(deploymentErr, "Failed to check deployment readiness")
+ return ctrl.Result{}, deploymentErr
+ }
+ routeReady, routeErr := utils.CheckRouteReady(ctx, r.Client, nemoGuardrails.Name, nemoGuardrails.Namespace)
+ if routeErr != nil {
+ log.FromContext(ctx).Error(routeErr, "Failed to check route readiness")
+ return ctrl.Result{}, routeErr
+ }
if deploymentReady && routeReady {🤖 Prompt for AI Agents
In controllers/nemo_guardrails/status.go around lines 29 to 31, the errors
returned by CheckDeploymentReady and CheckRouteReady are ignored; update the
code to handle those errors by checking each returned error and returning an
appropriate non-nil error (or a requeue result) instead of treating an error as
“not ready.” Specifically, capture the two error values, and if either is
non-nil return ctrl.Result{}, err (or wrap with context) so the reconciler
surfaces transient failures rather than proceeding as if the resource is simply
unready.
| {{ if .UseAuthProxy }} | ||
| - {{- template "kubeRBACProxy" .GatewayKubeRBACProxy }} | ||
| {{end}} | ||
| volumes: | ||
| - name: {{.NemoGuardrails.Name}}-tls | ||
| secret: | ||
| secretName: {{.NemoGuardrails.Name}}-tls | ||
| defaultMode: 420 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add missing volume definitions for kube-rbac-proxy.
When UseAuthProxy is true, the kubeRBACProxy sidecar references volumes tls-certs, tls-ca-bundle, and kube-rbac-proxy-{{.Suffix}}-config (lines 47, 50, 53), but these volumes are not defined in the volumes section (lines 103-107). Only {{.NemoGuardrails.Name}}-tls is defined, which doesn't match what the sidecar expects. This will cause pod creation to fail.
Add the missing volume definitions:
{{end}}
volumes:
+ {{if .UseAuthProxy}}
+ - name: tls-certs
+ secret:
+ secretName: {{.NemoGuardrails.Name}}-tls
+ defaultMode: 420
+ - name: tls-ca-bundle
+ configMap:
+ name: {{.NemoGuardrails.Name}}-ca-bundle
+ - name: kube-rbac-proxy-{{.GatewayKubeRBACProxy.Suffix}}-config
+ configMap:
+ name: {{.NemoGuardrails.Name}}-kube-rbac-proxy-{{.GatewayKubeRBACProxy.Suffix}}-config
+ {{else}}
- name: {{.NemoGuardrails.Name}}-tls
secret:
secretName: {{.NemoGuardrails.Name}}-tls
defaultMode: 420
+ {{end}}🤖 Prompt for AI Agents
In controllers/nemo_guardrails/templates/deployment.tmpl.yaml around lines 100
to 107, the volumes section only defines {{.NemoGuardrails.Name}}-tls but the
kube-rbac-proxy sidecar (when .UseAuthProxy is true) mounts volumes named
tls-certs, tls-ca-bundle, and kube-rbac-proxy-{{.Suffix}}-config; add volume
entries with those exact names so the sidecar mounts resolve: define a tls-certs
volume pointing to the appropriate Secret (e.g. secretName:
{{.NemoGuardrails.Name}}-tls), define tls-ca-bundle as the ConfigMap or Secret
that provides the CA bundle used elsewhere, and define
kube-rbac-proxy-{{.Suffix}}-config as the ConfigMap referenced by the sidecar;
ensure defaultMode and field names match the other volume definitions.
| port: | ||
| targetPort: oauth-proxy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix port name mismatch with Service.
The Route targets oauth-proxy but the Service template (service.tmpl.yaml) defines ports named kube-rbac-proxy (when UseAuthProxy is true) or http (when UseAuthProxy is false). This mismatch will prevent the Route from forwarding traffic correctly.
Apply this diff to align with the Service port naming:
port:
- targetPort: oauth-proxy
+ targetPort: kube-rbac-proxy📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| port: | |
| targetPort: oauth-proxy | |
| port: | |
| targetPort: kube-rbac-proxy |
🤖 Prompt for AI Agents
In controllers/nemo_guardrails/templates/route.tmpl.yaml around lines 14 to 15,
the Route's targetPort is set to "oauth-proxy" which doesn't match the Service
port names ("kube-rbac-proxy" when UseAuthProxy is true or "http" when
UseAuthProxy is false); update the Route template to set targetPort to the same
port name the Service uses — implement the same conditional used in
service.tmpl.yaml (use "kube-rbac-proxy" if UseAuthProxy is true, otherwise
"http") or reference a shared variable for the port name so the Route and
Service port names always match.
| "strings" | ||
| ) | ||
|
|
||
| // requiresAuth checks if the auth annotation key is set in the resource |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix comment to match function name.
The comment references "requiresAuth" (lowercase) but the function is exported as "RequiresAuth" (uppercase). Update the comment for consistency.
Apply this diff:
-// requiresAuth checks if the auth annotation key is set in the resource
+// RequiresAuth checks if the auth annotation key is set in the resource📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // requiresAuth checks if the auth annotation key is set in the resource | |
| // RequiresAuth checks if the auth annotation key is set in the resource |
🤖 Prompt for AI Agents
In controllers/utils/auth.go around line 9, the top-of-function comment
references "requiresAuth" (lowercase) but the function is exported as
"RequiresAuth"; update the comment to use the exact exported identifier and Go
doc style by changing it to start with "RequiresAuth" (e.g., "RequiresAuth
checks if the auth annotation key is set in the resource.") so the comment
matches the function name and follows Go conventions.
| SecurityContext: &corev1.SecurityContext{ | ||
| AllowPrivilegeEscalation: &[]bool{false}[0], | ||
| RunAsNonRoot: &[]bool{false}[0], | ||
| Capabilities: &corev1.Capabilities{ | ||
| Drop: []corev1.Capability{"ALL"}, | ||
| }, | ||
| }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Security concern: RunAsNonRoot should be true.
The security context sets RunAsNonRoot: false, which allows the init container to run as root. For a container that only concatenates CA certificate files, root privileges are unnecessary.
Apply this diff to run the init container as non-root:
SecurityContext: &corev1.SecurityContext{
AllowPrivilegeEscalation: &[]bool{false}[0],
- RunAsNonRoot: &[]bool{false}[0],
+ RunAsNonRoot: &[]bool{true}[0],
Capabilities: &corev1.Capabilities{
Drop: []corev1.Capability{"ALL"},
},
},📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| SecurityContext: &corev1.SecurityContext{ | |
| AllowPrivilegeEscalation: &[]bool{false}[0], | |
| RunAsNonRoot: &[]bool{false}[0], | |
| Capabilities: &corev1.Capabilities{ | |
| Drop: []corev1.Capability{"ALL"}, | |
| }, | |
| }, | |
| SecurityContext: &corev1.SecurityContext{ | |
| AllowPrivilegeEscalation: &[]bool{false}[0], | |
| RunAsNonRoot: &[]bool{true}[0], | |
| Capabilities: &corev1.Capabilities{ | |
| Drop: []corev1.Capability{"ALL"}, | |
| }, | |
| }, |
🤖 Prompt for AI Agents
In controllers/utils/ca.go around lines 135 to 141, the SecurityContext sets
RunAsNonRoot to false allowing the init container to run as root; change
RunAsNonRoot to true and ensure the pod cannot escalate privileges (keep
AllowPrivilegeEscalation: false and Capabilities.Drop: ["ALL"]), and add a
non-root UID via RunAsUser (e.g. 1000) so the init container runs as a non-root
user when concatenating CA files.
|
@RobGeada: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Adds initial NEMO implementation:
Example deployment resource
$YOUR_MODEL_PREDICTOR_SERVICEas needed (e.g.,http://phi3-predictor.model-namespace.svc.cluster.local):Example curl:
returns:
Summary by CodeRabbit
Release Notes
New Features
RBAC & Security
✏️ Tip: You can customize this high-level summary in your review settings.