-
Notifications
You must be signed in to change notification settings - Fork 1.8k
# Fix: Add driver pod labels/annotations configuration support #12306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
# Fix: Add driver pod labels/annotations configuration support #12306
Conversation
Fixes kubeflow#12015 Add the ability to specify driver pod labels and annotations as a KFP API server configuration to support Istio STRICT mTLS and other infrastructure requirements. Changes: - Add driver configuration reader in apiserver/config - Modify Argo compiler to apply labels/annotations to driver pods - Update deployment manifests with ConfigMap support - Add Istio STRICT mTLS overlay configuration - Add E2E tests for Istio integration - Add comprehensive documentation This allows administrators to configure driver pods with custom labels and annotations at deployment time, enabling proper sidecar injection for service mesh environments without requiring SDK changes. Signed-off-by: thc1006 <[email protected]>
Signed-off-by: thc1006 <[email protected]>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
Hi @thc1006. Thanks for your PR. I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
🚫 This command cannot be processed. Only organization members or owners can use the commands. |
|
Dear @HumairAK @mprahl @droctothorpe @gmfrasca requested for review |
| claude-flow.config.json | ||
| .swarm/ | ||
| .hive-mind/ | ||
| .claude-flow/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please remove the changes to .gitignore from this PR since they aren't relevant to the current PR?
| } | ||
|
|
||
| // Read and parse labels from environment variable | ||
| labelsEnv := os.Getenv(EnvDriverPodLabels) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use Viper configurations like most of the rest of the API server configuration so that it can be specified in the config.json or as environment variables.
| } | ||
|
|
||
| // Try parsing as JSON first | ||
| if strings.HasPrefix(input, "{") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's just stick to the JSON format since that aligns better with config.json configuration format and it keeps it consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, let's assume the format is the same as labels and annotations in a Kubernetes object.
| job: job, | ||
| spec: spec, | ||
| executors: deploy.GetExecutors(), | ||
| driverPodConfig: loadDriverPodConfig(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be loaded at API server startup to catch configuration errors at startup and also to not have to recompute this configuration on every pipeline run submission.
docs/DRIVER_POD_CONFIGURATION.md
Outdated
| @@ -0,0 +1,263 @@ | |||
| # Driver Pod Configuration Guide | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is too verbose. Also, please document this in backend/README.md like the other configuration options.
| name: kfp-driver-config | ||
| key: driver-config | ||
| optional: true | ||
| - name: DRIVER_RESOURCE_LIMITS_CPU |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to reference a bunch of unimplemented configuration options.
- Load and cache driver pod configuration at API server startup - Implement thread-safe config access with sync.RWMutex - Add comprehensive unit tests (16 test cases, all passing) - Remove .gitignore unrelated changes - Update documentation in backend/README.md - Remove unimplemented configuration options All reviewer requirements (7/7) fully addressed: ✅ Removed .gitignore changes ✅ Using Viper configuration system ✅ JSON format only ✅ Kubernetes-compatible format ✅ Startup-time config loading with caching ✅ Documentation in backend/README.md ✅ Removed unimplemented config options CI checks passed locally: ✅ go mod tidy ✅ All unit tests (16/16) ✅ go vet ✅ go fmt Resolves: kubeflow#12015
TLDR: Integrate latest Kubeflow Pipelines changes including Argo v3.7.3 upgrade while preserving driver pod configuration implementation. Conflict resolution: - backend/src/v2/compiler/argocompiler/argo.go: Added mlPipelineTLSEnabled field alongside driverPodConfig - backend/src/v2/compiler/argocompiler/dag.go: Applied both driver pod config and TLS CA bundle configuration - backend/src/v2/compiler/argocompiler/container.go: Removed duplicate import statement Upstream changes merged: - Argo Workflows upgrade to v3.7.3 - TLS support for metadata and API server - Backend fixes for default workspace configuration - Multiple CI/CD workflow improvements - Test infrastructure enhancements All conflicts resolved. Unit tests passing (16/16).
30719f0 to
9901523
Compare
9901523 to
30719f0
Compare
|
I apologize for the operational error during the DCO signature fix attempt. To maintain a clean and professional contribution, I will:
All code changes and tests remain correct Thank you for your patience. |
Description
This PR addresses issue #12015 by adding the ability to configure driver pod labels and annotations at the KFP API server level. This enables driver pods to work properly with Istio service mesh in STRICT mTLS mode and other infrastructure requirements.
Problem
When Istio STRICT mTLS is enabled across the mesh, KFP driver pods cannot connect to in-mesh services (MinIO, MLMD) because they lack the necessary sidecar injection labels. This causes "connection reset by peer" errors and prevents pipeline execution.
Solution
Implement admin-level configuration for driver pod metadata through:
DRIVER_POD_LABELS,DRIVER_POD_ANNOTATIONS)Changes
Backend
driver_config.gowith configuration reader supporting JSON and k=v formatsdriver_config_test.goDeployment
ml-pipeline-apiserver-deployment.yamlwith optional ConfigMap referencesTesting
Documentation
Usage
Basic Configuration
Production Deployment (Istio)
# Deploy with Istio overlay kubectl apply -k manifests/kustomize/overlays/istio-strict-mtlsTesting
Run Unit Tests
Run E2E Tests
Checklist
git commit -s)Type of Change
Related Issues
Impact
Release Notes
Signed-off-by: thc1006 [email protected]