K8SPSMDB-1571: Make reconciliation interval configurable#2221
K8SPSMDB-1571: Make reconciliation interval configurable#2221adutra wants to merge 4 commits intopercona:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR makes the Percona MongoDB Operator’s reconciliation interval configurable via a new RECONCILE_INTERVAL environment variable instead of being hardcoded to 5 seconds.
Changes:
- Introduced
getReconcileInterval()to derive the reconcile interval from theRECONCILE_INTERVALenvironment variable, defaulting to5s. - Updated
newReconcilerto usegetReconcileInterval()and added unit tests for the new helper. - Extended
deploy/operator.yamlanddeploy/cw-operator.yamlto set theRECONCILE_INTERVALenv var with a default value of5s.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| pkg/controller/perconaservermongodb/psmdb_controller.go | Replaces the hardcoded reconcileIn duration with getReconcileInterval() and defines the helper that reads RECONCILE_INTERVAL from the environment. |
| pkg/controller/perconaservermongodb/psmdb_controller_test.go | Adds unit tests covering unset, valid, and invalid RECONCILE_INTERVAL env var values. |
| deploy/operator.yaml | Wires the new RECONCILE_INTERVAL environment variable into the standard operator deployment manifest. |
| deploy/cw-operator.yaml | Wires the new RECONCILE_INTERVAL environment variable into the CloudWatch-oriented operator deployment manifest. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if interval := os.Getenv("RECONCILE_INTERVAL"); interval != "" { | ||
| if d, err := time.ParseDuration(interval); err == nil { | ||
| return d | ||
| } | ||
| } | ||
| return defaultInterval |
There was a problem hiding this comment.
getReconcileInterval accepts any duration that time.ParseDuration parses, including zero or negative values, and silently falls back to the default when parsing fails. Since the value is fed directly into reconcile.Result{RequeueAfter: r.reconcileIn}, a non-positive or malformed RECONCILE_INTERVAL can lead to confusing or unintended reconcile behavior with no log signal for operators. Consider validating that the parsed duration is strictly positive (and optionally above some minimum) before using it, and logging a warning when the env var is invalid or rejected so misconfiguration is easier to detect.
| if interval := os.Getenv("RECONCILE_INTERVAL"); interval != "" { | |
| if d, err := time.ParseDuration(interval); err == nil { | |
| return d | |
| } | |
| } | |
| return defaultInterval | |
| intervalStr := os.Getenv("RECONCILE_INTERVAL") | |
| if intervalStr == "" { | |
| return defaultInterval | |
| } | |
| d, err := time.ParseDuration(intervalStr) | |
| if err != nil { | |
| logf.Log.WithName("psmdb-controller").Info( | |
| "Invalid RECONCILE_INTERVAL value, using default", | |
| "value", intervalStr, | |
| "error", err, | |
| "default", defaultInterval, | |
| ) | |
| return defaultInterval | |
| } | |
| if d <= 0 { | |
| logf.Log.WithName("psmdb-controller").Info( | |
| "Non-positive RECONCILE_INTERVAL value, using default", | |
| "value", intervalStr, | |
| "default", defaultInterval, | |
| ) | |
| return defaultInterval | |
| } | |
| return d |
Yes, please fix golangci-lint and manifests |
| return defaultInterval | ||
| } | ||
|
|
||
| return d |
There was a problem hiding this comment.
I'm not sure if it makes sense to have a value 1second for example.
There was a problem hiding this comment.
good point, i agree. maybe it's better to not allow values less than 5 seconds and fallback to 5 seconds if value is smaller
|
|
||
| for _, tt := range tests { | ||
| t.Run(tt.name, func(t *testing.T) { | ||
| // Save original env value |
There was a problem hiding this comment.
I believe that this comment is unneeded, wdyt?
| } | ||
| }() | ||
|
|
||
| // Set test env value |
|
|
||
| got := getReconcileInterval() | ||
| if got != tt.want { | ||
| t.Errorf("getReconcileInterval() = %v, want %v", got, tt.want) |
There was a problem hiding this comment.
why don't we use assertions, the codebase has examples e.g. assert.Error
| t.Run(tt.name, func(t *testing.T) { | ||
| // Save original env value | ||
| originalValue, wasSet := os.LookupEnv("RECONCILE_INTERVAL") | ||
| defer func() { |
There was a problem hiding this comment.
I think all this logic is not needed, we can simply unsert the env after the end of each test.
We can do something like that:
defer func() {
err := os.Unsetenv("RECONCILE_INTERVAL")
require.NoError(t, err)
}()
if tt.setEnv {
err := os.Setenv("RECONCILE_INTERVAL", tt.envValue)
require.NoError(t, err)
}
Which is also handling the errors of setting and unsetting, wdyt?
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| defer func() { | ||
| err := os.Unsetenv("RECONCILE_INTERVAL") | ||
| require.NoError(t, err) | ||
| }() | ||
| if tt.setEnv { | ||
| err := os.Setenv("RECONCILE_INTERVAL", tt.envValue) | ||
| require.NoError(t, err) |
There was a problem hiding this comment.
This test permanently unsets the RECONCILE_INTERVAL environment variable and does not restore any previously configured value, which can cause hidden dependencies or flakiness if other tests or tooling rely on that variable. Capture the original value at the start of the test case and restore it in the deferred cleanup instead of always calling Unsetenv.
commit: 4621f83 |
egegunes
left a comment
There was a problem hiding this comment.
Requesting changes to block this getting merged before we create the v1.22.0 release branch. I'll re-approve afterwards.
Due to the high volume of requests, we're unable to provide free service for this account. To continue using the service, please upgarde to a paid plan.
CHANGE DESCRIPTION
Problem:
The Percona MongoDB Operator has a hardcoded 5-second reconciliation interval that generates excessive Kubernetes API requests (~12/second, ~42,000/hour per cluster). This should be configurable via environment variable to allow tuning for different deployment environments.
Cause:
The operator's reconciliation interval is hardcoded in
psmdb_controller.go.Solution:
Make interval configurable via a new environment variable
RECONCILE_INTERVAL.CHECKLIST
Jira
Needs Doc) and QA (Needs QA)?Tests
compare/*-oc.yml)?Config/Logging/Testability