-
Notifications
You must be signed in to change notification settings - Fork 30
AIDE migration #537
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AIDE migration #537
Conversation
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Vincent056 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
2a2f984 to
2e48413
Compare
|
/retest |
573bd4f to
2d23d2a
Compare
|
we will save migrated config in: |
|
@rhmdnd this should be ready for some reviews |
24fed09 to
130dbd0
Compare
rhmdnd
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took an initial pass, but I still need to come back and look through the config map controller and file integrity controller logic.
Posting the feedback I have for now.
| COPY . . | ||
|
|
||
| RUN git clone https://github.com/autoconf-archive/autoconf-archive.git && \ | ||
| cp autoconf-archive/m4/*.m4 /usr/share/aclocal/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a dependency for building AIDE 0.18?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I think rhel might has rpm for this, but this builder does not
| aide-0.18: | ||
| rm -rf aide && \ | ||
| git clone https://github.com/aide/aide.git && \ | ||
| cd aide && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a build directory where we stash other related tools like controller-gen and kustomize. We could keep the entire repository in there so it gets cleaned up using make clean. Otherwise, we might need to go in and clean it up manually if we encounter a build failure.
| err = waitForAIDEConfigMigrationEvent(t, f, namespace, testName, "17") | ||
| if err != nil { | ||
| t.Errorf("Expected event to be found") | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice - thanks for adding this. So the remaining bit to test would be running AIDE 0.18.0, yeah? At this point, the test is assertion the configuration should work with 0.18.0, but we're still running 0.16.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct
thanks for the detail reviews! |
03f6255 to
a2ffd69
Compare
8b33a9f to
e8d41fc
Compare
This PR adds ability to run aide 0.18 config checks and migration. The aide0.18 will be complied as binary and used by FIO in a container, we will try to see if we can perform the migration if a user defined config is being detected. We will issue warning messages in the log, and have a failed annotation key in the FIO instance if we are not able to pre migrate the config.
|
/retest |
|
/retest |
|
@Vincent056: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
rhmdnd
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good - a few more comments inline now that I've had more time to walk through the controller code and the tests.
| for { | ||
| select { | ||
| case <-migrateCtx.Done(): | ||
| DBG("Migration loop cancelled by the main routine!") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case - has the operation been aborted due to an error? Can, or should, the user do anything to restart the migration?
| return | ||
| } else if aideResult == 17 { | ||
| // This is an AIDE config line error. | ||
| newErr := fmt.Sprintf("Detected configuration error during the migration check for AIDE 0.18: %s, output: %s ", common.GetAideErrorMessage(aideResult), output) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool - this error moves us in the right direction I think by giving users the information they need to update AIDE configs. Curious to see how this renders in the logs with an actual AIDE error.
| ReinitDaemonSetPrefix = "aide-ini" | ||
|
|
||
| // AideConfigMigrationIgnoreAnnotationKey tells us to ignore the deprecated config options | ||
| AideConfigMigrationIgnoreAnnotationKey = "file-integrity.openshift.io/migration-ignore-deprecation" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: For additional clarity, we're not ignoring potential FIO deprecations, but AIDE deprecations. We could update that in the annotation to be more explicit:
AideConfigMigrationIgnoreAnnotationKey = "file-integrity.openshift.io/migration-ignore-aide-deprecation"
| func (r *ReconcileConfigMap) handleMigrationCheckLog(cm *corev1.ConfigMap, logger logr.Logger) (reconcile.Result, error) { | ||
| owner, err := common.GetConfigMapOwnerName(cm) | ||
| if err != nil { | ||
| logger.Error(err, "Malformed ConfigMap: Could not get owner. Cannot retry.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, we need the user to go adjust the ConfigMap manually, right? Can we include the error in the log output, or does it become unruly?
| delete(annotation, common.IntegrityLogErrorAnnotationKey) | ||
| } | ||
|
|
||
| delete(annotation, common.IntegrityMigrationUpdateAnnotationKey) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We delete this annotation because the migration has been successful?
|
|
||
| RUN make build | ||
|
|
||
| RUN make aide-0.18 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another alternative we talked about earlier was to build this using a separate ubi8 base image container and then copying it over, too.
Or, at least in this case, we could just copy it out of the fedora 40 image and see how that works.
| # install operator binary | ||
| COPY --from=builder /go/src/github.com/openshift/file-integrity-operator/build/bin/manager ${OPERATOR} | ||
| COPY build/bin /usr/local/bin | ||
| COPY --from=builder /go/src/github.com/openshift/file-integrity-operator/build/bin/aide-0.18 /usr/sbin/aide-0.18 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then we don't need to pollute the builder container image with utilities and code unrelated to golang, or building FIO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'll also need to update the default AIDE configuration.
The way I understand the current logic is that FIO will continue to lay down AIDE configurations that are not compatible with 0.18.0. I think we'll want to change that as soon as possible, so long as the default AIDE configuration for 0.18.0 we want to use works on 0.16.0. That way we don't have the maintain the migration logic any longer than we absolutely have to.
| return | ||
| } | ||
|
|
||
| // Append operatorPods to pods |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: unrelated change?
| return nil | ||
| } | ||
|
|
||
| func waitForFIMigrationErrAnnotation(t *testing.T, f *framework.Framework, namespace, name, expectedMessage string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we could use this as an assertion and not have to deal with checking the error output in the caller to fail the test, making it more of an assertion.
func assertAIDEMigrationEmitsAnAlert(...)And then a similar one below:
func assertAIDEMigrationIsSuccessful(...)But, I acknowledge this would be potentially different from existing conventions and we could do that in a separate PR if we decide to make that change across the entire functional test suite.
|
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle stale |
|
PR needs rebase. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Stale issues rot after 30d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle rotten |
|
Rotten issues close after 30d of inactivity. Reopen the issue by commenting /close |
|
@openshift-bot: Closed this PR. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
This PR adds the ability to run aide 0.18 config checks and migration.
The aide0.18 will be compiled as an additional binary and used by FIO to check our migrated config. This new binary will reside in the container image, and we will launch a new pod to check the aide config migration every time we need to re-init.
Add config check to existing, upgrade case, check if migration checks needed annotation, if so we launch the pod, pod checks for aide migration and issue warning return result as a config map, the config map controller pick up the change, remove migration checks needed annotation.
We will try to perform the migration if a user-defined config is detected. We will issue warning messages in the log, and have a failed annotation key in the FIO instance if we cannot pre-migrate the config.
Annotation keys added:
example of event message:
AIDE Configuration Changes
Removed Features in AIDE v0.17
ignore_list
report_attributes
verbose (type: number, range: 0 - 255, default: 5)
log_levelandreport_leveloptions instead.New Features in AIDE v0.17
--log-levelor-Lcommand line option overwrites this option.New Features in AIDE v0.16
report_ignore_added_attrs (type: attribute expression, default: empty)
report_ignore_removed_attrs (type: attribute expression, default: empty)
report_ignore_e2fsattrs (type: string, default: 0)
report_force_attrs (type: attribute expression, default: empty)
Deprecated Features in AIDE v0.18 (to be removed in AIDE v0.20)
@@ifdef VARIABLE
@@if defined VARIABLE.@@ifndef VARIABLE
@@if not defined VARIABLE.@@ifhost HOSTNAME
@@if hostname HOSTNAME.@@ifnhost HOSTNAME
@@if not hostname HOSTNAME.Special attributes
growing+sattributes instead.Removed Features in AIDE v0.19