-
Notifications
You must be signed in to change notification settings - Fork 392
fix: mount retry and logging #3157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes the mount_into_container functionality by adding a retry loop for specific mount errors and updating the test function name to match the new behavior.
- Implements a retry mechanism for mount failures due to EINVAL or EBUSY errors.
- Renames the test function from test_mount_to_container to test_mount_into_container.
Comments suppressed due to low confidence (1)
crates/libcontainer/src/rootfs/mount.rs:567
- Consider adding a logging statement when the mount fails after exhausting retries to provide clearer context on the final error.
return Err(err.into());
utam0k
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May I ask you to consider the unit test?
| let max_retries: u32 = 3; | ||
| let mut retries = 0; | ||
| loop { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't prefer using the infinity loop. Please use for instead of that, at least.
| && retries < max_retries | ||
| { | ||
| retries += 1; | ||
| std::thread::sleep(std::time::Duration::from_millis(100)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is 100ms reasonable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
runc seems to set it to 100ms when doing the same thing.
https://github.com/opencontainers/runc/blob/8d90e3dba696ac787ee64de4445517ddf1063b04/libcontainer/rootfs_linux.go#L1232
But I don't know if this is reasonable 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment regarding where 100ms came from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For unit tests,we don't want to wait 100ms, so use cfg.
9d350fb to
d528cfc
Compare
|
|
||
| if let Err(err) = | ||
| self.syscall | ||
| let max_attempts: u32 = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please Make this a const variable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes an issue with mount handling by adding retries for mounting and updating the associated test function name to reflect a recent rename.
- Introduces a constant for maximum mount attempts and wraps the mount syscall in a retry loop.
- Updates tests to simulate transient mount failures and verify successful retries or eventual failure.
Comments suppressed due to low confidence (1)
crates/libcontainer/src/rootfs/mount.rs:559
- Clarify the rationale for treating EINVAL as a transient error eligible for retry, since EINVAL typically indicates an invalid argument. Consider adding a comment to explain why retrying on EINVAL is safe in this context.
if (matches!(errno, Errno::EINVAL) || matches!(errno, Errno::EBUSY)) && attempt < MAX_MOUNT_ATTEMPTS - 1
utam0k
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use use as appropriate.
|
|
||
| if let Err(err) = | ||
| self.syscall | ||
| for attempt in 0..MAX_MOUNT_ATTEMPTS { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic of retry is not of interest here. Why not cut this out for utili and others?
fn retry<F, T, E, P>(
mut op: F,
attempts: usize,
delay: Duration,
policy: P,
) -> Result<T, E>
where
F: FnMut() -> Result<T, E>,
P: Fn(&E) -> bool,
{
fn go<F, T, E, P>( // Maybe “for” is easier to understand.
mut op: F,
remaining: usize,
delay: Duration,
policy: &P,
) -> Result<T, E>
where
F: FnMut() -> Result<T, E>,
P: Fn(&E) -> bool,
{
op().or_else(|err| {
if remaining > 1 && policy(&err) {
std::thread::sleep(delay);
go(op, remaining - 1, delay, policy)
} else {
Err(err)
}
})
}
go(&mut op, attempts, delay, &policy)
}1510251 to
1765e24
Compare
|
@z63d CI are failing, please take a look |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes mount retry and logging issues for container mounts by introducing a generic retry function and refining the mount error handling logic.
- Introduces a generic retry function in utils.rs to encapsulate retry logic.
- Updates mount error handling in mount.rs to differentiate between EINVAL and EBUSY errors using retry.
- Renames the test function to match the updated function name.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| crates/libcontainer/src/utils.rs | Added a generic retry function with delay and policy. |
| crates/libcontainer/src/rootfs/mount.rs | Updated mount error handling with retry logic and renamed test function. |
Comments suppressed due to low confidence (1)
crates/libcontainer/src/rootfs/mount.rs:558
- [nitpick] The use of 'mount_option_config.data' here differs from the use of 'Some(&*d)' in the retry branch. If both represent the same mount data, consider unifying the usage for consistency.
self.syscall.mount(
Some(&*src),
dest,
typ,
mount_option_config.flags,
Some(&mount_option_config.data),
|
@z63d It seems the format error occurred. |
4681d5f to
560853b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This pull request addresses the mount handling issues by introducing a generic retry function and updating the error handling logic for mount operations. Key changes include:
- Adding a retry function in utils.rs to allow configurable retry logic.
- Modifying mount_into_container in mount.rs to handle EINVAL and EBUSY errors differently with retry support.
- Renaming the test function to align with the updated function name.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| crates/libcontainer/src/utils.rs | Added a generic retry function with delay and policy support. |
| crates/libcontainer/src/rootfs/mount.rs | Updated mount error handling using the new retry function and adjusted test names. |
| }; | ||
| // runc has a retry interval of 100ms. We are following this. | ||
| // https://github.com/opencontainers/runc/blob/v1.3.0/libcontainer/rootfs_linux.go#L1235 | ||
| let delay = Duration::from_millis(100); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Please make 100ms a constant.
utam0k
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One nit, otherwise looks good.
560853b to
5f07e0a
Compare
crates/libcontainer/src/utils.rs
Outdated
| pub fn retry<F, T, E, P>( | ||
| mut op: F, | ||
| attempts: u32, | ||
| #[cfg_attr(test, allow(unused_variables))] delay: Duration, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, I think it should be ok to keep the delay in tests as well. with 100 ms delay and 2 attempts after delay, each test would take at most 200 ms ; even if there are 10 tests we might have a delay of 2 sec total, I think that is ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is certainly true.
I'd like to hear @utam0k's opinion.
#3157 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I didn't see that comment. @utam0k I personally feel that waiting 100 ms should be ok. If not preferable, I'd suggest instead of changing fn signature use cfg to re-define the const to diff value, i.e.
#[cfg(not(test))]
const MOUNT_RETRY_DELAY_MS: u64 = 100;
#[cfg(test)]
const MOUNT_RETRY_DELAY_MS: u64 = 1;
I think this would be cleaner than using cfg in fn signature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern is that if this function is used in many places, the waiting time will increase. However, since this is not the case now, it is acceptable to just leave the comment and make it const.
5f07e0a to
251a1c4
Compare
Signed-off-by: Kaita Nakamura <[email protected]>
Signed-off-by: Kaita Nakamura <[email protected]>
251a1c4 to
216f50a
Compare
Description
Fixes mount_into_container handling.
Also, since the function name seems to have changed in the #378, I adjusted the test function name to match.
Type of Change
Testing
Related Issues
#3132
Additional Context