Skip to content

fix(tools): harden apple container bootstrap execs#405

Merged
penso merged 2 commits intomainfrom
issue-159
Mar 12, 2026
Merged

fix(tools): harden apple container bootstrap execs#405
penso merged 2 commits intomainfrom
issue-159

Conversation

@penso
Copy link
Copy Markdown
Collaborator

@penso penso commented Mar 11, 2026

Fixes #159.

Summary

  • harden Apple Container exec command construction so provisioning also runs from /tmp and re-creates /home/sandbox before shell commands run
  • centralize container exec shell argument building so Apple Container and Docker paths stay aligned without duplicating the old unsafe container exec shape
  • add regression coverage for Apple Container bootstrap/run/exec arguments and make test_create_sandbox_off deterministic so it does not depend on a real backend being healthy on macOS hosts

Validation

Completed

  • cargo test -p moltis-tools test_create_sandbox_off -- --nocapture
  • cargo test -p moltis-tools apple_container -- --nocapture
  • cargo test -p moltis-tools container_exec_shell_args -- --nocapture
  • just format
  • cargo +nightly-2025-11-30 clippy -Z unstable-options --workspace --all-targets --timings -- -D warnings

Remaining

  • Apple Container runtime smoke test on macOS with persisted /home/sandbox
  • just lint on a machine with CUDA / nvcc available for the workspace --all-features path

Manual QA

  • Not run locally beyond unit tests and linting in this worktree.
  • Recommended follow-up: start Moltis with Apple Container, trigger sandbox execution, and verify both fresh and persisted-home sessions succeed.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 11, 2026

Greptile Summary

This PR hardens Apple Container bootstrap and exec commands by ensuring /home/sandbox is (re)created before every shell command runs inside the container. It also fixes a pre-existing bug where provision_packages always used the Docker-style exec shape even when the container CLI was container, now resolved by the new container_exec_shell_args dispatcher. An apple_container_wrap_shell_command helper centralises the mkdir -p prefix so it is consistent across bootstrap, exec, and provisioning paths.

Key changes:

  • apple_container_wrap_shell_command – new helper that prepends mkdir -p /home/sandbox && to any command destined for Apple Container.
  • container_exec_shell_args – new dispatcher that routes provision_packages (and any future shared callers) to either the Apple Container exec shape (with --workdir /tmp and the home-dir guard) or the standard Docker/Podman shape, selected by the cli string.
  • #[cfg(any(target_os = "macos", test))] – all Apple Container constants and helpers are now also compiled in test builds on Linux, allowing the regression tests to run in CI without a macOS runner.
  • Regression tests added for apple_container_exec_args, apple_container_bootstrap_command, apple_container_run_args, and both dispatch paths of container_exec_shell_args.

Concern:
The readiness probe (probe_container_exec_ready) now runs mkdir -p /home/sandbox && true on every call (up to 20 times during startup) because apple_container_exec_args unconditionally wraps all commands. A transient filesystem state during early container boot could cause mkdir -p to fail even when exec itself is available, leading to spurious "not exec-ready" errors. See inline comment for details.

Confidence Score: 4/5

  • This PR is safe to merge with one minor concern around the readiness probe gaining an unintended side-effect.
  • The fix is targeted, well-reasoned, and accompanied by new cross-platform regression tests. The core logic change (routing provision_packages through container_exec_shell_args to get the --workdir /tmp flag on Apple Container) directly addresses the reported bug. The one concern is that probe_container_exec_ready now runs mkdir -p /home/sandbox && true instead of a pure true, which could cause spurious probe failures during the earliest milliseconds of container startup if the filesystem isn't fully ready. This is not a regression in normal operation but could surface as flaky container-start failures under load or on slow machines.
  • Pay close attention to crates/tools/src/sandbox.rs around probe_container_exec_ready (line 4150) and the new container_exec_shell_args dispatcher (line 1102).

Important Files Changed

Filename Overview
crates/tools/src/sandbox.rs Introduces apple_container_wrap_shell_command helper, centralises exec-arg building into container_exec_shell_args, fixes provision_packages to use Apple Container–specific args, and broadens #[cfg] gates to any(target_os = "macos", test) so regression tests run cross-platform. Changes are well-targeted with matching new tests; one minor concern around the apt-get install exit-code masking (pre-existing) and the readiness probe now carrying a mkdir -p side-effect.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[provision_packages / exec call] --> B{container_exec_shell_args\ncli, container_name, shell_command}
    B -- "cfg(macos|test)\ncli == 'container'" --> C[apple_container_exec_args]
    B -- "otherwise\ne.g. docker / podman" --> D[Docker-style exec vec\nexec, name, sh, -c, cmd]
    C --> E[apple_container_wrap_shell_command\nmkdir -p /home/sandbox && cmd]
    E --> F[Return:\nexec --workdir /tmp name sh -c\nmkdir -p /home/sandbox && cmd]
    D --> G[Return:\nexec name sh -c cmd]

    H[apple_container_exec / probes] --> C
    I[apple_container_bootstrap_command] --> E
Loading

Comments Outside Diff (1)

  1. crates/tools/src/sandbox.rs, line 4150 (link)

    Readiness probe gains a mkdir -p side-effect

    Because apple_container_exec_args now always prepends mkdir -p /home/sandbox && via apple_container_wrap_shell_command, the readiness probe here is no longer a lightweight true; it runs mkdir -p /home/sandbox && true. This is called in a tight loop up to 20 × 100 ms.

    Two potential consequences:

    1. If, during the container's early boot window, the container filesystem is still in a transient state where mkdir -p fails (e.g., the overlay isn't mounted yet, or a permissions quirk in the image), the probe reports failure even though exec itself is available. This could cause spurious "did not become exec-ready" errors and needless container removes/retries.
    2. The diagnostic intent of the probe is obscured — a failure here now blends "exec not available" with "cannot create home dir", making debugging harder.

    Consider using a dedicated, side-effect-free probe (e.g., calling apple_container_exec_args with the raw "true" string before wrapping, or adding a separate apple_container_probe_args helper) so that readiness failures are unambiguously about exec availability:

    async fn probe_container_exec_ready(name: &str) -> Result<()> {
        // Use a raw exec probe without the /home/sandbox mkdir side-effect so
        // failures here are unambiguously about exec availability, not filesystem state.
        let args = vec![
            "exec".to_string(),
            "--workdir".to_string(),
            APPLE_CONTAINER_SAFE_WORKDIR.to_string(),
            name.to_string(),
            "sh".to_string(),
            "-c".to_string(),
            "true".to_string(),
        ];
        ...
    }

Last reviewed commit: 5c5f498

@codspeed-hq
Copy link
Copy Markdown
Contributor

codspeed-hq bot commented Mar 11, 2026

Merging this PR will not alter performance

✅ 39 untouched benchmarks
⏩ 5 skipped benchmarks1


Comparing issue-159 (e41e748) with main (742bce9)

Open in CodSpeed

Footnotes

  1. 5 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5c5f498e06

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 6801 to 6802
#[test]
fn test_should_restart_after_readiness_error() {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Re-add macOS gating for restart-readiness test

This test now executes on every target, but it calls should_restart_after_readiness_error and ContainerState, which remain behind #[cfg(target_os = "macos")] in this file, so non-macOS test runs fail at compile time before any tests execute (I reproduced this with cargo test -p moltis-tools container_exec_shell_args -- --nocapture). Please gate this test with the same macOS cfg (or make the referenced symbols available under test) so Linux CI can compile the test target.

Useful? React with 👍 / 👎.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@penso penso merged commit fa66941 into main Mar 12, 2026
50 of 54 checks passed
@penso penso deleted the issue-159 branch March 12, 2026 22:39
penso added a commit that referenced this pull request Mar 23, 2026
* fix(tools): harden apple container bootstrap execs

* test(tools): make create_sandbox_off deterministic
jmikedupont2 pushed a commit to meta-introspector/moltis that referenced this pull request Mar 23, 2026
* fix(tools): harden apple container bootstrap execs

* test(tools): make create_sandbox_off deterministic
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Apple Container sandbox fails with WORKDIR chdir error

1 participant