Skip to content

feat: implement zsh shell tool via shell-escalation#12584

Closed
bolinfest wants to merge 1 commit intomainfrom
pr12584
Closed

feat: implement zsh shell tool via shell-escalation#12584
bolinfest wants to merge 1 commit intomainfrom
pr12584

Conversation

@bolinfest
Copy link
Copy Markdown
Collaborator

@bolinfest bolinfest commented Feb 23, 2026

Why

The shell tool needed a dedicated Unix-only zsh path that can perform privileged escalation only when explicitly enabled, instead of relying on the legacy zsh bridge path.

What changed

  • Added a ShellToolRuntime path that activates only when all three conditions are met: Unix platform, feature: shell_zsh_fork, and configured zsh_path.
  • Added run_escalate_server()-based execution via codex-shell-escalation with an EscalationPolicyFactory suitable for shell tool semantics.
  • Wired arg0 execution mode for codex-execve-wrapper and integrated the new server/client flow for shell command execution.
  • Removed legacy core/src/zsh_exec_bridge originally introduced in feat(core): zsh exec bridge #12052 to complete the migration away from the old implementation.

Verification

  • Confirmed stack behavior at this commit with just clippy and ensured this commit remains clean with the shell tool integration enabled.

Stack created with Sapling. Best reviewed with ReviewStack.

Copy link
Copy Markdown
Contributor

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7980a8f767

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +450 to +454
let (_, args) = env
.command
.split_first()
.ok_or_else(|| ToolError::Rejected("command args are empty".to_string()))?;
let script = shlex_try_join(args.iter().map(String::as_str))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Pass only script text to escalate server

In the zsh-fork path, env.command already includes the shell invocation (and may include sandbox wrapper args), but this code serializes all trailing argv into script and then run_escalate_server wraps it again as <zsh> -c <script>. For a normal command like ['/bin/zsh','-lc','echo hi'], this produces a script beginning with -lc, so zsh tries to execute -lc as a command and the shell tool fails. This breaks shell execution whenever ShellZshFork is enabled; only the original script payload should be forwarded here.

Useful? React with 👍 / 👎.

@bolinfest bolinfest force-pushed the pr12584 branch 3 times, most recently from ad1e71d to 4e1b91d Compare February 23, 2026 17:09
bolinfest added a commit that referenced this pull request Feb 23, 2026
…lation (#12556)

## Why
Shell execution refactoring in `exec-server` had become split between
duplicated code paths, which blocked a clean introduction of the new
reusable shell escalation flow. This commit creates a dedicated
foundation crate so later shell tooling changes can share one
implementation.

## What changed
- Added the `codex-shell-escalation` crate and moved the core escalation
pieces (`mcp` protocol/socket/session flow, policy glue) that were
previously in `exec-server` into it.
- Normalized `exec-server` Unix structure under a dedicated `unix`
module layout and kept non-Unix builds narrow.
- Wired crate/build metadata so `shell-escalation` is a first-class
workspace dependency for follow-on integration work.

## Verification
- Built and linted the stack at this commit point with `just clippy`.

[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/12556).
* #12584
* #12583
* __->__ #12556
@bolinfest bolinfest force-pushed the pr12584 branch 2 times, most recently from 627d9f1 to f88d0ae Compare February 23, 2026 18:02
@bolinfest bolinfest force-pushed the pr12584 branch 2 times, most recently from a8f0059 to 80d71b5 Compare February 23, 2026 18:12
bolinfest added a commit that referenced this pull request Feb 23, 2026
## Why
Tool handlers and runtimes needed to pass the same turn/session context
for shell and non-shell workflows without duplicative ownership churn.
Using shared pointers avoids temporary lifetimes and keeps existing
behavior unchanged while simplifying call sites.

## What changed
- Converted `ToolCtx` to store shared context handles (`Arc`-based),
including updates across shell, apply-patch, and unified-exec paths.
- Updated orchestrator/runtime call sites to consume the shared context
consistently and remove brittle move/borrow patterns.
- Kept behavior unchanged while preparing the type surface for the new
shell escalation integration in the next stack commit.

## Verification
- Validated this commit stack point with `just clippy` and confirmed
workspace compiles cleanly in this stack state.

[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/12583).
* #12584
* __->__ #12583
* #12556
Base automatically changed from pr12583 to main February 23, 2026 18:29
@bolinfest bolinfest force-pushed the pr12584 branch 3 times, most recently from a354fe4 to 2321db1 Compare February 23, 2026 19:37
]),
&zsh_path,
)?;
write_models_cache(&codex_home)?;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

little surprised this is required.

Comment on lines +155 to 159
let read = match guard.try_io(|inner| inner.get_ref().recv(&mut payload[filled..])) {
Ok(Ok(read)) => read,
Ok(Err(err)) => return Err(err),
Err(_would_block) => continue,
};
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uber nit:

let read = match guard.try_io(|inner| inner.get_ref().recv(&mut payload[filled..])) {
Ok(result) => result?,
Err(_would_block) => continue,
};

))
}

fn extract_shell_script(command: &[String]) -> Result<String, ToolError> {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reuse extract_bash_command ?

.await
.map_err(|err| ToolError::Rejected(err.to_string()))?;

map_exec_result(attempt.sandbox, exec_result).map(Some)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove option from the return type?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, why remap? ExecToolCallOutput is pretty easy to construct.

.chain(argv.iter().cloned())
.collect::<Vec<_>>();
let (commands, used_complex_parsing) =
if let Some(commands) = parse_shell_lc_plain_commands(&command) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

surprised we are still re-parsing commands

}
}
}
Decision::Allow => EscalateAction::Run,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this escalate?

Comment on lines +186 to +191
let evaluation = policy.check_multiple(commands.iter(), &fallback);
let decision_driven_by_policy =
Self::decision_driven_by_policy(&evaluation.matched_rules, evaluation.decision);
let needs_escalation =
self.sandbox_permissions.requires_escalated_permissions() || decision_driven_by_policy;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this may be unrelated to the change but the combination of check_multiple and decision_driven_by_policy is confusing.

Like the fact that Decision::Prompt + decision_driven_by_policy == no prompt, just escalate

}
}

fn shell_execve_wrapper() -> anyhow::Result<PathBuf> {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uber nit: anyhow results in core not ideal.

req.sandbox_permissions,
req.justification.clone(),
)?;
let _sandbox_exec_request = attempt
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why discard, aren't we loosing some env vars?

struct McpShellCommandExecutor;

#[async_trait::async_trait]
impl ShellCommandExecutor for McpShellCommandExecutor {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we still keeping the mcp around?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants