Skip to content

Commit f4371f9

Browse files
Copilotnjzjz
andauthored
feat: add proxy_command parameter for SSH jump host configuration (#530)
This PR adds support for SSH jump host connections using the standard OpenSSH `proxy_command` parameter. Users can now specify jump hosts using familiar OpenSSH ProxyCommand syntax directly. ## Problem Previously, there was no built-in support for SSH jump host connections in dpdispatcher, making it difficult to connect to servers behind bastion hosts or through intermediate servers. ## Solution Added a single `proxy_command` parameter that accepts standard OpenSSH ProxyCommand syntax: ```json { "context_type": "SSHContext", "remote_profile": { "hostname": "internal-server.company.com", "username": "user", "key_filename": "~/.ssh/id_rsa", "proxy_command": "ssh -W %h:%p -i ~/.ssh/jump_key [email protected]" } } ``` ## Technical Implementation - **Single parameter approach**: Uses only `proxy_command` for maximum simplicity - **Standard OpenSSH syntax**: Supports any valid OpenSSH ProxyCommand - **Full rsync support**: Works with both direct SSH and rsync file transfers using proper shell escaping - **Comprehensive testing**: Real SSH environment tests with jump host infrastructure ## Testing Infrastructure Enhanced the CI/CD testing setup with a three-machine architecture: - **test**: Test runner machine with `DPDISPATCHER_TEST=ssh` - **jumphost**: Intermediate SSH server acting as bastion host - **server**: Final destination server Tests validate the complete jump host workflow: `test → jumphost → server` using actual SSH connections and rsync operations. The CI infrastructure automatically installs rsync and configures proper SSH key exchange between all containers. ## Bug Fixes Fixed a critical issue in the rsync function where command construction was causing rsync to display help output instead of executing commands. The problem was resolved by properly converting command lists to escaped strings when using `shell=True` with subprocess. ## Benefits - **Familiar syntax**: Uses standard OpenSSH ProxyCommand format - **Maximum flexibility**: Supports complex proxy chains and custom options - **Streamlined API**: Single parameter instead of multiple configuration options - **Real-world tested**: Comprehensive test suite validates actual SSH infrastructure - **CI-ready**: Robust test infrastructure that works reliably in containerized environments - **Reliable rsync**: Fixed command construction ensures rsync operations work correctly Fixes #517. <!-- START COPILOT CODING AGENT TIPS --> --- ✨ Let Copilot coding agent [set things up for you](https://github.com/deepmodeling/dpdispatcher/issues/new?title=✨+Set+up+Copilot+instructions&body=Configure%20instructions%20for%20this%20repository%20as%20documented%20in%20%5BBest%20practices%20for%20Copilot%20coding%20agent%20in%20your%20repository%5D%28https://gh.io/copilot-coding-agent-tips%29%2E%0A%0A%3COnboard%20this%20repo%3E&assignees=copilot) — coding agent works faster and does higher quality work when set up for your repo. --------- Co-authored-by: copilot-swe-agent[bot] <[email protected]> Co-authored-by: njzjz <[email protected]>
1 parent 1faaa13 commit f4371f9

File tree

10 files changed

+339
-5
lines changed

10 files changed

+339
-5
lines changed

.gitignore

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -46,7 +46,6 @@ dbconfig.json
4646
*.egg
4747
*.egg-info
4848
venv/*
49-
.venv/*
5049
node_modules/
5150
# Test artifacts
5251
*_flag_if_job_task_fail

ci/ssh/docker-compose.yml

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -12,6 +12,17 @@ services:
1212
- "22"
1313
volumes:
1414
- ssh_config:/root/.ssh
15+
jumphost:
16+
image: takeyamajp/ubuntu-sshd:ubuntu22.04
17+
build: .
18+
container_name: jumphost
19+
hostname: jumphost
20+
environment:
21+
ROOT_PASSWORD: dpdispatcher
22+
expose:
23+
- "22"
24+
volumes:
25+
- ssh_config:/root/.ssh
1526
test:
1627
image: python:3.10
1728
tty: true
@@ -25,6 +36,7 @@ services:
2536
- ../..:/dpdispatcher
2637
depends_on:
2738
- server
39+
- jumphost
2840

2941
volumes:
3042
ssh_config:

ci/ssh/start-ssh.sh

Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,5 +2,27 @@
22

33
docker compose up -d --no-build
44

5+
# Set up SSH keys on server
56
docker exec server /bin/bash -c "ssh-keygen -b 2048 -t rsa -f /root/.ssh/id_rsa -q -N \"\" && cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys && chmod 600 /root/.ssh/authorized_keys"
67
docker exec server /bin/bash -c "mkdir -p /dpdispatcher_working"
8+
docker exec server /bin/bash -c "mkdir -p /tmp/rsync_test"
9+
10+
# Set up SSH keys on jumphost and configure it to access server
11+
docker exec jumphost /bin/bash -c "ssh-keygen -b 2048 -t rsa -f /root/.ssh/id_rsa -q -N \"\" && cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys && chmod 600 /root/.ssh/authorized_keys"
12+
13+
# Copy keys between containers to enable jump host functionality
14+
# Get the public key from jumphost and add it to server's authorized_keys
15+
docker exec jumphost /bin/bash -c "cat /root/.ssh/id_rsa.pub" | docker exec -i server /bin/bash -c "cat >> /root/.ssh/authorized_keys"
16+
17+
# Get the public key from test (which shares volume with server) and add it to jumphost authorized_keys
18+
docker exec test /bin/bash -c "cat /root/.ssh/id_rsa.pub" | docker exec -i jumphost /bin/bash -c "cat >> /root/.ssh/authorized_keys"
19+
20+
# Configure SSH client settings for known hosts to avoid host key verification
21+
docker exec test /bin/bash -c "echo 'StrictHostKeyChecking no' >> /root/.ssh/config && echo 'UserKnownHostsFile /dev/null' >> /root/.ssh/config"
22+
docker exec jumphost /bin/bash -c "echo 'StrictHostKeyChecking no' >> /root/.ssh/config && echo 'UserKnownHostsFile /dev/null' >> /root/.ssh/config"
23+
docker exec server /bin/bash -c "echo 'StrictHostKeyChecking no' >> /root/.ssh/config && echo 'UserKnownHostsFile /dev/null' >> /root/.ssh/config"
24+
25+
# Install rsync on all containers
26+
docker exec test /bin/bash -c "apt-get -y update && apt-get -y install rsync"
27+
docker exec jumphost /bin/bash -c "apt-get -y update && apt-get -y install rsync"
28+
docker exec server /bin/bash -c "apt-get -y update && apt-get -y install rsync"

doc/context.md

Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,28 @@ It's suggested to generate [SSH keys](https://help.ubuntu.com/community/SSH/Open
3333

3434
Note that `SSH` context is [non-login](https://www.gnu.org/software/bash/manual/html_node/Bash-Startup-Files.html), so `bash_profile` files will not be executed outside the submission script.
3535

36+
### SSH Jump Host (Bastion Server)
37+
38+
For connecting to internal servers through a jump host (bastion server), SSH context supports jump host configuration. This allows connecting to internal servers that are not directly accessible from the internet.
39+
40+
Specify the ProxyCommand directly using {dargs:argument}`proxy_command <machine[SSHContext]/remote_profile/proxy_command>`:
41+
42+
```json
43+
{
44+
"context_type": "SSHContext",
45+
"remote_profile": {
46+
"hostname": "internal-server.company.com",
47+
"username": "user",
48+
"key_filename": "/path/to/internal_key",
49+
"proxy_command": "ssh -W %h:%p -i /path/to/jump_key [email protected]"
50+
}
51+
}
52+
```
53+
54+
The proxy command uses OpenSSH ProxyCommand syntax. `%h` and `%p` are replaced with the target hostname and port.
55+
56+
This configuration establishes the connection path: Local → Jump Host → Target Server.
57+
3658
## Bohrium
3759

3860
{dargs:argument}`context_type <machine/context_type>`: `Bohrium`

dpdispatcher/contexts/ssh_context.py

Lines changed: 32 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -45,6 +45,7 @@ def __init__(
4545
tar_compress=True,
4646
look_for_keys=True,
4747
execute_command=None,
48+
proxy_command=None,
4849
):
4950
self.hostname = hostname
5051
self.username = username
@@ -58,6 +59,7 @@ def __init__(
5859
self.tar_compress = tar_compress
5960
self.look_for_keys = look_for_keys
6061
self.execute_command = execute_command
62+
self.proxy_command = proxy_command
6163
self._keyboard_interactive_auth = False
6264
self._setup_ssh()
6365

@@ -142,7 +144,12 @@ def _setup_ssh(self):
142144
# transport = self.ssh.get_transport()
143145
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
144146
sock.settimeout(self.timeout)
145-
sock.connect((self.hostname, self.port))
147+
148+
# Use ProxyCommand if configured (either directly or via jump host parameters)
149+
if self.proxy_command is not None:
150+
sock = paramiko.ProxyCommand(self.proxy_command)
151+
else:
152+
sock.connect((self.hostname, self.port))
146153

147154
# Make a Paramiko Transport object using the socket
148155
ts = paramiko.Transport(sock)
@@ -340,6 +347,9 @@ def arginfo():
340347
"enable searching for discoverable private key files in ~/.ssh/"
341348
)
342349
doc_execute_command = "execute command after ssh connection is established."
350+
doc_proxy_command = (
351+
"ProxyCommand to use for SSH connection through intermediate servers."
352+
)
343353
ssh_remote_profile_args = [
344354
Argument("hostname", str, optional=False, doc=doc_hostname),
345355
Argument("username", str, optional=False, doc=doc_username),
@@ -388,6 +398,13 @@ def arginfo():
388398
default=None,
389399
doc=doc_execute_command,
390400
),
401+
Argument(
402+
"proxy_command",
403+
[str, type(None)],
404+
optional=True,
405+
default=None,
406+
doc=doc_proxy_command,
407+
),
391408
]
392409
ssh_remote_profile_format = Argument(
393410
"ssh_session", dict, ssh_remote_profile_args
@@ -396,23 +413,37 @@ def arginfo():
396413

397414
def put(self, from_f, to_f):
398415
if self.rsync_available:
416+
# For rsync, we need to use %h:%p placeholders for target host/port
417+
proxy_cmd_rsync = None
418+
if self.proxy_command is not None:
419+
proxy_cmd_rsync = self.proxy_command.replace(
420+
f"{self.hostname}:{self.port}", "%h:%p"
421+
)
399422
return rsync(
400423
from_f,
401424
self.remote + ":" + to_f,
402425
port=self.port,
403426
key_filename=self.key_filename,
404427
timeout=self.timeout,
428+
proxy_command=proxy_cmd_rsync,
405429
)
406430
return self.sftp.put(from_f, to_f)
407431

408432
def get(self, from_f, to_f):
409433
if self.rsync_available:
434+
# For rsync, we need to use %h:%p placeholders for target host/port
435+
proxy_cmd_rsync = None
436+
if self.proxy_command is not None:
437+
proxy_cmd_rsync = self.proxy_command.replace(
438+
f"{self.hostname}:{self.port}", "%h:%p"
439+
)
410440
return rsync(
411441
self.remote + ":" + from_f,
412442
to_f,
413443
port=self.port,
414444
key_filename=self.key_filename,
415445
timeout=self.timeout,
446+
proxy_command=proxy_cmd_rsync,
416447
)
417448
return self.sftp.get(from_f, to_f)
418449

dpdispatcher/utils/utils.py

Lines changed: 17 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,7 @@
22
import hashlib
33
import hmac
44
import os
5+
import shlex
56
import struct
67
import subprocess
78
import time
@@ -89,6 +90,7 @@ def rsync(
8990
port: int = 22,
9091
key_filename: Optional[str] = None,
9192
timeout: Union[int, float] = 10,
93+
proxy_command: Optional[str] = None,
9294
):
9395
"""Call rsync to transfer files.
9496
@@ -104,6 +106,8 @@ def rsync(
104106
identity file name
105107
timeout : int, default=10
106108
timeout for ssh
109+
proxy_command : str, optional
110+
ProxyCommand to use for SSH connection
107111
108112
Raises
109113
------
@@ -124,20 +128,30 @@ def rsync(
124128
]
125129
if key_filename is not None:
126130
ssh_cmd.extend(["-i", key_filename])
131+
132+
# Use proxy_command if provided
133+
if proxy_command is not None:
134+
ssh_cmd.extend(["-o", f"ProxyCommand={proxy_command}"])
135+
136+
# Properly escape the SSH command for rsync's -e option
137+
ssh_cmd_str = " ".join(shlex.quote(part) for part in ssh_cmd)
138+
127139
cmd = [
128140
"rsync",
129141
# -a: archieve
130142
# -z: compress
131143
"-az",
132144
"-e",
133-
" ".join(ssh_cmd),
145+
ssh_cmd_str,
134146
"-q",
135147
from_file,
136148
to_file,
137149
]
138-
ret, out, err = run_cmd_with_all_output(cmd, shell=False)
150+
# Convert to string for shell=True
151+
cmd_str = " ".join(shlex.quote(arg) for arg in cmd)
152+
ret, out, err = run_cmd_with_all_output(cmd_str, shell=True)
139153
if ret != 0:
140-
raise RuntimeError(f"Failed to run {cmd}: {err}")
154+
raise RuntimeError(f"Failed to run {cmd_str}: {err}")
141155

142156

143157
class RetrySignal(Exception):
Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
{
2+
"batch_type": "Shell",
3+
"context_type": "SSHContext",
4+
"local_root": "./",
5+
"remote_root": "/home/user/work",
6+
"remote_profile": {
7+
"hostname": "internal-server.company.com",
8+
"username": "user",
9+
"port": 22,
10+
"key_filename": "~/.ssh/id_rsa",
11+
"proxy_command": "ssh -W %h:%p -i ~/.ssh/jump_key [email protected]"
12+
}
13+
}

tests/test_examples.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -24,6 +24,7 @@
2424
(machine_args, p_examples / "machine" / "expanse.json"),
2525
(machine_args, p_examples / "machine" / "lazy_local.json"),
2626
(machine_args, p_examples / "machine" / "mandu.json"),
27+
(machine_args, p_examples / "machine" / "ssh_proxy_command.json"),
2728
(resources_args, p_examples / "resources" / "expanse_cpu.json"),
2829
(resources_args, p_examples / "resources" / "mandu.json"),
2930
(resources_args, p_examples / "resources" / "tiger.json"),

tests/test_rsync_proxy.py

Lines changed: 130 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,130 @@
1+
import os
2+
import shutil
3+
import sys
4+
import tempfile
5+
import unittest
6+
7+
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), "..")))
8+
__package__ = "tests"
9+
10+
from dpdispatcher.utils.utils import rsync
11+
12+
13+
@unittest.skipIf(
14+
os.environ.get("DPDISPATCHER_TEST") != "ssh", "outside the ssh testing environment"
15+
)
16+
class TestRsyncProxyCommand(unittest.TestCase):
17+
"""Test rsync function with proxy command support."""
18+
19+
def setUp(self):
20+
"""Set up test files for rsync operations."""
21+
# Check if rsync is available before running tests
22+
if shutil.which("rsync") is None:
23+
self.skipTest("rsync not available")
24+
25+
# Create temporary test files
26+
self.test_content = "test content for rsync"
27+
28+
# Local test file
29+
self.local_file = tempfile.NamedTemporaryFile(mode="w", delete=False)
30+
self.local_file.write(self.test_content)
31+
self.local_file.close()
32+
33+
# Remote paths for testing
34+
self.remote_test_dir = "/tmp/rsync_test"
35+
self.remote_file_direct = f"root@server:{self.remote_test_dir}/test_direct.txt"
36+
self.remote_file_proxy = f"root@server:{self.remote_test_dir}/test_proxy.txt"
37+
38+
def tearDown(self):
39+
"""Clean up test files."""
40+
# Remove local test file
41+
os.unlink(self.local_file.name)
42+
43+
def test_rsync_with_proxy_command(self):
44+
"""Test rsync with proxy command via jump host."""
45+
# Test rsync through jump host: test -> jumphost -> server
46+
rsync(
47+
self.local_file.name,
48+
self.remote_file_proxy,
49+
key_filename="/root/.ssh/id_rsa",
50+
proxy_command="ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i /root/.ssh/id_rsa -W server:22 root@jumphost",
51+
)
52+
53+
# Verify the file was transferred by reading it back
54+
with tempfile.NamedTemporaryFile(mode="w", delete=False) as download_file:
55+
download_path = download_file.name
56+
57+
rsync(
58+
self.remote_file_proxy,
59+
download_path,
60+
key_filename="/root/.ssh/id_rsa",
61+
proxy_command="ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i /root/.ssh/id_rsa -W server:22 root@jumphost",
62+
)
63+
64+
# Verify content matches
65+
with open(download_path) as f:
66+
content = f.read()
67+
self.assertEqual(content, self.test_content)
68+
69+
# Clean up
70+
os.unlink(download_path)
71+
72+
def test_rsync_direct_connection(self):
73+
"""Test rsync without proxy command (direct connection)."""
74+
# Test direct rsync: test -> server
75+
rsync(
76+
self.local_file.name,
77+
self.remote_file_direct,
78+
key_filename="/root/.ssh/id_rsa",
79+
)
80+
81+
# Verify the file was transferred by reading it back
82+
with tempfile.NamedTemporaryFile(mode="w", delete=False) as download_file:
83+
download_path = download_file.name
84+
85+
rsync(self.remote_file_direct, download_path, key_filename="/root/.ssh/id_rsa")
86+
87+
# Verify content matches
88+
with open(download_path) as f:
89+
content = f.read()
90+
self.assertEqual(content, self.test_content)
91+
92+
# Clean up
93+
os.unlink(download_path)
94+
95+
def test_rsync_with_additional_options(self):
96+
"""Test rsync with proxy command and additional SSH options."""
97+
# Test rsync with custom port, timeout, and proxy
98+
rsync(
99+
self.local_file.name,
100+
self.remote_file_proxy,
101+
port=22,
102+
key_filename="/root/.ssh/id_rsa",
103+
timeout=30,
104+
proxy_command="ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i /root/.ssh/id_rsa -W server:22 root@jumphost",
105+
)
106+
107+
# Verify the file exists on remote by attempting to download
108+
with tempfile.NamedTemporaryFile(mode="w", delete=False) as download_file:
109+
download_path = download_file.name
110+
111+
rsync(
112+
self.remote_file_proxy,
113+
download_path,
114+
port=22,
115+
key_filename="/root/.ssh/id_rsa",
116+
timeout=30,
117+
proxy_command="ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i /root/.ssh/id_rsa -W server:22 root@jumphost",
118+
)
119+
120+
# Verify content
121+
with open(download_path) as f:
122+
content = f.read()
123+
self.assertEqual(content, self.test_content)
124+
125+
# Clean up
126+
os.unlink(download_path)
127+
128+
129+
if __name__ == "__main__":
130+
unittest.main()

0 commit comments

Comments
 (0)