Use clone instead of fork on pvf#2477
Conversation
mrcnski
left a comment
There was a problem hiding this comment.
Looks really good! This will make us even more secure. 😁
After this PR, I would suggest moving away from security-related tickets. A decision has come to replace wasmtime with polkavm fairly soon: #2461. Polkavm is already secured by default. I mean, it will take some time to switch as we need to do a migration, so it is good that we secured what we have already. But we can focus on other areas now. :)
mrcnski
left a comment
There was a problem hiding this comment.
Nice! Just a few suggestions, and then it should be good!
|
@jpserrat Do you want to take a shot at updating the Implementer's Guide? |
Do you mean, adding the clone change in there? Sure! @mrcnski Just to let you know, I'll be on vacation this week and only going to be on the 12th, so it will take a while until I can go back to this. I made the changes that you asked for and also used the |
|
Awesome, thank you @jpserrat! Thanks for letting me know about your vacation. There's no rush with this PR. :) About |
Hmm. It might be too much of an an implementation detail to mention in the implementer's guide. Low-level details should stay in the code, e.g. module docs. In the impl guide we should just say that the jobs have some additional sandboxing measures that workers don't have. But, we can mention clone in the top-level docs of |
mrcnski
left a comment
There was a problem hiding this comment.
Awesome! It looks really good, now we just need to do the same for execute-worker. 🥳 Please just remember about this comment. We'll need to get the stack size for execution from the executor params, as mentioned in that comment.
…' into jpserrat/clone-instead-of-fork
|
|
alexggh
left a comment
There was a problem hiding this comment.
Overall, looks good to me left you some comments.
| // SIGCHLD flag is used to inform clone that the parent process is | ||
| // expecting a child termination signal, without this flag `waitpid` function | ||
| // return `ECHILD` error. | ||
| CloneFlags::CLONE_NEWCGROUP | |
There was a problem hiding this comment.
From man page:
Only a privileged process (CAP_SYS_ADMIN) can employ CLONE_NEWCGROUP
Is that something we want ?
Not sure if that is already a hard requirement for running polkadot but it feels like it shouldn't be.
There was a problem hiding this comment.
Good catch! I was going to include the CLONE_NEWUSER clone flag, which gives us this capability. From unshare(2):
As with the child process created by [clone(2)](https://man7.org/linux/man-pages/man2/clone.2.html) with the
CLONE_NEWUSER flag, the caller obtains a full set of
capabilities in the new namespace.
However, for reasons I don't understand, clone with this flag fails if we already passed the same flag to unshare here, when sandboxing the parent worker process. So I was going to parametrize this function over whether the unshare capability is present - I just forgot to do it. If it is present, then the parent worker process already has CAP_SYS_ADMIN. That's also whyclone_flags is a fn and not a const @alindima.
There was a problem hiding this comment.
Okay, made this change. If for some reason we cannot unshare, we will need the CAP_SYS_ADMIN so I added the clone flag for that case. Hopefully there are no issues doing so in the wild, because admittedly local testing and CI have not caught all the issues in the past. However, this new capability is optional, so worst-case scenario someone sees a warning and reports it to us.
tdimitrov
left a comment
There was a problem hiding this comment.
Good job! I really enjoyed reviewing this PR.
| } | ||
|
|
||
| /// Returns flags for `clone(2)`, including all the sandbox-related ones. | ||
| fn clone_flags() -> CloneFlags { |
I've tested this manually so should be fine
s0me0ne-unkn0wn
left a comment
There was a problem hiding this comment.
Wow, good job, it's very appreciated!
Left some comments on things that are doubtful to me. It's possible I'm just not getting the reasoning behind them, but right now, some of them look to me like they should be fixed. Please follow up in discussions.
Either way, nicely done!
s0me0ne-unkn0wn
left a comment
There was a problem hiding this comment.
Yes, that looks good now! Thank you!
@mrcnski Done the change on the prepare worker, once the prepare worker part is good I'll do the same for the execute worker.
This is based on https://github.com/koute/polkavm/blob/11beebd06276ce9b84f335350138479e714f6caf/crates/polkavm/src/sandbox/linux.rs#L711.
TODO
Related
Closes #2162