yamux/frame: Ensure frame io does not panic on invalid write results#202
yamux/frame: Ensure frame io does not panic on invalid write results#202jxs merged 3 commits intolibp2p:masterfrom
Conversation
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
jxs
left a comment
There was a problem hiding this comment.
Hi, and thanks for this! LGTM can you just address the clippy lints? Elena want to also give it a review?
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
elenaf9
left a comment
There was a problem hiding this comment.
LGTM as a quick fix.
It is unclear at the moment of writing if the issue comes from: higher levels of the yamux crate, websocket tokio tungstenite (unconfirmed if this affects only the websocket or not), or the buffering of sockets from litep2p.
Would be great if we could investigate this, but not sure if/ when I'll get to it.
|
@jxs Would it be possible to include this in a release on crates.io? 🙏 |
|
Hi @lexnv yeah no worries, can do it later today. Btw can you send me an email with your contact? Would like ask and discuss somethings with you |
|
Hi @jxs I've sent you an email, thanks for handling the release 🙏 |
…plementation (#518) The PR fixes a `AsyncWrite::poll_write` implementation of the crypto/noise sockets that causes panics in rust-yamux and leads to unnecessary connection closure and instability: - T0: poll_write is called with buffer of len 512 bytes - the implementation encrypt data and buffers it - because the io socket is not ready, poll pending is returned. - T1: tokio_tungstenite (or other component) decides that a new payload must be sent (usually a PONG frame) of len 12 bytes - because the inner buffers contain the previous message, upon flushing the size of the older message is returned (ie 512) - rust-yamux uses the 512 bytes to index a buffer of 12 bytes (as expected by the second poll_write with buffer 12) - This caused frequent panics on the websocket implementation, which is currently addressed as abrupt connection termination This PR fixes the `AsyncWrite` contract violation by effectively decoupling the encryption from the writing steps. - the inner buffers are drained as much as possible (until the socket returns poll pending) - The provided message is encrypted into the inner buffer if it has capacity and the number of bytes is returned immediately - a subsequent poll_write or poll_flush or poll_close will propagate the encrypted buffered data to the underlying socket Previous fixes: - libp2p/rust-yamux#202 - libp2p/rust-yamux#211 The fixes are still needed since the tokio-tungstenite (websocket crate) was not properly scoped and may exhibit similar behavior. cc @paritytech/networking --------- Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
This PR ensures that the Sink implementation of the yamux-frame does not panic when the returned number of bytes from a write operation is bigger than the header size.
We have noticed this behavior on live chains in polkadot as reported by:
The issue shows that in the
WriteState::Headerstate, the inner IO socket returned an offset bigger than the length of the header during thepoll_writeoperation. This then leads to incrementing the offset past the header capacity in a following call. While at it, have ensured that theWriteState::Bodystate is safe from panics as well.It is unclear at the moment of writing if the issue comes from: higher levels of the yamux crate, websocket tokio tungstenite (unconfirmed if this affects only the websocket or not), or the buffering of sockets from litep2p.
However, considering this is a one-off issue and we have not seen it in Kusama for the past 6+months of testing, terminating the connection in these cases seems like a sensible thing to do.
Would love to get your thoughts on this @jxs @elenaf9 🙏