Skip to content

yamux/frame: Ensure frame io does not panic on invalid write results#202

Merged
jxs merged 3 commits intolibp2p:masterfrom
lexnv:lexnv/guard-out-of-range
Jul 17, 2025
Merged

yamux/frame: Ensure frame io does not panic on invalid write results#202
jxs merged 3 commits intolibp2p:masterfrom
lexnv:lexnv/guard-out-of-range

Conversation

@lexnv
Copy link
Copy Markdown
Contributor

@lexnv lexnv commented Jul 14, 2025

This PR ensures that the Sink implementation of the yamux-frame does not panic when the returned number of bytes from a write operation is bigger than the header size.

We have noticed this behavior on live chains in polkadot as reported by:

The issue shows that in the WriteState::Header state, the inner IO socket returned an offset bigger than the length of the header during the poll_write operation. This then leads to incrementing the offset past the header capacity in a following call. While at it, have ensured that the WriteState::Body state is safe from panics as well.

It is unclear at the moment of writing if the issue comes from: higher levels of the yamux crate, websocket tokio tungstenite (unconfirmed if this affects only the websocket or not), or the buffering of sockets from litep2p.

However, considering this is a one-off issue and we have not seen it in Kusama for the past 6+months of testing, terminating the connection in these cases seems like a sensible thing to do.

Would love to get your thoughts on this @jxs @elenaf9 🙏

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Copy link
Copy Markdown
Member

@jxs jxs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, and thanks for this! LGTM can you just address the clippy lints? Elena want to also give it a review?

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Copy link
Copy Markdown
Member

@elenaf9 elenaf9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as a quick fix.

It is unclear at the moment of writing if the issue comes from: higher levels of the yamux crate, websocket tokio tungstenite (unconfirmed if this affects only the websocket or not), or the buffering of sockets from litep2p.

Would be great if we could investigate this, but not sure if/ when I'll get to it.

@jxs jxs merged commit 9cab6c9 into libp2p:master Jul 17, 2025
3 checks passed
@lexnv
Copy link
Copy Markdown
Contributor Author

lexnv commented Jul 21, 2025

@jxs Would it be possible to include this in a release on crates.io? 🙏

@jxs
Copy link
Copy Markdown
Member

jxs commented Jul 23, 2025

Hi @lexnv yeah no worries, can do it later today. Btw can you send me an email with your contact? Would like ask and discuss somethings with you

@lexnv
Copy link
Copy Markdown
Contributor Author

lexnv commented Jul 24, 2025

Hi @jxs I've sent you an email, thanks for handling the release 🙏

lexnv added a commit to paritytech/litep2p that referenced this pull request Jan 20, 2026
…plementation (#518)

The PR fixes a `AsyncWrite::poll_write` implementation of the
crypto/noise sockets that causes panics in rust-yamux and leads to
unnecessary connection closure and instability:
- T0: poll_write is called with buffer of len 512 bytes
  - the implementation encrypt data and buffers it
  - because the io socket is not ready, poll pending is returned.
- T1: tokio_tungstenite (or other component) decides that a new payload
must be sent (usually a PONG frame) of len 12 bytes
- because the inner buffers contain the previous message, upon flushing
the size of the older message is returned (ie 512)
- rust-yamux uses the 512 bytes to index a buffer of 12 bytes (as
expected by the second poll_write with buffer 12)
- This caused frequent panics on the websocket implementation, which is
currently addressed as abrupt connection termination
  
This PR fixes the `AsyncWrite` contract violation by effectively
decoupling the encryption from the writing steps.
- the inner buffers are drained as much as possible (until the socket
returns poll pending)
- The provided message is encrypted into the inner buffer if it has
capacity and the number of bytes is returned immediately
- a subsequent poll_write or poll_flush or poll_close will propagate the
encrypted buffered data to the underlying socket

Previous fixes:
- libp2p/rust-yamux#202
- libp2p/rust-yamux#211

The fixes are still needed since the tokio-tungstenite (websocket crate)
was not properly scoped and may exhibit similar behavior.

cc @paritytech/networking

---------

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants