[UNDERTOW-2655] Fix text corruption in FileUtils.readFile when reading multi-byte characters #1834

Merged: fl4via merged 1 commit into undertow-io:main from finalchild:UNDERTOW-2655 on Mar 3, 2026

@finalchild commented Oct 28, 2025

Summary

Fixes text corruption in FileUtils.readFile when reading multi-byte UTF-8 characters.

Problem: The original implementation read the InputStream into a fixed-size byte buffer (1024 bytes) and decoded each chunk independently. When a multi-byte character sequence was split across a buffer boundary, the decoder received incomplete character data, resulting in replacement characters (�) in the final string.
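The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the original Undertow code: the class name `ChunkedDecodeDemo`, the helper `decodePerChunk`, and the 2-byte chunk size are all illustrative assumptions chosen so that the two-byte encoding of `é` straddles a chunk boundary.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only mimics the per-chunk decoding pattern described above.
public class ChunkedDecodeDemo {

    // Decodes every fixed-size chunk on its own, with no decoder state carried between chunks.
    static String decodePerChunk(byte[] data, int chunkSize) {
        StringBuilder sb = new StringBuilder();
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            // Each call starts a fresh decode, so a truncated sequence becomes U+FFFD.
            sb.append(new String(data, off, len, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "a\u00e9" is 'a' followed by 'é'; in UTF-8 that is the three bytes 0x61 0xC3 0xA9.
        byte[] data = "a\u00e9".getBytes(StandardCharsets.UTF_8);
        String decoded = decodePerChunk(data, 2);
        // The boundary falls between 0xC3 and 0xA9, so the text no longer round-trips.
        System.out.println("round-trips: " + decoded.equals("a\u00e9"));
        System.out.println("contains U+FFFD: " + (decoded.indexOf('\uFFFD') >= 0));
    }
}
```

With a 1024-byte buffer the same thing happens whenever a multi-byte sequence begins near the end of one chunk and finishes in the next.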

Solution: Replaced BufferedInputStream with InputStreamReader, which handles buffering and character decoding together in a streaming fashion, so a multi-byte character sequence that straddles a read boundary is still decoded correctly.

Note: The implementation is copied from Java 25's InputStreamReader#readAllAsString.
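For readers who want to see the general shape of a reader-based approach, here is a minimal sketch. It is not the code merged in this PR and not the JDK implementation mentioned above; the class name `ReaderDecodeDemo`, the helpers `readAll` and `sample`, and the 1024-char buffer are assumptions for illustration. The point is only that `InputStreamReader` keeps decoder state across reads, so the caller works with whole `char`s rather than raw byte chunks.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing streaming decode through a Reader.
public class ReaderDecodeDemo {

    // Reads the whole stream as UTF-8 text; the reader carries partial sequences between reads.
    static String readAll(InputStream in) {
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds 'a' followed by n copies of 'é', so two-byte sequences sit at odd byte offsets.
    static String sample(int n) {
        StringBuilder sb = new StringBuilder("a");
        for (int i = 0; i < n; i++) {
            sb.append('\u00e9');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = sample(5000);
        byte[] bytes = original.getBytes(StandardCharsets.UTF_8);
        String decoded = readAll(new ByteArrayInputStream(bytes));
        System.out.println("round-trips: " + decoded.equals(original));
    }
}
```

The design choice worth noting is that the buffer here holds decoded characters, not bytes, so there is no byte boundary at which the caller could cut a sequence in half.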

This issue became more significant after fixing UNDERTOW-2337, as large form-data field values are now processed by this vulnerable function. Originally reported in Spring Framework issue #35292.

Issue: UNDERTOW-2655

@fl4via added the "bug fix" (Contains bug fix(es)) label Oct 28, 2025

@fl4via left a comment

Hi @finalchild! Thanks for your PR. Can you please create a test for the fix?

@fl4via added the "waiting PR update" (Awaiting PR update(s) from contributor before merging) label Oct 29, 2025
@finalchild

@fl4via
Added tests!

@finalchild

@baranowb @fl4via can you follow up on this again?

@fl4via commented Feb 16, 2026

Hello @finalchild! We are in the middle of a large task: backporting across all the branches (and between undertow <-> undertow-ee) the fixes that were added to only a few of them (basically, 2.2.x and 2.3.x).

There are also PRs in 2.4.x that were not present in main, so I'm creating PRs for those.

Once we are done with that task, we will release 2.4.0.Beta1 with the same fixes that went into 2.3.23.Final and 2.2.39.Final. Doing it this way helps us keep the releases consistent in terms of the fixes contained in each one of them.

As soon as this is done, we will review and merge all the withheld PRs, including yours. I know we have a large number of PRs in the queue right now. I kindly ask you to bear with me just a little longer, and then the fixes will be fully processed.

After all that, 2.3.24.Final, 2.2.40.Final and 2.4.0.Beta2 will follow, containing all the great work from community contributors and project maintainers. That way, your fix will be available to all Undertow users.

In terms of time frame, it all depends on how fast I can move with the PRs, but I am hoping it won't take long.

Thank you for your great contribution!

@fl4via commented Mar 3, 2026

Hi @finalchild! Again, thank you for your PR. We are finally at the point where we will be merging and backporting it. I appreciate you waiting this long. I hope it proves worthwhile once Undertow 2.4.0.Final is released with all the pending fixes in the PR queue.

…g multi-byte characters

The readFile method was reading the InputStream into a fixed-size byte buffer and decoding each chunk independently. This caused multi-byte UTF-8 character sequences to be split across buffer boundaries, resulting in text corruption with replacement characters.

Replaced BufferedInputStream with InputStreamReader to handle buffering and character decoding together in a streaming fashion, ensuring multi-byte character sequences are never split.

This issue became more significant after UNDERTOW-2337, as large form-data field values are now processed by this function. Originally reported in Spring Framework issue #35292.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@fl4via added the "waiting CI check" (Ready to be merged but waiting for CI check) and "next release" (This PR will be merged before next release or has already been merged (for payload double check)) labels, and removed the "waiting PR update" (Awaiting PR update(s) from contributor before merging) and "waiting CI check" labels, Mar 3, 2026
@fl4via merged commit c9c1880 into undertow-io:main Mar 3, 2026
85 of 86 checks passed
Labels

bug fix: Contains bug fix(es)
next release: This PR will be merged before next release or has already been merged (for payload double check)

3 participants