Skip to content

Fix DataReader.findNewLine with multiple lone EOL character#9362

Merged
romain-grecourt merged 1 commit into
helidon-io:mainfrom
romain-grecourt:datareader-findnewline-multiple-lone-eol
Oct 14, 2024
Merged

Fix DataReader.findNewLine with multiple lone EOL character#9362
romain-grecourt merged 1 commit into
helidon-io:mainfrom
romain-grecourt:datareader-findnewline-multiple-lone-eol

Conversation

@romain-grecourt

@romain-grecourt romain-grecourt commented Oct 11, 2024

Copy link
Copy Markdown
Contributor

Description

DataReader.findNewLine returns max instead of new line index when there are multiple lone CR characters in the same buffer.

E.g. \r\r\r\n

The virtual index is incorrectly accumulated each time with the relative index, its value may exceed max which will skip any valid CRLN within max.

  • Fix virtual index accumulation
  • Re-use the branch that changes the buffer
  • Add unit tests

Fixes #9365

Documentation

None.

@romain-grecourt romain-grecourt added SE 4.x Version 4.x labels Oct 11, 2024
@romain-grecourt romain-grecourt added this to the 4.2.0 milestone Oct 11, 2024
@romain-grecourt romain-grecourt self-assigned this Oct 11, 2024
@oracle-contributor-agreement oracle-contributor-agreement Bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Oct 11, 2024
@tomas-langer

Copy link
Copy Markdown
Member

What is this trying to fix? There is no issue attached.

The \r is not allowed as a character in HTTP unless it is the end of the line, and that must be \r\n according to the spec.
So in general, we must not allow this.

So please provide an explanation/issue why this is a good solution and a general one. Thanks!

@romain-grecourt

romain-grecourt commented Oct 11, 2024

Copy link
Copy Markdown
Contributor Author

What is this trying to fix? There is no issue attached.

The fix was created before the issue, that's unusual !

The \r is not allowed as a character in HTTP unless it is the end of the line, and that must be \r\n according to the spec. So in general, we must not allow this.

io.helidon.common.buffers.DataReader is part of common/buffers and DataReader.findNewLine() has no HTTP semantics other than newLine means \r\n.

common/buffers shouldn't be responsible for enforcing HTTP semantics.

So please provide an explanation/issue why this is a good solution and a general one. Thanks!

The documented contract is return index of the new line, or max if not found, if the buffer contains \r\n but we return max that's a bug.

With the current code If a buffer contains \r\r\r\n then max is returned instead of the actual new line index.
The calling code will then pull data to search for the new line and that will trigger a pull which may fail.


I will update the issue with more explanations.

@romain-grecourt romain-grecourt force-pushed the datareader-findnewline-multiple-lone-eol branch from 7659bce to c9ddb37 Compare October 12, 2024 03:15
spericas
spericas previously approved these changes Oct 14, 2024

@spericas spericas left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. As I mentioned offline, it may be worth renaming some variables to distinguish between node and stream indexes. Should help understand the calculations better.

When there is multiple lone CR within a buffer, the virtual index is accumulated each time with the node index.
The virtual index may be incorrectly seen as >= max, which ignores CRLN within max.

- Fix virtual index accumulation
- Re-use the branch that changes the buffer
- Renamed indexWithinNode -> fromIndexNode
- Renamed crIndex -> crIndexNode
- Renamed lfIndex -> lfIndexNode
- Add unit tests
@romain-grecourt romain-grecourt merged commit ba97c90 into helidon-io:main Oct 14, 2024
@romain-grecourt romain-grecourt deleted the datareader-findnewline-multiple-lone-eol branch October 14, 2024 22:03
barchetta pushed a commit to barchetta/helidon that referenced this pull request Oct 14, 2024
…io#9362)

When there is multiple lone CR within a buffer, the virtual index is accumulated each time with the node index.
The virtual index may be incorrectly seen as >= max, which ignores CRLN within max.

- Fix virtual index accumulation
- Re-use the branch that changes the buffer
- Renamed indexWithinNode -> fromIndexNode
- Renamed crIndex -> crIndexNode
- Renamed lfIndex -> lfIndexNode
- Add unit tests
barchetta added a commit that referenced this pull request Oct 14, 2024
…9391)

When there is multiple lone CR within a buffer, the virtual index is accumulated each time with the node index.
The virtual index may be incorrectly seen as >= max, which ignores CRLN within max.

- Fix virtual index accumulation
- Re-use the branch that changes the buffer
- Renamed indexWithinNode -> fromIndexNode
- Renamed crIndex -> crIndexNode
- Renamed lfIndex -> lfIndexNode
- Add unit tests

Co-authored-by: Romain Grecourt <romain.grecourt@oracle.com>
arjav-desai pushed a commit to arjav-desai/helidon that referenced this pull request Dec 11, 2024
…io#9362)

When there is multiple lone CR within a buffer, the virtual index is accumulated each time with the node index.
The virtual index may be incorrectly seen as >= max, which ignores CRLN within max.

- Fix virtual index accumulation
- Re-use the branch that changes the buffer
- Renamed indexWithinNode -> fromIndexNode
- Renamed crIndex -> crIndexNode
- Renamed lfIndex -> lfIndexNode
- Add unit tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

4.x Version 4.x OCA Verified All contributors have signed the Oracle Contributor Agreement. SE

Projects

None yet

Development

Successfully merging this pull request may close these issues.

InsufficientDataAvailableException while reading Multipart request part data

3 participants