
Update LargeFileUploadTask.java#1531

Merged
Ndiritu merged 9 commits into microsoftgraph:fix/lfu from ffcdf:patch-1
Mar 19, 2024

Conversation

Contributor

@ffcdf ffcdf commented Mar 1, 2024

Suggested modification to the chunkInputStream method:

From the second pass through this method onwards, the line "int lengthAssert = stream.read(buffer, begin, length);" throws java.lang.IndexOutOfBoundsException, because it tries to write the read bytes to a non-existent position in the buffer.

The suggested change implies changing InputStream to FileInputStream throughout the LargeFileUploadTask class.

Fixes # IndexOutOfBoundsException in method chunkInputStream

Changes proposed in this pull request

  • Change InputStream to FileInputStream.
  • Change
        byte[] buffer = new byte[length];
        int lengthAssert = stream.read(buffer, begin, length);
    to
        ByteBuffer byteBuffer = ByteBuffer.allocate(length);
        int lengthAssert = stream.getChannel().read(byteBuffer);
        byteBuffer.flip();
        byte[] buffer = byteBuffer.array();
        byteBuffer.clear();
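For illustration, a minimal, self-contained sketch of the proposed channel-based chunk read (the class and helper names are hypothetical, not the actual LargeFileUploadTask code; this variant sizes the returned array to the bytes actually read rather than returning byteBuffer.array() directly):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

public class ChunkSketch {
    // Hypothetical helper mirroring the proposed change: read one chunk via the
    // stream's FileChannel, which tracks the file position itself, so no
    // running "begin" offset into the destination array is needed.
    static byte[] readChunk(FileInputStream stream, int length) throws IOException {
        ByteBuffer byteBuffer = ByteBuffer.allocate(length);
        stream.getChannel().read(byteBuffer); // advances the channel's position
        byteBuffer.flip();
        byte[] buffer = new byte[byteBuffer.remaining()];
        byteBuffer.get(buffer);
        return buffer;
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("lfu", ".bin");
        tmp.deleteOnExit();
        byte[] data = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        try (FileOutputStream out = new FileOutputStream(tmp)) {
            out.write(data);
        }
        try (FileInputStream in = new FileInputStream(tmp)) {
            byte[] first = readChunk(in, 4);  // bytes 0..3
            byte[] second = readChunk(in, 4); // bytes 4..7, no offset arithmetic
            if (first[3] != 3 || second.length != 4 || second[0] != 4) {
                throw new AssertionError("unexpected chunk contents");
            }
        }
        System.out.println("both chunks read without IndexOutOfBoundsException");
    }
}
```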

@ffcdf ffcdf requested a review from a team as a code owner March 1, 2024 23:35
Contributor Author

ffcdf commented Mar 1, 2024

@microsoft-github-policy-service agree

@ffcdf ffcdf closed this Mar 4, 2024
According to the documentation:

public int read(byte[] b, int off, int len) throws IOException

The first byte read is stored into element b[off], the next one into b[off+1], and so on.

However, from the second pass through the chunkInputStream method onwards, off exceeds the last index of b, and therefore b[off] does not exist.
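This failure mode can be reproduced in isolation. The snippet below (a standalone demo, not the SDK code) allocates a buffer sized to one chunk and then passes a growing offset to read, exactly as described:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadOffsetDemo {
    public static void main(String[] args) throws IOException {
        // 20 bytes of input, consumed in 5-byte chunks.
        InputStream stream = new ByteArrayInputStream(new byte[20]);
        byte[] buffer = new byte[5];   // buffer sized to a single chunk
        stream.read(buffer, 0, 5);     // first pass: off = 0, works fine
        try {
            // Second pass in the original code: the offset has advanced to 5,
            // but buffer.length is still 5, so buffer[5] does not exist.
            stream.read(buffer, 5, 5);
            throw new AssertionError("expected IndexOutOfBoundsException");
        } catch (IndexOutOfBoundsException expected) {
            System.out.println("IndexOutOfBoundsException on the second chunk, as reported");
        }
    }
}
```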
@ffcdf ffcdf reopened this Mar 4, 2024
Contributor

Ndiritu commented Mar 4, 2024

@ffcdf thanks for the changes.

Please update the Changelog as well. You can add a section for 3.1.7.

Contributor

Ndiritu commented Mar 4, 2024

Since there seems to be a follow-up issue, I think we can do without the version bump to prevent releasing something broken?
cc: @ramsessanchez @baywet

Member

baywet commented Mar 4, 2024

@Ndiritu we've had multiple complaints about this task, and attempted multiple fixes. (We still have to review this one, which attempts to fix a regression: microsoft/kiota-java#1109.)

I think we're probably missing parts of the equation here. If you could take time to run integration tests for that task (without this PR) against Exchange and ODSP (I seem to remember the behaviour was slightly different between the two services), I think it'd go a long way to:

  • ensuring we're not missing something obvious, and that everything has indeed been fixed.
  • improving customers' trust in this task's implementation.

A couple of important things to consider during this integration test:

  • the file should be about 10 MB to ensure multiple upload requests are sent
  • we should re-download and checksum the file to ensure no logical corruption happens because of the control flow.
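The checksum part of such an integration test could be sketched as below. This is an illustrative stand-in only: the actual upload/re-download through the Graph SDK is replaced by a plain file copy, and the class name is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Random;

public class ChecksumCheck {
    // SHA-256 of a file's contents, used to compare the original file with the
    // re-downloaded copy after a chunked upload.
    static byte[] sha256(Path path) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        return digest.digest(Files.readAllBytes(path));
    }

    public static void main(String[] args) throws Exception {
        // ~10 MB of random data so that multiple upload requests would be sent.
        Path original = Files.createTempFile("upload", ".bin");
        byte[] payload = new byte[10 * 1024 * 1024];
        new Random(42).nextBytes(payload);
        Files.write(original, payload);

        // Stand-in for "upload then re-download": copy the file byte-for-byte.
        Path downloaded = Files.createTempFile("download", ".bin");
        Files.write(downloaded, Files.readAllBytes(original));

        if (!Arrays.equals(sha256(original), sha256(downloaded))) {
            throw new AssertionError("logical corruption detected");
        }
        System.out.println("checksums match: no logical corruption");
        Files.deleteIfExists(original);
        Files.deleteIfExists(downloaded);
    }
}
```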

Contributor

Ndiritu commented Mar 4, 2024

Sure @baywet. Makes sense to test this extensively before anything is merged/released.

There's a follow-up issue I didn't highlight clearly that was mentioned here in the resolved conversation

Member

baywet commented Mar 4, 2024

This is the reason why I suggested integration testing against both services. It seemed like we still have issues with ODSP, but maybe I misread that comment?

Member

baywet commented Mar 4, 2024

Also, on a somewhat related matter, we should plan a patch release of graph core and update it all the way (with kiota dependencies) in the service libraries, to ensure nobody is on outdated dependencies moving forward and reporting things we've already fixed. As you mentioned, it's probably part of the confusion around that task (and batching) at the moment.

Contributor

Ndiritu commented Mar 4, 2024

This is the reason why I suggested integration testing against both services. It seemed like we still have issues with ODSP, but maybe I misread that comment?

No we're on the same page now.

@calebkiage
Contributor

I have a question about the upload task. Are all the chunk lengths guaranteed to exactly match what the reader has available for every chunk? The reason I ask is that (according to the docs), the read contract reads bytes on a best effort basis. So, for any invocation, the read bytes can be less than the length expected.

@ffcdf ffcdf mentioned this pull request Mar 5, 2024
Contributor Author

ffcdf commented Mar 5, 2024

@ffcdf thanks for the changes.

Please update the Changelog as well. You can add a section for 3.1.7.

Updated in pull request #1536.

Contributor

Ndiritu commented Mar 5, 2024

I have a question about the upload task. Are all the chunk lengths guaranteed to exactly match what the reader has available for every chunk? The reason I ask is that (according to the docs), the read contract reads bytes on a best effort basis. So, for any invocation, the read bytes can be less than the length expected.

There's no guarantee that we'll read the expected chunk length from the stream (from my understanding of how read works). We currently fail if what we read doesn't match the length we expected to get. Maybe we should retry reading using read with offsets until we get the expected chunk length or reach the end of the file.

Then set the relevant content-range headers based on this and rely on the API response to determine the next byte position to start from. We currently pre-calculate the ranges before the upload actually begins which might be another bug.
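The retry idea described above could look like the following (a hypothetical readFully helper, not SDK code): keep calling read with an advancing offset into the same buffer until the chunk is full or the end of the stream is reached, then size the final chunk to what was actually read.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class ReadFully {
    // Retry read until the requested chunk length is filled or EOF is hit,
    // since InputStream.read may return fewer bytes than asked for.
    static byte[] readFully(InputStream stream, int length) throws IOException {
        byte[] buffer = new byte[length];
        int filled = 0;
        while (filled < length) {
            int read = stream.read(buffer, filled, length - filled);
            if (read == -1) {
                break; // end of stream: return the short final chunk
            }
            filled += read;
        }
        return filled == length ? buffer : Arrays.copyOf(buffer, filled);
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[]{1, 2, 3, 4, 5, 6, 7});
        byte[] first = readFully(in, 5);  // full 5-byte chunk
        byte[] last = readFully(in, 5);   // only 2 bytes remain before EOF
        if (first.length != 5 || last.length != 2 || last[0] != 6) {
            throw new AssertionError("unexpected chunk sizes");
        }
        System.out.println("full chunk then short final chunk, no exception");
    }
}
```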

@calebkiage
Contributor

@Ndiritu, thanks for the info. I actually asked because of the assert. I wondered if it would fail for a valid scenario. Maybe the integration tests can help us pick up situations where the read fails but shouldn't.

ffcdf added 4 commits March 6, 2024 14:37
Changed chunkInputStream method in LargeFileUploadTask to resolve IndexOutOfBoundsException when uploading large files
@Ndiritu Ndiritu changed the base branch from dev to fix/lfu March 19, 2024 12:34
Contributor

Ndiritu commented Mar 19, 2024

@ffcdf thank you very much for your contribution! I'll merge this to a separate branch and build on your changes with a few additional fixes.


ihudedi commented Apr 4, 2024

Hi @ffcdf @baywet
Are you going to change the InputStream to FileInputStream?
Does that mean I will no longer be able to upload a large file that doesn't come from the local file system?
I have a stream to upload that doesn't always come from the local file system.
Thanks,
Itay

Member

baywet commented Apr 16, 2024

Deferring to @Ndiritu, who's back and in charge of tackling this issue end to end.

Contributor

Ndiritu commented Apr 18, 2024

Hi @ffcdf @baywet Are you going to change the InputStream to FileInputStream? Does that mean I will no longer be able to upload a large file that doesn't come from the local file system? I have a stream to upload that doesn't always come from the local file system. Thanks, Itay

Hi @ihudedi, we have kept this generic as an InputStream to cover a broad set of scenarios.



Development

Successfully merging this pull request may close these issues.

LargeFileUpload always produces IndexOutOfBoundsException if the input stream is longer than the chunk size

5 participants