WebGLRenderer: Merge update ranges before issuing updates to the GPU. #29189
TLDR: this PR achieves up to 3 orders of magnitude performance improvement when updating a large number of adjacent ranges within `InstancedBufferAttribute`, which is a common use case for projects heavily leveraging instancing.

Description

`BufferAttribute#addUpdateRange` can be used with `needsUpdate` so that three only transfers subsections of data to the GPU. This is a powerful feature which allows clients to better manage CPU<>GPU bandwidth. For example, in cases where a `BufferAttribute` may be several MB large and only a few bytes change per frame, clients can transfer only the changed bytes instead of the entire buffer.

In our product we've seen large performance gains using update ranges, but frame drops in cases where many update ranges are present in a single frame. This can easily be observed with `InstancedBufferAttribute`. In a project which heavily leverages `InstancedMesh`, and therefore `InstancedBufferAttribute`, to represent instance data, it's commonly required that individual instances are updated using `addUpdateRange`. In a frame where all instances need to be updated, this can create a large number of update ranges which are nearly all adjacent. As a result we observe a large number of avoidable `gl.bufferSubData` calls and frame drops (I imagine due to GPU command overhead).

This PR automatically merges overlapping / adjacent update ranges before calling `gl.bufferSubData` and results in up to a 99.78% wall time reduction rendering our project (see below for details).

Impact
In a toy example within our company, I created a scene with 10k plane geometries (via `InstancedMesh`) which are positioned by vec3s interleaved via `InstancedBufferAttribute`. Updating all 10k positions in a single frame on a 2021 M1 MacBook Pro takes 112.21ms in three.js today; when run on this PR it takes 0.25ms instead.

Design Notes
- [...] `addUpdateRange`, where we could amortize the merging costs, because clients are allowed to directly manipulate the `updateRanges` array. Adding this logic to the renderers ensures robustness regardless of how clients interact with update ranges.
- [...] `WebGLAttributes`, making it challenging to envision how we'd mock `gl` when instantiating `WebGLAttributes`. If there's a suggestion here, I'd love to hear it.

This contribution is funded by SOOT
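
For readers who want to see the idea in isolation, here is a minimal sketch of merging overlapping / adjacent update ranges. It is not the code from this PR: `mergeUpdateRanges` is a hypothetical helper name, and it only assumes ranges shaped like the `{ start, count }` entries that `addUpdateRange` pushes onto `updateRanges`. Unlike a renderer-side implementation, it returns a new array rather than touching the attribute.

```javascript
// Hypothetical helper (not the PR's implementation): merges overlapping or
// adjacent ranges of the form { start, count } and returns a new array.
function mergeUpdateRanges(ranges) {
  if (ranges.length === 0) return [];

  // Copy and sort by start so overlapping/adjacent ranges become neighbours.
  const sorted = ranges
    .map((r) => ({ start: r.start, count: r.count }))
    .sort((a, b) => a.start - b.start);

  const merged = [sorted[0]];
  for (let i = 1; i < sorted.length; i++) {
    const prev = merged[merged.length - 1];
    const curr = sorted[i];
    const prevEnd = prev.start + prev.count; // exclusive end of previous range

    if (curr.start <= prevEnd) {
      // Overlapping or touching: grow the previous range to cover both.
      prev.count = Math.max(prevEnd, curr.start + curr.count) - prev.start;
    } else {
      // Gap between ranges: keep them as separate uploads.
      merged.push(curr);
    }
  }
  return merged;
}

// Simulated workload (assumed layout): one 3-float range per instance.
const perInstance = [];
for (let i = 0; i < 10000; i++) perInstance.push({ start: i * 3, count: 3 });

const mergedRanges = mergeUpdateRanges(perInstance);
console.log(perInstance.length, '->', mergedRanges.length); // 10000 -> 1
```

In this simulated case, 10,000 adjacent ranges collapse into a single range, which would correspond to a single `gl.bufferSubData` call instead of 10,000. Sorting costs O(n log n) per update, which is the cost the first design note says could otherwise have been amortized in `addUpdateRange`.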