Skip to content

Conversation

@RenaudRohlinger
Copy link
Collaborator

Description
Currently, if two InstancedMesh objects have the same count, the second InstancedMesh incorrectly inherits the matrices of the first due to a shared cacheKey. This happens because the cacheKey is solely based on object.count.

Fix:
By incorporating the uuid of the InstancedMesh into the cacheKey, we ensure that each InstancedMesh maintains its own unique cache, preventing the unintended sharing of matrices.

Example for test if needed:
instance_webgpu_issue.glb.zip

This contribution is funded by Utsubo

@RenaudRohlinger RenaudRohlinger requested a review from sunag August 5, 2024 14:22
@github-actions
Copy link

github-actions bot commented Aug 5, 2024

📦 Bundle size

Full ESM build, minified and gzipped.

Filesize dev Filesize PR Diff
685.1 kB (169.6 kB) 685.1 kB (169.6 kB) +0 B

🌳 Bundle size after tree-shaking

Minimal build including a renderer, camera, empty scene, and dependencies.

Filesize dev Filesize PR Diff
462 kB (111.5 kB) 462 kB (111.5 kB) +0 B

@RenaudRohlinger RenaudRohlinger marked this pull request as draft August 5, 2024 14:32
@RenaudRohlinger RenaudRohlinger marked this pull request as ready for review August 5, 2024 14:45
@sunag sunag merged commit 6697bd1 into mrdoob:dev Aug 5, 2024
@sunag
Copy link
Collaborator

sunag commented Aug 5, 2024

I think the next step would be to use and improve referenceBuffer() for this instead of adding the buffer()/instancedBufferAttribute() directly as is done in InstanceNode. Then we could share the same NodeBuilderState.

if ( object.count > 1 ) {

cacheKey += object.count + ',';
cacheKey += object.count + ',' + object.uuid + ',';
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is object.count still needed in the cache key when we are adding uuid?

@sunag sunag added this to the r168 milestone Aug 5, 2024
@RenaudRohlinger
Copy link
Collaborator Author

RenaudRohlinger commented Aug 6, 2024

I think the next step would be to use and improve referenceBuffer()

On that note, I recently conducted extensive benchmarking on the WebGPURenderer. The performance significantly lags behind the WebGLRenderer mainly due to syntax such as this.node.value = value; in updateValue() of ReferenceNode.

For example, a slightly complex GLTF scene running with WebGPU takes about 10ms per frame to render:
Screenshot 2024-08-06 at 13 09 56

More worryingly, it runs at 30fps with 6-7 frames dropped per loop in the WebGLBackend:
Screenshot 2024-08-06 at 13 48 40

In contrast, the exact same setup with the WebGLRenderer takes about 3.5ms per frame (almost 3 times faster than WebGPU) and runs at a steady 144fps+:
Screenshot 2024-08-06 at 13 10 56

So far, the culprit appears to be anything related to the use of custom setters/getters, such as this.node.value = val or return this.node.value in NodeUniform or similar cases:
Screenshot 2024-08-06 at 13 09 14

I haven't opened an issue yet, as I want to fully explain the current problem and propose a solution. I'm openly brainstorming here, but a GLTF scene with 300-400+ meshes runs 2-3 times slower in both backends compared to the WebGLRenderer.

I don't think it's a major issue, just a necessary cleanup to catch up after building so much new content recently! /cc @sunag @Mugen87

@Mugen87
Copy link
Collaborator

Mugen87 commented Aug 6, 2024

I think it would be helpful to have some kind of webgpu_performance_* demo that allows us to measure the performance when rendering scenes with complex object hierarchies and many draw calls. Ideally we have the same example for WebGLRenderer so it's easier to compare both renderers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants