[WebGPU] Fix wait logic for inflight jobs #20096
reeselevine merged 10 commits into ggml-org:master from
Conversation
…and fix wait to delete only the future that is completed.
        }
    }
} else {
    // Poll once and return
Is this the intended behavior when block = false btw? Since I think calling WaitAny with a timeout of 0 just checks once and then returns.
yep, the idea is to just check when block=false, in case the implementation isn't good at scheduling callbacks on its own.
ctx->instance.WaitAny(futures[0].futures.size(), futures[0].futures.data(), UINT64_MAX);
futures.erase(futures.begin());
This would previously wait for any single future within the first vector of futures to finish, and then delete the entire first vector (instead of just the completed future), since futures is a vector of vectors of futures. I think this bug surfaced with the param buf diff: by expanding the parameter buffer, we can have multiple futures in flight instead of just one, so we could delete an inflight future alongside a completed one. If something was waiting on a deleted future, it would then wait forever, causing test-thread-safety to time out.
this condition isn't doing quite what we want anymore, since there is now no separation between param_bufs/set_row_error_bufs/gpu_profile bufs from different batch submissions. But I have a PR coming up soon that should simplify this further, and I think I can split it out into making sure we free enough param bufs for future batches. So this is fine to merge for now.
* Enable tmate debugging for investigating thread safety issue
* Refactor wait and submit to operate on vector<wgpu::FutureWaitInfo>, and fix wait to delete only the future that is completed.
* Cleanup
* Remove clear change and run clang-format
* Cleanup
Fix WebGPU wait logic incorrectly removing futures. WaitAny returns as soon as any future completes, but the previous implementation erased the entire submission entry (a whole vector of futures), dropping futures that were still in flight. Flatten the nested futures structure into a single vector and remove only the futures that have completed.