[SYCL] optimize createSyclObjFromImpl calls to take rvalue-ref to shared_ptr #20859

lslusarczyk · 2025-12-09T14:07:04Z

This optimization moves the shared_ptr into createSyclObjFromImpl instead of copying it, which saves two atomic operations: the reference-count increment on copy and the matching decrement when the original is destroyed (see e.g. this SO thread).

I've applied it everywhere possible in the code, leaving copies only where they are actually needed (mostly for context_impl use).
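To illustrate the difference between the copying and moving paths, here is a minimal sketch. The names `event_impl`, `event`, `createFromImplCopy` and `createFromImplMove` are hypothetical stand-ins chosen for this example; they are not the actual `createSyclObjFromImpl` signature or the real SYCL runtime classes.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for an implementation object owned via shared_ptr.
struct event_impl {};

// Hypothetical stand-in for the public wrapper object built from an impl.
class event {
public:
  explicit event(std::shared_ptr<event_impl> Impl) : impl(std::move(Impl)) {}
  long use_count() const { return impl.use_count(); }

private:
  std::shared_ptr<event_impl> impl;
};

// Copying path: the caller keeps its own reference, so the reference count
// is atomically incremented here and decremented again later when the
// caller's shared_ptr is destroyed.
event createFromImplCopy(const std::shared_ptr<event_impl> &Impl) {
  return event(Impl);
}

// Moving path: ownership is transferred to the new object, so the reference
// count is not touched and the caller's shared_ptr is left empty.
event createFromImplMove(std::shared_ptr<event_impl> &&Impl) {
  return event(std::move(Impl));
}
```

A caller that no longer needs its local pointer would write `createFromImplMove(std::move(Impl))`. When the caller still uses the pointer afterwards, the copying path remains the right choice.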

Results summary

Overhead over UR is reduced by ~8% in scenarios using events. Other benchmarks also show visible improvements in many cases, including the new PyTorch multiqueue benchmarks, which improved overall by 2.7%.

Results Examples

In each plot, the new result is shown by the dots on the right side.

SubmitKernel out of order using events long kernel, CPU count(1)

old = 134.6, new = 132.8, UR baseline = 113, overhead over UR reduced by 8.3%

SubmitKernel out of order with completion using events, CPU count(4)

old = 140, new = 138.2, UR baseline = 118.1, overhead over UR reduced by 8.1%

old = 122.3, new = 121.3, UR baseline = 108.1, overhead over UR reduced by 7.0%

old time = 13.91, new time = 13.58, whole stack reduced by 2.4%

And finally, the new PyTorch microbenchmarks:

old time = 1.81, new time = 1.76, L0 baseline = 1.44
whole stack reduced by 2.8%, overhead over L0 reduced by 13.5%
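The percentages above are consistent with the following arithmetic. The formula is inferred from the reported numbers rather than stated in the description, so treat it as an assumption; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Relative reduction, in percent, of the overhead measured above a baseline
// (for example UR or L0). Assumes Old > Baseline.
double overheadReductionPercent(double Old, double New, double Baseline) {
  const double OldOverhead = Old - Baseline;
  const double NewOverhead = New - Baseline;
  return 100.0 * (OldOverhead - NewOverhead) / OldOverhead;
}

// Relative reduction, in percent, of the whole-stack time.
double totalReductionPercent(double Old, double New) {
  return 100.0 * (Old - New) / Old;
}
```

For the first example, `overheadReductionPercent(134.6, 132.8, 113.0)` gives about 8.3, and for the PyTorch microbenchmarks `overheadReductionPercent(1.81, 1.76, 1.44)` gives about 13.5 while `totalReductionPercent(1.81, 1.76)` gives about 2.8.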

Commit 59fcd06: [SYCL] optimize createSyclObjFromImpl calls to take rvalue-ref to shared_ptr if possible

lslusarczyk marked this pull request as ready for review December 9, 2025 15:21

lslusarczyk requested a review from a team as a code owner December 9, 2025 15:21

lslusarczyk requested a review from againull December 9, 2025 15:21

sergey-semenov approved these changes Dec 9, 2025

againull approved these changes Dec 9, 2025

againull merged commit 4a245b8 into intel:sycl Dec 9, 2025
102 of 107 checks passed
