Merged
68 commits
0601210
[SYCL] Mark mem object which may have not blocking dtor according to …
KseniyaTikhomirova Sep 16, 2022
aff3be6
Add draft how to delay buffer_impl release
KseniyaTikhomirova Sep 21, 2022
1195b59
Update symbols for non-breaking change
KseniyaTikhomirova Sep 21, 2022
8d05802
Update abi test vtable.cpp - non-breaking change
KseniyaTikhomirova Sep 21, 2022
b54b8e4
Update SYCL_MINOR_VERSION for non-breaking ABI change
KseniyaTikhomirova Sep 21, 2022
965a015
Remove ABI break
KseniyaTikhomirova Sep 21, 2022
27ccbff
Update symbols to new version
KseniyaTikhomirova Sep 21, 2022
9540fe0
Tiny rename
KseniyaTikhomirova Sep 21, 2022
c00c7cb
Revert "Update abi test vtable.cpp - non-breaking change"
KseniyaTikhomirova Sep 21, 2022
661dace
Remove isDefault method, reimplemented
KseniyaTikhomirova Sep 21, 2022
6615db3
Fix symbols again
KseniyaTikhomirova Sep 21, 2022
8174dc3
Add handling of deferred mem objects release
KseniyaTikhomirova Sep 22, 2022
d55405e
Remove unused function and restore XPTI traces collection
KseniyaTikhomirova Sep 22, 2022
bb2c4fb
Add skeleton for unit test
KseniyaTikhomirova Sep 22, 2022
5db9e85
Fix shared_ptr use_count check
KseniyaTikhomirova Sep 23, 2022
53a1892
Test draft
KseniyaTikhomirova Sep 23, 2022
4b0a3fa
[SYCL] Align usm_allocator ctor and operators with SYCCL2020
KseniyaTikhomirova Sep 23, 2022
8daea20
Update attach scheduler logic
KseniyaTikhomirova Sep 26, 2022
c855f13
Make cleanup iterative
KseniyaTikhomirova Sep 28, 2022
8dbcd1c
Fix test utils impl error
KseniyaTikhomirova Sep 28, 2022
0f61c64
Add other tests for buffer contructors
KseniyaTikhomirova Sep 28, 2022
ddf215b
Other tests for high level buffer destruction deferring logic
KseniyaTikhomirova Sep 29, 2022
23bea82
Add unittest for waitForRecordToFinish
KseniyaTikhomirova Oct 3, 2022
aa41d76
Remove debug flags uploaded by mistake
KseniyaTikhomirova Oct 3, 2022
e296d03
Merge branch 'sycl' into buff_detach
KseniyaTikhomirova Oct 3, 2022
179c472
Fix clang-format
KseniyaTikhomirova Oct 3, 2022
2076c7c
Update test to not keep ill-formed objects
KseniyaTikhomirova Oct 4, 2022
e911d0a
Check command destruction
KseniyaTikhomirova Oct 4, 2022
81c2b09
Fix clang-format
KseniyaTikhomirova Oct 4, 2022
edcfcfc
Handle set_final_data usage
KseniyaTikhomirova Oct 4, 2022
b5e85de
Fix code-review comments (round 1)
KseniyaTikhomirova Oct 5, 2022
a5980a0
Fix missed comments
KseniyaTikhomirova Oct 5, 2022
ac06f1b
Merge branch 'sycl' into buff_detach
KseniyaTikhomirova Oct 5, 2022
09b8359
Remove nagation from variable name and logic
KseniyaTikhomirova Oct 5, 2022
1e75448
Simplify deferred mem objects release - do not aggregate to capture lock
KseniyaTikhomirova Oct 5, 2022
484b1cf
Return trace of stream buffer emptyness to scheduler destructor
KseniyaTikhomirova Oct 5, 2022
1061322
Fix comments (round 2)
KseniyaTikhomirova Oct 7, 2022
d4537a3
Fix comments (round 3)
KseniyaTikhomirova Oct 7, 2022
342ff91
Fix build
KseniyaTikhomirova Oct 7, 2022
fdab0e7
Fix comments & tests (round 4)
KseniyaTikhomirova Oct 7, 2022
0872d7c
Predict comments: restore removeMemoryObject content
KseniyaTikhomirova Oct 7, 2022
3a25f1e
Fix root cause of hang when host task is not even started upon releas…
KseniyaTikhomirova Oct 11, 2022
28f008d
Fix comments (round n)
KseniyaTikhomirova Oct 14, 2022
92b5e15
Merge branch 'sycl' into buff_detach
KseniyaTikhomirova Oct 17, 2022
6247f8a
[ESIMD] Implement piEventGetInfo for event execution status
KseniyaTikhomirova Oct 18, 2022
4133862
Move comment to the right place
KseniyaTikhomirova Oct 18, 2022
79b2125
cv.notify_all should not be called under mutex paired with cv
KseniyaTikhomirova Oct 19, 2022
868973c
Remove default allocator check after SYCL2020 update
KseniyaTikhomirova Oct 21, 2022
1bc8e57
Update unittests due to default allocator check removal
KseniyaTikhomirova Oct 21, 2022
6e0943b
Update symbols after parameter removal
KseniyaTikhomirova Oct 21, 2022
1c62d08
Try to align hip context destruction handling with cuda WA
KseniyaTikhomirova Oct 26, 2022
30dfaf2
Merge branch 'sycl' into buff_detach
KseniyaTikhomirova Oct 27, 2022
60e3011
Fix unit test after mock plugin rework
KseniyaTikhomirova Nov 7, 2022
6964876
Merge branch 'sycl' into buff_detach
KseniyaTikhomirova Nov 16, 2022
3d5315e
DRAFT: try to release scheduler resources earlier using thread_local …
KseniyaTikhomirova Nov 16, 2022
467a9ea
Draft: try to release scheduler resources earlier, fix counter declar…
KseniyaTikhomirova Nov 17, 2022
0b9032a
Release scheduler resources earlier
KseniyaTikhomirova Nov 16, 2022
9d570ce
change location for buff release attempt
KseniyaTikhomirova Dec 5, 2022
c6d5dc7
Code cleanup
KseniyaTikhomirova Dec 6, 2022
a0b37ef
Code cleanup Part 2
KseniyaTikhomirova Dec 6, 2022
619ee4e
Revert "Try to align hip context destruction handling with cuda WA"
KseniyaTikhomirova Dec 6, 2022
3187f0a
Return cleanup deferred buffers to cleanupCommands call
KseniyaTikhomirova Dec 6, 2022
dbe88e2
Remove unnecessary variable in ObjectRefCounter
KseniyaTikhomirova Dec 6, 2022
06e2608
Fix hang
KseniyaTikhomirova Dec 6, 2022
71e9048
Fix comments
KseniyaTikhomirova Dec 6, 2022
a89e577
Merge branch 'sycl' into buff_detach
KseniyaTikhomirova Dec 6, 2022
ceea7f8
Prevent warning as error for release build
KseniyaTikhomirova Dec 7, 2022
1f201a9
wprotectMDeferredMemObjRelease modification with mutex
KseniyaTikhomirova Dec 7, 2022
10 changes: 9 additions & 1 deletion sycl/include/sycl/buffer.hpp
@@ -117,6 +117,8 @@ class __SYCL_EXPORT buffer_plain {

size_t getSize() const;

void handleRelease(bool DefaultAllocator) const;

std::shared_ptr<detail::buffer_impl> impl;
};

@@ -457,7 +459,13 @@ class buffer : public detail::buffer_plain {

buffer &operator=(buffer &&rhs) = default;

~buffer() = default;
~buffer() {
buffer_plain::
handleRelease(/*DefaultAllocator = */
std::is_same<
AllocatorT,
detail::sycl_memory_object_allocator<T>>::value);
}

bool operator==(const buffer &rhs) const { return impl == rhs.impl; }

36 changes: 34 additions & 2 deletions sycl/plugins/esimd_emulator/pi_esimd_emulator.cpp
@@ -1379,8 +1379,40 @@ pi_result piKernelRelease(pi_kernel) { DIE_NO_IMPLEMENTATION; }

pi_result piEventCreate(pi_context, pi_event *) { DIE_NO_IMPLEMENTATION; }

pi_result piEventGetInfo(pi_event, pi_event_info, size_t, void *, size_t *) {
DIE_NO_IMPLEMENTATION;
pi_result piEventGetInfo(pi_event Event, pi_event_info ParamName,
size_t ParamValueSize, void *ParamValue,
size_t *ParamValueSizeRet) {
if (ParamName != PI_EVENT_INFO_COMMAND_EXECUTION_STATUS) {
DIE_NO_IMPLEMENTATION;
}

auto CheckAndFillStatus = [&](const cm_support::CM_STATUS &State) {
pi_int32 Result = PI_EVENT_RUNNING;
[Review thread on this line]

@kbobrovs (Contributor), Oct 19, 2022:
Is this an appropriate default? @dongkyunahn-intel, your help is needed to review this part. I believe the Result initialization should be a switch on State.

@dongkyunahn-intel (Contributor):
I think it would be better to set Result with a switch/case statement, as Konst suggested. CM_STATUS_* are defined here:
https://github.com/intel/cm-cpu-emulation/blob/0c5fc287f34ae38d3184ab70ea5513d9fb1ff338/common/type_status.h#L13

@KseniyaTikhomirova (Author):
Hello @kbobrovs and @dongkyunahn-intel, I handled only two states intentionally, because the other statuses (!= CM_STATUS_FINISHED) seem logical to map to "running": they are all "work in progress", and the RT does not need more detail. I also used the L0 plugin as a reference, which likewise differentiates only two states: finished, and not finished = running. I think it is better to align the plugins and not introduce extra value handling at the level above. What do you think? Although CM_STATUS_RESET may need to be reported separately as an error; could you please explain what that status stands for?

@dongkyunahn-intel (Contributor):
I was not aware that L0 has such an implementation. Meanwhile, I looked into CM_EMU's GetStatus() implementation and found that it is a dummy returning only CM_STATUS_FINISHED. It now looks okay to maintain the current implementation, with comments about FINISHED/NOT_FINISHED. As for CM_STATUS_RESET: it is defined but never used in CM_EMU, so you can disregard it.

if (State == cm_support::CM_STATUS_FINISHED)
Result = PI_EVENT_COMPLETE;
if (ParamValue) {
if (ParamValueSize < sizeof(Result))
return PI_ERROR_INVALID_VALUE;
*static_cast<pi_int32 *>(ParamValue) = Result;
}
if (ParamValueSizeRet) {
*ParamValueSizeRet = sizeof(Result);
}
return PI_SUCCESS;
};
// A dummy event is already completed; its work was done by CM.
if (Event->IsDummyEvent)
return CheckAndFillStatus(cm_support::CM_STATUS_FINISHED);

if (Event->CmEventPtr == nullptr)
return PI_ERROR_INVALID_EVENT;

cm_support::CM_STATUS Status;
int32_t Result = Event->CmEventPtr->GetStatus(Status);
if (Result != cm_support::CM_SUCCESS)
return PI_ERROR_COMMAND_EXECUTION_FAILURE;

return CheckAndFillStatus(Status);
}

pi_result piEventGetProfilingInfo(pi_event Event, pi_profiling_info ParamName,
7 changes: 7 additions & 0 deletions sycl/source/buffer.cpp
@@ -121,6 +121,13 @@ void buffer_plain::addOrReplaceAccessorProperties(

size_t buffer_plain::getSize() const { return impl->getSizeInBytes(); }

void buffer_plain::handleRelease(bool DefaultAllocator) const {
// Try to detach the memory object only if impl is about to be released.
// A buffer copy will hold a pointer to the same impl.
if (impl.use_count() == 1)
impl->detachMemoryObject(impl, DefaultAllocator);
}

} // namespace detail
} // __SYCL_INLINE_VER_NAMESPACE(_V1)
} // namespace sycl
19 changes: 13 additions & 6 deletions sycl/source/detail/event_impl.cpp
@@ -78,17 +78,19 @@ void event_impl::waitInternal() {

void event_impl::setComplete() {
if (MHostEvent || !MEvent) {
std::unique_lock<std::mutex> lock(MMutex);
{
std::unique_lock<std::mutex> lock(MMutex);
#ifndef NDEBUG
int Expected = HES_NotComplete;
int Desired = HES_Complete;
int Expected = HES_NotComplete;
int Desired = HES_Complete;

bool Succeeded = MState.compare_exchange_strong(Expected, Desired);
bool Succeeded = MState.compare_exchange_strong(Expected, Desired);

assert(Succeeded && "Unexpected state of event");
assert(Succeeded && "Unexpected state of event");
#else
MState.store(static_cast<int>(HES_Complete));
MState.store(static_cast<int>(HES_Complete));
#endif
}
cv.notify_all();
return;
}
@@ -443,6 +445,11 @@ void event_impl::cleanDepEventsThroughOneLevel() {
}
}

bool event_impl::isCompleted() {
return get_info<info::event::command_execution_status>() ==
info::event_command_status::complete;
}

} // namespace detail
} // __SYCL_INLINE_VER_NAMESPACE(_V1)
} // namespace sycl
2 changes: 2 additions & 0 deletions sycl/source/detail/event_impl.hpp
@@ -234,6 +234,8 @@ class event_impl {
/// state.
bool isInitialized() const noexcept { return MIsInitialized; }

bool isCompleted();

private:
// When instrumentation is enabled emits trace event for event wait begin and
// returns the telemetry event generated for the wait
10 changes: 10 additions & 0 deletions sycl/source/detail/global_handler.cpp
@@ -47,6 +47,14 @@ T &GlobalHandler::getOrCreate(InstWithLock<T> &IWL, Types... Args) {
return *IWL.Inst;
}

void GlobalHandler::attachScheduler(Scheduler *Scheduler) {
// The method is for testing purposes. Do not protect with lock since
[Review thread on the comment above]

Reviewer (Contributor) suggested change:
  - // The method is for testing purposes. Do not protect with lock since
  + // The method is used in unittests only. Do not protect with lock since

@KseniyaTikhomirova (Author):
:-) fixed in 71e9048

// releaseResources will cause dead lock due to host queue release
if (MScheduler.Inst)
MScheduler.Inst->releaseResources();
MScheduler.Inst.reset(Scheduler);
}

Scheduler &GlobalHandler::getScheduler() { return getOrCreate(MScheduler); }

ProgramManager &GlobalHandler::getProgramManager() {
@@ -142,6 +150,8 @@ void GlobalHandler::unloadPlugins() {
}

void shutdown() {
if (GlobalHandler::instance().MScheduler.Inst)
GlobalHandler::instance().MScheduler.Inst->releaseResources();
// Ensure neither host task is working so that no default context is accessed
// upon its release
if (GlobalHandler::instance().MHostTaskThreadPool.Inst)
3 changes: 3 additions & 0 deletions sycl/source/detail/global_handler.hpp
@@ -75,6 +75,9 @@ class GlobalHandler {

void unloadPlugins();

// For testing purposes only
void attachScheduler(Scheduler *Scheduler);

private:
friend void releaseDefaultContexts();
friend void shutdown();
97 changes: 87 additions & 10 deletions sycl/source/detail/scheduler/scheduler.cpp
@@ -26,6 +26,18 @@ namespace sycl {
__SYCL_INLINE_VER_NAMESPACE(_V1) {
namespace detail {

bool Scheduler::checkLeavesCompletion(MemObjRecord *Record) {
for (Command *Cmd : Record->MReadLeaves) {
if (!Cmd->getEvent()->isCompleted())
return false;
}
for (Command *Cmd : Record->MWriteLeaves) {
if (!Cmd->getEvent()->isCompleted())
return false;
}
return true;
}

void Scheduler::waitForRecordToFinish(MemObjRecord *Record,
ReadLockT &GraphReadLock) {
#ifdef XPTI_ENABLE_INSTRUMENTATION
@@ -271,21 +283,17 @@ void Scheduler::removeMemoryObject(detail::SYCLMemObjI *MemObj) {
std::vector<std::shared_ptr<const void>> AuxResourcesToDeallocate;

{
MemObjRecord *Record = nullptr;
MemObjRecord *Record = MGraphBuilder.getMemObjRecord(MemObj);
if (!Record)
// No operations were performed on the mem object
return;

{
// This only needs a shared mutex as it only involves enqueueing and
// awaiting for events
ReadLockT Lock(MGraphLock);

Record = MGraphBuilder.getMemObjRecord(MemObj);
if (!Record)
// No operations were performed on the mem object
return;

waitForRecordToFinish(Record, Lock);
}

{
WriteLockT Lock(MGraphLock, std::defer_lock);
acquireWriteLock(Lock);
@@ -410,10 +418,31 @@ Scheduler::~Scheduler() {
"not all resources were released. Please be sure that all kernels "
"have synchronization points.\n\n");
}
// Note: releaseResources should be called before the Scheduler is deleted.
// Otherwise, objects that the Scheduler keeps as fields may need the
// Scheduler for their own release; they access it via
// GlobalHandler::getScheduler, which would create a new Scheduler object.
// The call is still kept here, but it should do almost nothing if
// releaseResources was called before.
releaseResources();
}

void Scheduler::releaseResources() {
// There might be some commands scheduled for post enqueue cleanup that
// haven't been freed because of the graph mutex being locked at the time,
// clean them up now.
cleanupCommands({});
DefaultHostQueue.reset();
[Review thread on this line]

Reviewer (Contributor):
It seems to be unsafe to release DefaultHostQueue here.

@KseniyaTikhomirova (Author):
Let's first get results to see if the solution helps.


// We need a loop because new objects may be added to the deferred mem
// objects storage during cleanup. Known example: we clean up existing
// deferred mem objects under the write lock; during this process we clean
// up commands related to the record; a command may hold the last reference
// to a queue_impl, so ~queue_impl is called and the buffer used for assert
// (created with a size only, so all conditions for deferred release are
// satisfied) is added to the deferred mem obj storage. Without the loop we
// could end up with a leak.
while (!isDeferredMemObjectsEmpty())
cleanupDeferredMemObjects(BlockingT::BLOCKING);
}

void Scheduler::acquireWriteLock(WriteLockT &Lock) {
@@ -442,8 +471,8 @@ MemObjRecord *Scheduler::getMemObjRecord(const Requirement *const Req) {
}

void Scheduler::cleanupCommands(const std::vector<Command *> &Cmds) {
if (Cmds.empty())
{
cleanupDeferredMemObjects(BlockingT::NON_BLOCKING);
if (Cmds.empty()) {
std::lock_guard<std::mutex> Lock{MDeferredCleanupMutex};
if (MDeferredCleanupCommands.empty())
return;
@@ -472,6 +501,54 @@ void Scheduler::cleanupCommands(const std::vector<Command *> &Cmds) {
}
}

void Scheduler::deferMemObjRelease(const std::shared_ptr<SYCLMemObjI> &MemObj) {
{
std::lock_guard<std::mutex> Lock{MDeferredMemReleaseMutex};
MDeferredMemObjRelease.push_back(MemObj);
}
cleanupDeferredMemObjects(BlockingT::NON_BLOCKING);
}

inline bool Scheduler::isDeferredMemObjectsEmpty() {
std::lock_guard<std::mutex> Lock{MDeferredMemReleaseMutex};
return MDeferredMemObjRelease.empty();
}

void Scheduler::cleanupDeferredMemObjects(BlockingT Blocking) {
if (isDeferredMemObjectsEmpty())
return;
if (Blocking == BlockingT::BLOCKING) {
std::vector<std::shared_ptr<SYCLMemObjI>> TempStorage;
{
std::lock_guard<std::mutex> LockDef{MDeferredMemReleaseMutex};
MDeferredMemObjRelease.swap(TempStorage);
}
// Any objects moved into TempStorage leave scope here and are deleted.
}

std::vector<std::shared_ptr<SYCLMemObjI>> ObjsReadyToRelease;
{

ReadLockT Lock = ReadLockT(MGraphLock, std::try_to_lock);
if (Lock.owns_lock()) {
// Not expected that Blocking == true will be used in parallel with
// adding MemObj to storage, no such scenario.
std::lock_guard<std::mutex> LockDef{MDeferredMemReleaseMutex};
auto MemObjIt = MDeferredMemObjRelease.begin();
while (MemObjIt != MDeferredMemObjRelease.end()) {
MemObjRecord *Record = MGraphBuilder.getMemObjRecord((*MemObjIt).get());
if (!checkLeavesCompletion(Record)) {
MemObjIt++;
continue;
}
ObjsReadyToRelease.push_back(*MemObjIt);
MemObjIt = MDeferredMemObjRelease.erase(MemObjIt);
}
}
}
// Any objects in ObjsReadyToRelease leave scope here and are deleted.
}
} // namespace detail
} // __SYCL_INLINE_VER_NAMESPACE(_V1)
} // namespace sycl
11 changes: 11 additions & 0 deletions sycl/source/detail/scheduler/scheduler.hpp
@@ -442,9 +442,13 @@ class Scheduler {
QueueImplPtr getDefaultHostQueue() { return DefaultHostQueue; }

static MemObjRecord *getMemObjRecord(const Requirement *const Req);
// Virtual for testing purposes only
[Review thread on "// Virtual for testing purposes only"]

Reviewer (Contributor):
Is it still relevant?

@KseniyaTikhomirova (Author):
Nope, fixed in 71e9048.

void deferMemObjRelease(const std::shared_ptr<detail::SYCLMemObjI> &MemObj);

Scheduler();
~Scheduler();
void releaseResources();
inline bool isDeferredMemObjectsEmpty();
[Review thread on this line]

Reviewer (Contributor):
Could you please clarify why inline is needed here?

@KseniyaTikhomirova (Author):
Fixed in 71e9048. You are right, the compiler will decide.


protected:
// TODO: after switching to C++17, change std::shared_timed_mutex to
@@ -464,6 +468,9 @@
static void enqueueLeavesOfReqUnlocked(const Requirement *const Req,
std::vector<Command *> &ToCleanUp);

// May lock graph with read and write modes during execution.
void cleanupDeferredMemObjects(BlockingT Blocking);

/// Graph builder class.
///
/// The graph builder provides means to change an existing graph (e.g. add
@@ -764,13 +771,17 @@
/// GraphReadLock will be unlocked/locked as needed. Upon return from the
/// function, GraphReadLock will be left in locked state.
void waitForRecordToFinish(MemObjRecord *Record, ReadLockT &GraphReadLock);
bool checkLeavesCompletion(MemObjRecord *Record);

GraphBuilder MGraphBuilder;
RWLockT MGraphLock;

std::vector<Command *> MDeferredCleanupCommands;
std::mutex MDeferredCleanupMutex;

std::vector<std::shared_ptr<SYCLMemObjI>> MDeferredMemObjRelease;
std::mutex MDeferredMemReleaseMutex;

QueueImplPtr DefaultHostQueue;

friend class Command;
14 changes: 13 additions & 1 deletion sycl/source/detail/sycl_mem_obj_t.cpp
@@ -31,7 +31,7 @@ SYCLMemObjT::SYCLMemObjT(pi_native_handle MemObject, const context &SyclContext,
MInteropContext(detail::getSyclObjImpl(SyclContext)),
MOpenCLInterop(true), MHostPtrReadOnly(false), MNeedWriteBack(true),
MUserPtr(nullptr), MShadowCopy(nullptr), MUploadDataFunctor(nullptr),
MSharedPtrStorage(nullptr) {
MSharedPtrStorage(nullptr), MHostPtrProvided(true) {
if (MInteropContext->is_host())
throw sycl::invalid_parameter_error(
"Creation of interoperability memory object using host context is "
@@ -147,6 +147,18 @@ void SYCLMemObjT::determineHostPtr(const ContextImplPtr &Context,
} else
HostPtrReadOnly = false;
}

void SYCLMemObjT::detachMemoryObject(const std::shared_ptr<SYCLMemObjT> &Self,
bool DefaultAllocator) const {
// Check MRecord without the read lock because at this point we expect that
// no commands operating on the buffer can be created. MRecord is nullptr on
// buffer creation and is set to a meaningful value only if an operation on
// the buffer was submitted inside an addCG call. addCG is called from
// queue::submit, and buffer destruction cannot overlap with it.
if (MRecord && !MHostPtrProvided && DefaultAllocator)
Scheduler::getInstance().deferMemObjRelease(Self);
}

} // namespace detail
} // __SYCL_INLINE_VER_NAMESPACE(_V1)
} // namespace sycl