-
Notifications
You must be signed in to change notification settings - Fork 809
[SYCL] [DOC] Prepare design-document for assert feature #3461
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 8 commits
2911ea7
b69a1cd
15ea88e
ca08fec
1f8d9a9
2ee590c
77699a2
001a573
32b6479
b8637c2
b0cd85f
8c03648
121c945
13b40fd
a4b4884
c06db5f
823124a
a99368b
78d7fcb
6882e95
32663e0
2b84a83
423107b
7611511
a31b808
257054a
3f50173
c1326aa
5095b1a
5078fcc
4dc7b1f
9bcac02
7ec3ac8
8cbfde7
cc085f5
8835bf8
8835756
ecb8659
07debdb
995e4d8
b57ac48
d2f13ff
6281bc5
a5461f3
32a32f4
641d071
dc058a9
16fd8f0
fbca768
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,250 @@ | ||||||||||
| # Assert feature | ||||||||||
|
|
||||||||||
| **IMPORTANT**: This document is a draft. | ||||||||||
|
|
||||||||||
| During debugging of kernel code user may put assertions here and there. | ||||||||||
| The expected behaviour of assertion failure at host is application abort. | ||||||||||
| Our choice for device-side assertions is asynchronous exception in order to | ||||||||||
| allow for extensibility. | ||||||||||
|
|
||||||||||
| The user is free to disable assertions by defining `NDEBUG` macro at | ||||||||||
| compile-time. | ||||||||||
|
|
||||||||||
|
|
||||||||||
| ## Use-case example | ||||||||||
|
|
||||||||||
| ``` | ||||||||||
| using namespace cl::sycl; | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
| auto ErrorHandler = [] (exception_list Exs) { | ||||||||||
| for (exception_ptr const& E : Exs) { | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
| try { | ||||||||||
| std::rethrow_exception(E); | ||||||||||
| } | ||||||||||
| catch (event_error const& Ex) { | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
| std::cout << “Exception - ” << Ex.what(); // assertion failed | ||||||||||
| std::abort(); | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
| } | ||||||||||
| } | ||||||||||
| }; | ||||||||||
|
|
||||||||||
| void user_func(item<2> Item) { | ||||||||||
| assert((Item[0] % 2) && “Nil”); | ||||||||||
| } | ||||||||||
|
|
||||||||||
| int main() { | ||||||||||
| queue Q(ErrorHandler); | ||||||||||
| q.submit([&] (handler& CGH) { | ||||||||||
| CGH.parallel_for<class TheKernel>(range<2>{N, M}, [=](item<2> It) { | ||||||||||
| do_smth(); | ||||||||||
| user_func(It); | ||||||||||
| do_smth_else(); | ||||||||||
| }); | ||||||||||
| }); | ||||||||||
| Q.wait_and_throw(); | ||||||||||
| std::cout << “One shouldn’t see this message.“; | ||||||||||
| return 0; | ||||||||||
| } | ||||||||||
| ``` | ||||||||||
|
|
||||||||||
| In this use-case every work-item with even X dimension will trigger assertion | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
| failure. Assertion failure should be reported via asynchronous exceptions. If | ||||||||||
| asynchronous exception handler is set the failure is reported with | ||||||||||
| `cl::sycl::event_error` exception. Otherwise, SYCL Runtime should trigger abort. | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
| At least one failed assertion should be reported. The assertion failure message | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
| is printed to `stderr` by SYCL Runtime. | ||||||||||
|
|
||||||||||
| When multiple kernels are enqueued and both fail at assertion at least single | ||||||||||
| assertion should be reported. | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
|
|
||||||||||
|
|
||||||||||
| ## User requirements | ||||||||||
|
|
||||||||||
| From user's point of view there are the following requirements: | ||||||||||
|
|
||||||||||
| | # | Title | Description | Importance | | ||||||||||
| | - | ----- | ----------- | ---------- | | ||||||||||
| | 1 | Handle assertion failure | Signal about assertion failure via SYCL asynchronous exception | Must have | | ||||||||||
| | 2 | Print assert message | Assert function should print message to stderr at host | Must have | | ||||||||||
| | 3 | Stop under debugger | When debugger is attached, break at assertion point | Highly desired | | ||||||||||
| | 4 | Reliability | Assert failure should be reported regardless of kernel deadlock | Highly desired | | ||||||||||
|
|
||||||||||
s-kanaev marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||||||
|
|
||||||||||
| ## Contents of `cl::sycl::event_error` | ||||||||||
|
|
||||||||||
| Interface of `cl::sycl::event_error` should look like: | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
| ``` | ||||||||||
| class event_error : public runtime_error { | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
| public: | ||||||||||
| event_error() = default; | ||||||||||
|
|
||||||||||
| event_error(const char *Msg, cl_int Err) | ||||||||||
| : event_error(string_class(Msg), Err) {} | ||||||||||
|
|
||||||||||
| event_error(const string_class &Msg, cl_int Err) : runtime_error(Msg, Err) {} | ||||||||||
| }; | ||||||||||
| ``` | ||||||||||
|
|
||||||||||
| Regardless of whether asynchronous exception handler is set or not, there's an | ||||||||||
| action to be performed by SYCL Runtime. To achieve this, information about | ||||||||||
| assert failure should be propagated from device-side to SYCL Runtime. This | ||||||||||
| should be performed via calls to `clGetEventInfo` for OpenCL backend and | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
| `zeEventQueryStatus` for Level-Zero backend. | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
|
|
||||||||||
|
|
||||||||||
| ## Terms | ||||||||||
|
|
||||||||||
| - Device-side Runtime - part of device-code, which is supplied by Device-side | ||||||||||
| Compiler. | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
| - Device-side Compiler - compiler which generates device-native binary image | ||||||||||
| based on input SPIR-V image. | ||||||||||
| - Low-level Runtime - the backend/runtime behind DPCPP Runtime. | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
| - Accessor metadata - parts of accessor representation at device-side: pointer, | ||||||||||
| ranges, offset. | ||||||||||
|
|
||||||||||
|
|
||||||||||
| ## How it works? | ||||||||||
|
|
||||||||||
| For the time being, `assert(expr)` macro ends up in call to | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
| `__devicelib_assert_fail` function. This function is part of [Device library extension](extensions/C-CXX-StandardLibrary/DeviceLibExtensions.rst#cl_intel_devicelib_cassert). | ||||||||||
| Device code already contains call to the function. Currently, a device-binary | ||||||||||
| is always linked against fallback implementation. | ||||||||||
|
|
||||||||||
|
|
||||||||||
| ### Device-specific approach | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
|
|
||||||||||
| Device-side compiler/linker provides their implementation of `__devicelib_assert_fail` | ||||||||||
| and prefer this implementation over fallback one. | ||||||||||
|
|
||||||||||
| If Device-side Runtime supports `__devicelib_assert_fail` then Low-Level Runtime | ||||||||||
| is responsible for: | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
| - detecting if assert failure took place; | ||||||||||
| - flushing assert message to `stderr` on host. | ||||||||||
s-kanaev marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||||||
| When detected, Low-level Runtime reports assert failure to DPCPP Runtime | ||||||||||
| at synchronization points. | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
|
|
||||||||||
| Refer to [OpenCL](extensions/Assert/opencl.md) and [Level-Zero](extensions/Assert/level-zero.md) | ||||||||||
| extensions. | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
|
|
||||||||||
|
|
||||||||||
| ### Device-agnostic approach | ||||||||||
|
|
||||||||||
| If Device-side Runtime doesn't support `__devicelib_assert_fail` then a buffer | ||||||||||
| based approach comes in place. The approach doesn't require any support from | ||||||||||
| Device-side Runtime and Compiler. Neither it does from Low-level Runtime. | ||||||||||
|
|
||||||||||
| Within this approach, a dedicated assert buffer is allocated and implicit kernel | ||||||||||
| argument is introduced. The argument is an accessor with `discard_read_write` | ||||||||||
| or `discard_write` access mode. Accessor metadata is stored to program scope | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
| variable. This allows to refer to the accessor without modifying each and every | ||||||||||
| user's function. Fallback implementation of `__devicelib_assert_fail` restores | ||||||||||
| accessor metadata from program scope variable and writes assert information to | ||||||||||
| the assert buffer. Atomic operations are used in order to not overwrite existing | ||||||||||
| information. | ||||||||||
|
|
||||||||||
| DPCPP Runtime checks contents of the assert buffer for assert failure flag after | ||||||||||
| kernel finishes. | ||||||||||
|
|
||||||||||
| Both storing of accessor metadata and writing assert failure is performed with | ||||||||||
| help of built-ins. Implementations of these builtins are substituted by | ||||||||||
| frontend. | ||||||||||
|
|
||||||||||
| #### Built-ins operation | ||||||||||
|
|
||||||||||
| Accessor is a pointer augmented with offset and two ranges (access range and | ||||||||||
| memory range). | ||||||||||
|
|
||||||||||
| There are two built-ins provided by frontend: | ||||||||||
| * `__store_acc()` - to store accessor metadata into program-scope variable. | ||||||||||
| * `__store_assert_failure()` - to store flag about assert failure in a buffer | ||||||||||
| using the metadata stored in program-scope variable. | ||||||||||
|
|
||||||||||
| The accessor should be stored to program scope variable in global address space | ||||||||||
| using atomic operations. Motivation for using atomic operations: the program may | ||||||||||
| contain several kernels and some of them could be running simultaneously on a | ||||||||||
| single device. | ||||||||||
|
|
||||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Besides, it is not always possible to build a callgraph for a kernel, so, this should be implemented too:
Suggested change
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @kbobrovs: This is probably not how it will work. My understanding is that the extension to allow indirect calls in device code is not yet complete or approved. Last time we talked about this in the language evolution group, we identify ways to narrow down the set of functions that could possibly be called from any one call site. I expect that all parts of DPC++ that need to form a static call graph will use these techniques to limit the call graph when there are indirect function calls. Since the indirect function call extension isn't finished yet, it seems better to me that we don't speculate here on how we will handle that case.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @gmlueck, in future - maybe, but I'm looking at today's function pointers implementation which works and needs to be covered by this design. Once something changes with function pointers spec, we will update. Alternatively, the design can mention that the callgraph may be inaccurate in presence of function pointers, and compiler must mark a kernel as "using assert" always, unless it can prove (using future language hints or analysis) that the kernel can never call an indirectly callable function whose (possibly inaccurate) callgraph contains a call to
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added suggested description. |
||||||||||
| The `__store_assert_failure()` built-in atomically sets a flag in a buffer. The | ||||||||||
| buffer is accessed using accessor metadata from program-scope variable. This | ||||||||||
| built-in return a boolean value which is `true` if the flag is set by this call | ||||||||||
| to `__store_assert_failure()` and `false` if the flag was already set. | ||||||||||
| Motivation for using atomic operation is the same as with `__store_acc()` | ||||||||||
| builtin. | ||||||||||
|
|
||||||||||
| The following pseudo-code snippets shows how these built-ins are used. | ||||||||||
| First of all, assume the following code as user's one: | ||||||||||
| ``` | ||||||||||
| void user_func(int X) { | ||||||||||
| assert(X && “X is nil”); | ||||||||||
| } | ||||||||||
|
|
||||||||||
| int main() { | ||||||||||
| queue Q(...); | ||||||||||
| Q.submit([&] (handler& CGH) { | ||||||||||
| CGH.single_task([=] () { | ||||||||||
| do_smth(); | ||||||||||
| user_func(0); | ||||||||||
| do_smth_else(); | ||||||||||
| }); | ||||||||||
| }); | ||||||||||
| ... | ||||||||||
| } | ||||||||||
| ``` | ||||||||||
|
|
||||||||||
| The following LLVM IR pseudo code will be generated for the user's code: | ||||||||||
| ``` | ||||||||||
| @AssertBufferPtr = global void* null | ||||||||||
| @AssertBufferAccessRange = ... | ||||||||||
| @AssertBufferMemoryRange = ... | ||||||||||
| @AssertBufferOffset = ... | ||||||||||
|
|
||||||||||
| /// user's code | ||||||||||
| void user_func(int X) { | ||||||||||
| if (!(X && “X is nil")) { | ||||||||||
| __assert_fail(...); | ||||||||||
| } | ||||||||||
| } | ||||||||||
|
|
||||||||||
| users_kernel(...) { | ||||||||||
| do_smth() | ||||||||||
| user_func(0); | ||||||||||
| do_smth_else(); | ||||||||||
| } | ||||||||||
|
|
||||||||||
| /// a wrapped user's kernel | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
| kernel(AssertBufferAccessor, OtherArguments...) { | ||||||||||
| __store_acc(AssertBufferAccessor); | ||||||||||
| users_kernel(OtherArguments...); | ||||||||||
| } | ||||||||||
|
|
||||||||||
| /// __assert_fail belongs to Linux version of devicelib | ||||||||||
| void __assert_fail(...) { | ||||||||||
| ... | ||||||||||
| __devicelib_assert_fail(...); | ||||||||||
| } | ||||||||||
|
|
||||||||||
| void __devicelib_assert_fail(Expr, File, Line, GlobalID, LocalID) { | ||||||||||
| ... | ||||||||||
| if (__store_assert_info()) | ||||||||||
| printf("Assertion `%s' failed in %s at line %i. GlobalID: %i, LocalID: %i", | ||||||||||
| Expr, File, Line, GlobalID, LocalID); | ||||||||||
s-kanaev marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||||||
| } | ||||||||||
|
|
||||||||||
| /// The following are built-ins provided by frontend | ||||||||||
| void __store_acc(accessor) { | ||||||||||
| %1 = accessor.getPtr(); | ||||||||||
| store void * %1, void * @AssertBufferPtr | ||||||||||
| } | ||||||||||
|
|
||||||||||
| bool __store_assert_info(...) { | ||||||||||
| AssertBAcc = __fetch_acc(); | ||||||||||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||||||||||
| // fill in data in AsBAcc | ||||||||||
| volatile int *Ptr = (volatile int *)AssertBAcc.getPtr(); | ||||||||||
| bool Expected = false; | ||||||||||
| bool Desired = true; | ||||||||||
|
|
||||||||||
| return atomic_cas(Ptr, Expected, Desired, SequentialConsistentMemoryOrder); | ||||||||||
| // or it could be: | ||||||||||
| // return !atomic_exchange(Ptr, Desired, SequentialConsistentMemoryOrder); | ||||||||||
| } | ||||||||||
| ``` | ||||||||||
|
|
||||||||||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| # Overview | ||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
|
|
||
| This extension enables detection of assert failure of kernel. | ||
|
|
||
| # New enum value | ||
|
|
||
| `ze_result_t` enumeration should be augmented with `ZE_RESULT_ASSERT_FAILED` | ||
| enum element. This enum value indicated a detected assert failure at | ||
| device-side. | ||
|
|
||
| # Changed API | ||
|
|
||
| ``` | ||
| ze_event_handle_t Event; // describes an event of kernel been submitted previously | ||
| ze_result Result = zeEventQueryStatus(Event); | ||
| ``` | ||
|
|
||
| If kernel failed an assertion `zeEventQueryStatus` should return | ||
|
||
| `ZE_RESULT_ASSERT_FAILED`. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| # Overview | ||
|
|
||
| This extension enables detection of assert failure of kernel. | ||
s-kanaev marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
|
|
||
| # New error code | ||
|
|
||
| `CL_ASSERT_FAILURE` is added to indicate a detected assert failure at | ||
| device-side. | ||
|
|
||
| # Changed API | ||
|
|
||
| ``` | ||
| cl_event Event; // describes an event of kernel been submitted previously | ||
| cl_int Result; | ||
| size_t ResultSize; | ||
|
|
||
| clGetEventInfo(Event, CL_EVENT_COMMAND_EXECUTION_STATUS, sizeof(Result), &Result, &ResultSize); | ||
| ``` | ||
|
|
||
| If kernel failed an assertion `clGetEventInfo` should put `CL_ASSERT_FAILURE` | ||
| in `Result`. | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.