Conversation
|
@slaren Since we can put SYCL related code under a directory instead of a single file, I might introduce headers-only library for performance optimization, as well as simplifying our effort too (my job during work time 😁 ) @ggerganov @mingfeima for aware |
I think that's good, I plan to start using CUTLASS in the CUDA backend as well. |
|
It's great to see the new structure. |
|
This is a good refactoring, and would be helpful for debug . I would suggest waiting for some iq quant prs and then resume work on this. |
yes, drop a note when you finished. |
|
|
for sub folder: dpct
|
|
There was a lot of changes which caused conflict. Unable to see how to easily resolve it. @airMeng can you see if there is much that needs to be fixed? |
never mind. just more time needed. |
|
@NeoZhangJianyu @luoyu-intel @AidanBeltonS refactored after #7710, please have a review. |
luoyu-intel
left a comment
There was a problem hiding this comment.
Why still keep some kernels in the ggml-sycl.cpp file?
ggml-sycl.h
Outdated
|
|
||
| #define GGML_SYCL_MAX_DEVICES 48 | ||
| #define GGML_SYCL_NAME "SYCL" | ||
|
|
There was a problem hiding this comment.
These two macro would be used by external caller.
So, don't move to presets.hpp.
Suggest all exports function/variable/macro are defined in ggml-sycl.h
There was a problem hiding this comment.
LGTM! ping @joeatodd @ggerganov for a look when available.








according to #5277 (reply in thread), the PR does the following:
- [ ] let the common backend to handle H2D/D2H memcpy.let the PR as simple as possible