Add test for vk_buffer from host memory#19254
Add test for vk_buffer from host memory#19254sredman wants to merge 5 commits intoggml-org:masterfrom
Conversation
|
I don't mind adding further backend-internal tests, but I think we should move them out of the main file. I plan to look into splitting it up in the future. Maybe we can start here by adding a ggml-vulkan-tests.cpp and .hpp? You don't have to move my tests as well, but you can if you want. |
Do you have the shape of a plan for this? When I was originally working on this test, I tried to make test file in the ggml/tests path. That would have required an extensive refactoring in order to be able to reference the internal functions. We would need to mark any functions we want to call as non-static and export them. However, the more messy problem was that we'd need to extract all the struct definitions which we want to use, which is quite an undertaking. It could certainly be done, but would definitely need to be prepared and checked in quickly to avoid mass merge conflicts 🙂 |
|
That makes sense, I hadn't looked into it enough yet. Then I'll postpone that until I get to it myself. However, please move your test into the existing |
…ML_VULKAN_RUN_TESTS block
…ation, move callsite to ggml_vk_buffer_from_host_ptr
…er test call location
|
@0cc4m , I have moved the test declaration into the preexisting block of tests, and the call into the preexisting block of calls (in |
There was a problem hiding this comment.
Note that this abort needs to be commented out in order to run the new tests.
There was a problem hiding this comment.
Feel free to remove it, I must have accidentally merged it. You can also move your test to the top, provided it fails non-fatally. The matmul tests take a while to run, and I haven't really needed them anymore in a long time.
0cc4m
left a comment
There was a problem hiding this comment.
Sorry about the delay. I ran the test and it works. It showed me that buffer_from_host_ptr does not work on my Intel iGPU, at least not in this way. I went through the code and added some comments. Once those are resolved it's good to merge.
| static std::vector<vk_device> tested_devices; | ||
| if (std::find(tested_devices.begin(), tested_devices.end(), device) == tested_devices.end()) { | ||
| tested_devices.push_back(device); | ||
| } else{ | ||
| return; // Already tested this device, skip | ||
| } |
There was a problem hiding this comment.
The existing tests just run once and abort, it's not intended to keep them active during runtime, so this test isn't necessary.
| return; // Already tested this device, skip | ||
| } | ||
|
|
||
| VK_LOG_DEBUG("ggml_vk_test_single_device_external_memory(" << device->name << ") - Single-device external host memory test"); |
There was a problem hiding this comment.
The debug name is inconsistent with the function name, and the additional text is unusual. Was that to make it easier to spot in the debug log?
| std::unique_ptr<float[], decltype(input_deleter)> input_host_memory( | ||
| static_cast<float*>(aligned_alloc(min_alignment, aligned_size)), | ||
| input_deleter | ||
| ); |
There was a problem hiding this comment.
This is a little convoluted compared to just going float * input_host_memory = (float *)aligned_alloc(...) and adding the free() call to the end, but if you prefer this, I don't mind.
However, to make it easier to understand what is going on, please check if the alloc succeeded and print an error message if not. Same for the other alloc below.
| vk_buffer device_dst_buffer = ggml_vk_create_buffer_device(device, test_buffer_size); | ||
|
|
||
| // Allocate an external memory buffer on device, importing the host memory | ||
| vk_buffer external_source_buffer = ggml_vk_buffer_from_host_ptr(device, input_host_memory.get(), test_buffer_size); |
There was a problem hiding this comment.
Same thing as above, to make this easier to understand please check if this actually succeeded and print an error if not. In my test it returned a nullptr and this just crashed the copy_async functions below, so I needed a debugger to figure out which function failed.
| free(d_chk); | ||
| } | ||
|
|
||
| static void ggml_vk_test_single_device_buffer_from_host_ptr(vk_device& device) { |
There was a problem hiding this comment.
| static void ggml_vk_test_single_device_buffer_from_host_ptr(vk_device& device) { | |
| static void ggml_vk_test_device_buffer_from_host_ptr(vk_device& device) { |
There was a problem hiding this comment.
Feel free to remove it, I must have accidentally merged it. You can also move your test to the top, provided it fails non-fatally. The matmul tests take a while to run, and I haven't really needed them anymore in a long time.
Add a test to validate external host memory can correctly be allocated, written, used, and read.
I already had this test for my own external memory buffer solution, but then I noticed #18467 had done the work. I thought the test might still be useful.