metal : fix out-of-bounds access + style changes #2416
|
I just checked that for a 33B model, which has Not sure if there are more bugs. I am still waiting to download a 65B model to test if this change will work. |
|
Change I can complete the execution of the |
|
Confirmed that the above changes let ggml_metal_graph_find_concurrency() run to completion with a 65B model. |
|
Need to check if concurrency feature still works after #2411 |
|
It should work as before, since the graph allocator is disabled with Metal. If it works with the allocator, it will only be by chance: there is no guarantee that operations executed concurrently won't write to the same memory address. |
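To make the hazard concrete, here is a minimal sketch of the kind of check a concurrency pass would need before issuing two nodes together when an allocator is free to reuse buffers. The struct mem_range type and the ranges_overlap helper are hypothetical names for illustration only; they are not part of ggml or its allocator.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the byte range a node writes to. */
struct mem_range {
    uintptr_t start; /* address of the first byte */
    size_t    size;  /* length in bytes */
};

/* Returns true if the half-open ranges [start, start + size) share a byte.
 * Empty ranges never overlap anything. */
static bool ranges_overlap(struct mem_range a, struct mem_range b) {
    if (a.size == 0 || b.size == 0) {
        return false;
    }
    return a.start < b.start + b.size && b.start < a.start + a.size;
}
```

Two nodes whose output ranges overlap under this test could not safely run concurrently; adjacent ranges that only touch at a boundary do not overlap.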
|
The concurrency feature does work. I just checked it. |
|
If the allocator works with Metal when the concurrency optimization is disabled, it would be better to enable it and try to implement smarter concurrency logic that is compatible with the allocator. |
|
The allocator should work with Metal without concurrency if |
|
I think the allocator gains are bigger compared to the concurrency optimization, so we should enable it for Metal builds, and try to improve it to do concurrency detection as you suggest. |
|
Actually it’s quite easy to make concurrency optimization work with the new allocator. We only need to pass We will use a little bit more memory because we have to reserve memory for nodes that can be issued concurrently:
I think we can merge this PR for now. I may open a separate PR later this week to make concurrency optimization work with the new allocator. |
|
OK, let's apply the array size fix and merge. |
ref #2413
This line could perform an out-of-bounds access when the concur_list element is -1.
However, there is still a bug in the logic. See #2413 for a description.
For now, I've disabled the concurrency optimization on master. We should try to fix and re-enable it.
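The out-of-bounds case described above can be sketched as follows. This is an illustration under stated assumptions, not the actual change in this PR: it assumes -1 is used in concur_list as a sentinel rather than a node index, and sum_node_sizes, node_sizes, and CONCUR_BARRIER are hypothetical names introduced only for this example.

```c
#include <stddef.h>

/* Assumed sentinel: an entry of -1 marks a boundary, not a node index. */
#define CONCUR_BARRIER (-1)

/* Hypothetical helper: sums the sizes of the nodes referenced by
 * concur_list, checking every entry before it is used as an index. */
static size_t sum_node_sizes(const int *concur_list, int n_list,
                             const size_t *node_sizes, int n_nodes) {
    size_t total = 0;
    for (int k = 0; k < n_list; ++k) {
        const int idx = concur_list[k];
        if (idx == CONCUR_BARRIER) {
            continue; /* indexing node_sizes[-1] would read out of bounds */
        }
        if (idx < 0 || idx >= n_nodes) {
            continue; /* defensively skip any other invalid index */
        }
        total += node_sizes[idx];
    }
    return total;
}
```

The point is only that the sentinel must be tested before the entry is used to index the node array; what the real code does at a boundary is not shown here.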