metal : fix out-of-bounds access + style changes by ggerganov · Pull Request #2416 · ggml-org/llama.cpp

ggerganov · 2023-07-27T07:12:15Z

This line could perform out-of-bounds access when the concur_list element is -1:

if (gf->nodes[ctx->concur_list[j]] == src_cur) {is_found = 1; break;}

However, there is still bug in the logic. See #2413 for description.
For now, I've disabled the concurrency optimization on master. We should try to fix and re-enable it

lshzh-ww · 2023-07-27T14:40:49Z

I just checked that for a 33B model, which has n_nodes 2224, concur_list_len is 3668. For 65B model with n_nodes 2964 we certainly hit the maximum length of concur_list. We should set the length of concur_list to 2*GGML_MAX_NODES. In this case, even if there is no concurrency between the nodes and we have to insert a barrier between every two operations, we still won't hit the limit.

Not sure if there are more bugs. I am still waiting to download a 65B model to test if this change will work.

lshzh-ww · 2023-07-27T15:33:36Z

Change int concur_list[GGML_MAX_NODES]; to int concur_list[2 * GGML_MAX_NODES]; in ggml_metal_context. Also change following lines in ggml_metal_graph_find_concurrency accordingly.

for (int i = 0; i < 2 * GGML_MAX_NODES; i++) { ctx->concur_list[i] = 0; }
......
if (ctx->concur_list_len > 2 * GGML_MAX_NODES) {
    fprintf(stderr, "%s: too many elements for metal ctx->concur_list!\n", __func__);
}

I can complete the execution of the ggml_metal_graph_find_concurrency with suggested changes. My laptop only has 32GB memory so I can't really run the inference. Still we need someone with 64GB memory to test if this change works.

ProjectAtlantis-dev · 2023-07-30T22:06:26Z

Confirmed that above changes result in ggml_metal_graph_find_concurrency() running to completion with 65B

ggerganov · 2023-07-31T07:28:43Z

Need to check if concurrency feature still works after #2411

slaren · 2023-07-31T14:04:19Z

It should work as before, since the graph allocator is disabled with Metal. If it works with the allocator it will only be by chance, there is no guarantee that the operations executed concurrently won't write to the same memory address.

lshzh-ww · 2023-07-31T14:24:43Z

The concurrency feature does work. I just checked it.

ggerganov · 2023-07-31T15:58:39Z

If the allocator works with Metal and disabled concurrency optimization, it would be better to enable it and try to implement a smarter concurrency logic that would be compatible with the allocator.

slaren · 2023-07-31T16:02:08Z

The allocator should work with Metal without concurrency if buf_alloc is registered with ggml_metal_add_buffer. However, this is not a problem that the concurrency algorithm can solve, this needs to be fixed in the allocator so that concurrent operations are assigned different addresses.

ggerganov · 2023-07-31T16:08:24Z

I think the allocator gains are bigger compared to the concurrency optimization, so we should enable it for Metal builds, and try to improve it to do concurrency detection as you suggest.

lshzh-ww · 2023-08-01T06:06:56Z

Actually it’s quite easy to make concurrency optimization work with the new allocator. We only need to pass metal_ctx->concur_list to the allocator, and then let the allocator parse all the nodes in order of metal_ctx->concur_list.

We will use a little bit more memory because we have to reserve memory for nodes that can be issued concurrently:

Memory difference	n_batch=512	n_batch=128
33B n_ctx=512	+21MB	+5.3MB
33B n_ctx=2048	+27MB	+6.8MB

I think we can merge this PR for now. I may open a separate PR later this week to make concurrency optimization work with the new allocator.

ggerganov · 2023-08-01T08:06:49Z

OK, lets apply the array size fix and merge

metal : fix out-of-bounds access + style changes

1038d1d

ggerganov requested a review from lshzh-ww July 27, 2023 07:12

ggerganov mentioned this pull request Jul 27, 2023

Graph build hangs during model warmup #2413

Closed

lshzh-ww approved these changes Jul 27, 2023

View reviewed changes

metal : increase concurrency nodes to 2*GGML_MAX_NODES

30ea0e1

ggerganov merged commit f6f9896 into master Aug 7, 2023

ggerganov deleted the fix-concurrency branch August 7, 2023 07:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

metal : fix out-of-bounds access + style changes#2416

metal : fix out-of-bounds access + style changes#2416
ggerganov merged 2 commits intomasterfrom
fix-concurrency

ggerganov commented Jul 27, 2023 •

edited

Loading

Uh oh!

lshzh-ww commented Jul 27, 2023

Uh oh!

lshzh-ww commented Jul 27, 2023 •

edited

Loading

Uh oh!

ProjectAtlantis-dev commented Jul 30, 2023

Uh oh!

ggerganov commented Jul 31, 2023

Uh oh!

slaren commented Jul 31, 2023

Uh oh!

lshzh-ww commented Jul 31, 2023

Uh oh!

ggerganov commented Jul 31, 2023

Uh oh!

slaren commented Jul 31, 2023 •

edited

Loading

Uh oh!

ggerganov commented Jul 31, 2023

Uh oh!

lshzh-ww commented Aug 1, 2023 •

edited

Loading

Uh oh!

ggerganov commented Aug 1, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

ggerganov commented Jul 27, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lshzh-ww commented Jul 27, 2023

Uh oh!

lshzh-ww commented Jul 27, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ProjectAtlantis-dev commented Jul 30, 2023

Uh oh!

ggerganov commented Jul 31, 2023

Uh oh!

slaren commented Jul 31, 2023

Uh oh!

lshzh-ww commented Jul 31, 2023

Uh oh!

ggerganov commented Jul 31, 2023

Uh oh!

slaren commented Jul 31, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ggerganov commented Jul 31, 2023

Uh oh!

lshzh-ww commented Aug 1, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ggerganov commented Aug 1, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

ggerganov commented Jul 27, 2023 •

edited

Loading

lshzh-ww commented Jul 27, 2023 •

edited

Loading

slaren commented Jul 31, 2023 •

edited

Loading

lshzh-ww commented Aug 1, 2023 •

edited

Loading