UPSTREAM PR #17174: server: (refactor) implement generator-based API for task results #170

Closed
DajanaV wants to merge 6 commits into main from upstream-PR17174-branch_ngxson-xsn/server_response_generator_refactor

@DajanaV (Collaborator) commented Nov 11, 2025

Mirrored from ggml-org/llama.cpp#17174

This PR adds a generator-based API for receiving task results. It aims to reduce the use of callback functions, so that the code reads more linearly and is easier to follow.

It also allows the server to return the correct HTTP error code in the streaming case; ref: ggml-org/llama.cpp#16486 (comment)

Example:

server_response_generator gen(ctx_server);
{
    std::vector<server_task> tasks;
    // ... populate tasks ...
    gen.post_tasks(std::move(tasks));
}

// wait for the results
auto all_results = gen.wait_for_all(req.is_connection_closed);

// collect results
if (all_results.is_terminated) {
    return; // connection is closed
} else if (all_results.error) {
    res_error(res, all_results.error->to_json());
    return;
} else {
    // named `result` so it does not shadow the `res` passed to res_error() above
    for (auto & result : all_results.results) {
        GGML_ASSERT(dynamic_cast<server_task_result_embd*>(result.get()) != nullptr);
        responses.push_back(result->to_json());
    }
}
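
The streaming case mentioned above can be illustrated with a small, self-contained sketch. It is not the code from this PR: `toy_result`, `toy_generator`, `toy_response`, `stream_response` and the status values 200/500 are all hypothetical names and numbers chosen for illustration, and C++17 is assumed for `std::optional`. The idea it tries to show is that a pull-style generator lets the handler inspect the first result before committing to an HTTP status, which is hard to do when results arrive through callbacks after the response has already started.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a task result; not the llama.cpp type.
struct toy_result {
    bool        is_error;
    std::string payload;
};

// Hypothetical pull-style generator: the caller asks for the next result
// instead of registering a callback.
class toy_generator {
public:
    explicit toy_generator(std::deque<toy_result> pending) : pending_(std::move(pending)) {}

    // Returns std::nullopt when the caller should stop, either because
    // should_stop() reports a closed connection or because nothing is left.
    std::optional<toy_result> next(const std::function<bool()> & should_stop) {
        if (should_stop() || pending_.empty()) {
            return std::nullopt;
        }
        toy_result r = std::move(pending_.front());
        pending_.pop_front();
        return r;
    }

private:
    std::deque<toy_result> pending_;
};

// Hypothetical response object; status 0 means "nothing was sent".
struct toy_response {
    int                      status = 0;
    std::vector<std::string> chunks;
};

// Pull the first result before choosing a status code, then stream the rest.
toy_response stream_response(toy_generator & gen, const std::function<bool()> & should_stop) {
    assert(static_cast<bool>(should_stop)); // an empty std::function would throw when called
    toy_response out;

    auto first = gen.next(should_stop);
    if (!first) {
        return out; // connection closed or no results: send nothing
    }
    if (first->is_error) {
        out.status = 500; // illustrative value; a real error would carry its own code
        out.chunks.push_back(first->payload);
        return out;
    }

    out.status = 200;
    out.chunks.push_back(first->payload);
    while (auto r = gen.next(should_stop)) {
        // Once the status is committed, later errors can only be reported in-band.
        out.chunks.push_back(r->payload);
    }
    return out;
}
```

In this sketch an error on the very first result yields an error status with a single chunk, while an error after the first chunk would simply be appended to the stream, since the status line has already been decided by then.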

@DajanaV force-pushed the main branch 16 times, most recently from 24733fb to 4b4bb7c (November 13, 2025 12:15)
@DajanaV closed this Nov 13, 2025