
UPSTREAM PR #17878: server : run child server on localhost (#496)

Open
loci-dev wants to merge 1 commit into main from upstream-PR17878-branch_aldehir-server/fix-router-inaddr-any

Conversation

loci-dev commented Dec 9, 2025

Mirrored from ggml-org/llama.cpp#17878

When passing in --host 0.0.0.0, the child server also binds to 0.0.0.0, and the router then tries to reach it at 0.0.0.0. I can't think of a reason why the child should not always run on 127.0.0.1.

get_free_port() binds to INADDR_ANY, which should select a port that is available across all interfaces. This can be changed to INADDR_LOOPBACK if we ensure the child will only ever bind to 127.0.0.1. If not, then INADDR_ANY is a safe choice.

fixes #17862

loci-review bot commented Dec 9, 2025

Explore the complete analysis inside the Version Insights

Performance Analysis Summary: PR #496

Overview

This PR implements a networking configuration fix for the llama.cpp server router, forcing child server instances to bind exclusively to localhost (127.0.0.1) rather than inheriting the router's host configuration. The changes span 2 files with 7 line additions and 2 deletions, modifying only server infrastructure code without touching inference or tokenization logic.

Performance Impact

No measurable performance impact detected. Power consumption analysis across all binaries shows changes below 0.001%:

  • build.bin.libllama.so: 0.22 nJ reduction (194,204 nJ baseline)
  • build.bin.llama-run: 1.48 nJ reduction (219,166 nJ baseline)
  • build.bin.llama-cvector-generator: 0.95 nJ increase (249,477 nJ baseline)
  • build.bin.llama-tts: 0.73 nJ increase (253,600 nJ baseline)
  • All other binaries: 0.0% change

Inference Performance: No impact on tokens per second. The modified code paths (server-models.cpp, server-models.h) handle only HTTP routing, child process management, and network configuration. Core inference functions (llama_decode, llama_encode, llama_tokenize) remain unchanged. No modifications to model loading, tokenization, sampling, KV cache, or computational graph execution.

Code Changes

The PR adds a hostname field to server_model_meta structure and implements three key modifications:

  1. Sets inst.meta.hostname = "127.0.0.1" during model instance initialization
  2. Passes --host 127.0.0.1 explicitly to child server processes via command-line arguments
  3. Updates proxy connection logic to use meta->hostname instead of base_params.hostname

These changes resolve a routing failure where child servers bound to 0.0.0.0 were unreachable by the router. The fix is purely correctness-focused with no computational overhead—localhost connections have identical latency characteristics to the previous configuration, and the additional command-line argument adds negligible parsing overhead.

Security improvement: Child servers are no longer exposed on external network interfaces, reducing attack surface while maintaining functional equivalence for local routing scenarios.

loci-dev force-pushed the main branch 27 times, most recently from 4f731df to 8e6f6e8 on December 12, 2025 at 15:09
loci-dev force-pushed the main branch 30 times, most recently from b9ba67d to 320a1fc on December 17, 2025 at 09:12