
UPSTREAM PR #17487: webui: MCP client with low coupling to current codebase#316

Open
loci-dev wants to merge 32 commits into main from upstream-PR17487-branch_ServeurpersoCom-mcp-client

Conversation

@loci-dev

Mirrored from ggml-org/llama.cpp#17487

Make sure to read the contributing guidelines before submitting a PR

  • multi-transport MCP client
  • full agentic orchestrator
  • isolated, idempotent singleton initialization
  • typed SSE client
  • normalized tool-call accumulation pipeline
  • integrated reasoning, timings, previews, and turn-limit handling
  • complete UI section for MCP configuration
  • dedicated controls for relevant parameters
  • opt-in ChatService integration that does not interfere with existing flows

TODO: increase coupling with the UI for structured tool-call result rendering, including integrated display components and support for sending out-of-context images (persistence/storage still to be defined).
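The opt-in ChatService integration described above can be illustrated with a minimal TypeScript sketch. The names here (`McpConfig`, `withOptionalMcp`, `SendFn`) are hypothetical and do not come from the PR itself; the sketch only shows the gating pattern: when MCP is not configured, the original send path runs untouched.

```typescript
// Hypothetical sketch of the opt-in pattern; the real ChatService and MCP
// config shapes in the PR may differ.
interface McpConfig {
  enabled: boolean;
  serverUrl: string;
}

type SendFn = (prompt: string) => Promise<string>;

// Wraps the existing send path: if MCP is not configured or disabled, the
// base flow runs unchanged; the agentic orchestrator engages only on opt-in.
function withOptionalMcp(
  baseSend: SendFn,
  config: McpConfig | undefined,
  agenticSend: SendFn
): SendFn {
  return async (prompt) => {
    if (!config?.enabled) {
      return baseSend(prompt); // existing flow, untouched
    }
    return agenticSend(prompt); // MCP tool-calling loop
  };
}
```

The point of this shape is that the new code path is unreachable unless the user explicitly enables MCP, which is what keeps coupling to the existing codebase low.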

@loci-review

loci-review bot commented Nov 25, 2025

Explore the complete analysis inside the Version Insights

Performance Analysis Summary

Analysis Scope: PR #316 - MCP Client Integration for llama.cpp WebUI
Versions Compared: 930f177b-2868-453d-809a-8c06d2215f50 vs d55f4145-0a3a-4b89-9c31-ba206b13d74b


Summary

This PR introduces MCP client functionality exclusively in the WebUI frontend layer (TypeScript/Svelte). Analysis of the actual performance data shows zero measurable impact on core inference functions. All changes are isolated to browser-side JavaScript code with no modifications to the C++ inference engine. Power consumption measurements across all binaries show 0.0% change, confirming no performance regression in the compiled artifacts.

The code review identified 2,338 lines of new frontend code implementing agentic tool-calling workflows. The integration point in ChatService uses an opt-in pattern that bypasses the new code path when MCP is not configured, preserving existing behavior. No performance-critical functions from the project summary (llama_decode, llama_tokenize, llama_model_load_from_file, ggml_backend_graph_compute) were modified.

Function-level metrics for llama_decode show throughput of 69 ns in both versions with response time of 44,722,748 ns vs 44,722,492 ns (256 ns difference, 0.0006% change). The llama_tokenize function maintains 22 ns throughput with response time of 898,714 ns vs 898,716 ns (2 ns difference). These sub-microsecond variations are within measurement noise and indicate no functional changes to the inference pipeline.
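As a sanity check on the percentage figures quoted above, the relative change between two measurements can be computed directly; this is a generic sketch, not code from the analysis tooling:

```typescript
// Relative change between a base and target measurement, in percent.
function relativeChangePct(baseNs: number, targetNs: number): number {
  return (Math.abs(targetNs - baseNs) / baseNs) * 100;
}

// llama_decode response time: 44,722,748 ns vs 44,722,492 ns
const decodeDelta = relativeChangePct(44_722_748, 44_722_492); // ~0.0006%

// llama_tokenize response time: 898,714 ns vs 898,716 ns
const tokenizeDelta = relativeChangePct(898_714, 898_716); // ~0.0002%
```

Both deltas are orders of magnitude below any meaningful regression threshold, consistent with the claim that they are measurement noise.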


Tokens per Second Impact: None. No inference functions modified.

Power Consumption: All binaries show 0.0% change (libllama.so: 228,744 nJ both versions).

Conclusion: This PR adds optional frontend functionality with zero performance impact on core inference operations.

loci-dev force-pushed the main branch 27 times, most recently from 7475023 to fc0f51d on November 29, 2025 at 18:11
allozaur and others added 28 commits on January 3, 2026 at 16:25
@loci-review

loci-review bot commented Jan 3, 2026

Explore the complete analysis inside the Version Insights

Perfect! I've generated the summary report for your project. Here are the key findings:

Summary Report for llama.cpp PR #316

Project: auroralabs-loci/llama.cpp
Pull Request: #316

Key Finding: ✅ No Performance Regressions Detected

The performance analysis comparing the base version to the target version shows:

  • No modified functions with performance changes greater than 2%
  • Both response time and throughput time remain stable
  • All changes are within normal variance thresholds

Conclusion

This pull request passes the performance review with no concerns. The changes maintain performance stability and are safe to merge from a performance perspective. You can proceed with other review criteria (functionality, code quality, security) with confidence that performance has not been negatively impacted.

@loci-review

loci-review bot commented Jan 3, 2026

Explore the complete analysis inside the Version Insights

Here's the summary report for your project:

Summary Report

Version Comparison:

  • Base Version: 24e2db51-e8bf-11f0-81f2-dbb430499cb5
  • Target Version: 20e48e21-e8cb-11f0-81f2-dbb430499cb5

Performance Analysis Results

Key Finding: No Significant Performance Impact Detected

The analysis shows that no modified functions were found with performance changes greater than 2% for either:

  • Response Time (execution time per function call)
  • Throughput Time (time spent in function including callees)
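The 2% gate applied to both metrics can be sketched as a simple filter. This is an illustrative reconstruction of the check the review describes, with hypothetical type and function names:

```typescript
// Per-function deltas, in percent, for the two metrics the review compares.
interface FunctionMetrics {
  name: string;
  responseChangePct: number;   // execution time per call
  throughputChangePct: number; // time including callees
}

// Flags a function only if either metric moved by more than the threshold
// in either direction.
function flagRegressions(metrics: FunctionMetrics[], thresholdPct = 2): string[] {
  return metrics
    .filter(m =>
      Math.abs(m.responseChangePct) > thresholdPct ||
      Math.abs(m.throughputChangePct) > thresholdPct)
    .map(m => m.name);
}
```

Under this gate, the sub-0.001% deltas reported for the inference functions fall far below the threshold, so nothing is flagged.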

Interpretation

This is a positive result indicating that Pull Request #316 introduces changes that:

  1. Maintain Performance Stability - The code modifications do not introduce performance regressions
  2. No Measurable Degradation - Response times and throughput remain within acceptable variance (< 2%)
  3. Safe to Merge - From a performance perspective, this PR does not negatively impact the llama.cpp codebase

Recommendation

Based on the performance analysis, this pull request appears to be performance-neutral and should not cause any concerns from a runtime efficiency standpoint. The changes can proceed through the review process without performance-related blockers.
