
Added llamacpp model runtime with docker based deployment #85

Open

AhmedSeemalK wants to merge 1 commit into opea-project:dev from AhmedSeemalK:llamacpp-docker

Conversation

@AhmedSeemalK (Collaborator) commented:

This pull request introduces a new Dockerfile and README for running llama.cpp with Intel oneAPI compilers and oneMKL BLAS, providing optimized CPU inference for the llama-server. The Dockerfile sets up the environment, builds llama.cpp from source with Intel optimizations, and configures the container for easy model serving. The README gives step-by-step instructions for building, running, and testing the container.
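The PR's actual Dockerfile is not shown in this conversation; a minimal sketch of such a build might look like the following, assuming the `intel/oneapi-basekit` base image and llama.cpp's documented oneMKL BLAS build flags (the repository URL, paths, and port are illustrative):

```dockerfile
# Hypothetical sketch -- not the PR's actual Dockerfile.
FROM intel/oneapi-basekit:latest

RUN apt-get update && apt-get install -y git cmake \
    && rm -rf /var/lib/apt/lists/*

# Build llama.cpp from source with Intel oneAPI compilers (icx/icpx)
# and oneMKL as the BLAS backend for optimized CPU inference.
RUN git clone https://github.com/ggml-org/llama.cpp /opt/llama.cpp \
    && cd /opt/llama.cpp \
    && cmake -B build \
        -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx \
        -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=Intel10_64lp \
    && cmake --build build --config Release -j

EXPOSE 8080
# llama-server serves a model mounted into the container at runtime;
# the model path is passed as a run-time argument (-m /models/...).
ENTRYPOINT ["/opt/llama.cpp/build/bin/llama-server", \
            "--host", "0.0.0.0", "--port", "8080"]
```

Building against oneMKL keeps the matrix multiplications on Intel's tuned BLAS kernels, which is the main source of the CPU-side speedup the description refers to.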

Documentation and usage instructions:

  • Added README.md with detailed build, run, and test instructions for the new Dockerfile. The README explains how to build the image, start and stop the container, configure model caching, access logs, and test the server endpoint.
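The README's build/run/test flow presumably resembles the commands below; the image tag, container name, port, and cache directory are illustrative assumptions, not values taken from the PR:

```shell
# Illustrative commands; image name, port, and paths are assumptions.
docker build -t llamacpp-oneapi .

# Start the container detached, mounting a host directory as the model cache
# so downloaded models persist across container restarts.
docker run -d --name llamacpp-server \
    -p 8080:8080 \
    -v "$HOME/models:/models" \
    llamacpp-oneapi -m /models/model.gguf

# Access the server logs.
docker logs -f llamacpp-server

# Test the llama-server endpoint once it reports ready.
curl http://localhost:8080/health
curl http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"messages":[{"role":"user","content":"Hello"}]}'

# Stop and remove the container.
docker stop llamacpp-server && docker rm llamacpp-server
```

The volume mount is what the README's "model caching" step would configure; without it, models fetched inside the container are lost when it is removed.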

Added llamacpp as blueprint with docker based deployment

Signed-off-by: AhmedSeemalK <ahmed.seemal@intel.com>
@AhmedSeemalK AhmedSeemalK requested a review from psurabh April 7, 2026 07:48
@AhmedSeemalK AhmedSeemalK changed the title Added llamacpp model runtime as docker based deployment Added llamacpp model runtime with docker based deployment Apr 7, 2026
