
Added llamacpp model runtime with docker based deployment #85

Open

AhmedSeemalK wants to merge 1 commit into opea-project:dev from AhmedSeemalK:llamacpp-docker

Conversation

@AhmedSeemalK (Collaborator) commented:

This pull request introduces a new Dockerfile and README for running llama.cpp with Intel oneAPI compilers and oneMKL BLAS, providing optimized CPU inference for the llama-server. The Dockerfile sets up the environment, builds llama.cpp from source with Intel optimizations, and configures the container for easy model serving. The README gives step-by-step instructions for building, running, and testing the container.
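The PR's actual Dockerfile is not shown in this conversation; a minimal sketch of such a build might look like the following, assuming the `intel/oneapi-basekit` base image and llama.cpp's documented oneMKL BLAS build flags (the repository URL, paths, and port are illustrative):

```dockerfile
# Hypothetical sketch -- not the PR's actual Dockerfile.
FROM intel/oneapi-basekit:latest

RUN apt-get update && apt-get install -y git cmake \
    && rm -rf /var/lib/apt/lists/*

# Build llama.cpp from source with Intel oneAPI compilers (icx/icpx)
# and oneMKL as the BLAS backend for optimized CPU inference.
RUN git clone https://github.com/ggml-org/llama.cpp /opt/llama.cpp \
    && cd /opt/llama.cpp \
    && cmake -B build \
        -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx \
        -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=Intel10_64lp \
    && cmake --build build --config Release -j

EXPOSE 8080
# llama-server serves a model mounted into the container at runtime;
# the model path is passed as a run-time argument (-m /models/...).
ENTRYPOINT ["/opt/llama.cpp/build/bin/llama-server", \
            "--host", "0.0.0.0", "--port", "8080"]
```

Building against oneMKL keeps the matrix multiplications on Intel's tuned BLAS kernels, which is the main source of the CPU-side speedup the description refers to.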

Documentation and usage instructions:

  • Added README.md with detailed build, run, and test instructions for the new Dockerfile. The README explains how to build the image, start and stop the container, configure model caching, access logs, and test the server endpoint.
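The README's build/run/test flow presumably resembles the commands below; the image tag, container name, port, and cache directory are illustrative assumptions, not values taken from the PR:

```shell
# Illustrative commands; image name, port, and paths are assumptions.
docker build -t llamacpp-oneapi .

# Start the container detached, mounting a host directory as the model cache
# so downloaded models persist across container restarts.
docker run -d --name llamacpp-server \
    -p 8080:8080 \
    -v "$HOME/models:/models" \
    llamacpp-oneapi -m /models/model.gguf

# Access the server logs.
docker logs -f llamacpp-server

# Test the llama-server endpoint once it reports ready.
curl http://localhost:8080/health
curl http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"messages":[{"role":"user","content":"Hello"}]}'

# Stop and remove the container.
docker stop llamacpp-server && docker rm llamacpp-server
```

The volume mount is what the README's "model caching" step would configure; without it, models fetched inside the container are lost when it is removed.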

Added llamacpp as blueprint with docker based deployment

Signed-off-by: AhmedSeemalK <ahmed.seemal@intel.com>
@AhmedSeemalK AhmedSeemalK requested a review from psurabh April 7, 2026 07:48
@AhmedSeemalK AhmedSeemalK changed the title Added llamacpp model runtime as docker based deployment Added llamacpp model runtime with docker based deployment Apr 7, 2026
