File: CodeGen/docker_compose/intel/cpu/xeon/README.md (84 additions, 68 deletions)
@@ -1,6 +1,7 @@
# Build MegaService of CodeGen on Xeon

This document outlines the deployment process for a CodeGen application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on an Intel Xeon server. The steps include Docker image creation, container deployment via Docker Compose, and service execution to integrate microservices such as `llm`. We will publish the Docker images to Docker Hub soon, further simplifying the deployment process for this service.
+The default pipeline deploys with vLLM as the LLM serving component. It also provides the option of using a TGI backend for the LLM microservice.

## 🚀 Create an AWS Xeon Instance

@@ -10,55 +11,6 @@ For detailed information about these instance types, you can refer to [m7i](http
After launching your instance, you can connect to it using SSH (for Linux instances) or Remote Desktop Protocol (RDP) (for Windows instances). From there, you'll have full access to your Xeon server, allowing you to install, configure, and manage your applications as needed.

-## 🚀 Download or Build Docker Images

-Should the Docker image you seek not yet be available on Docker Hub, you can build the Docker image locally.
-To construct the Mega Service, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `codegen.py` Python script. Build MegaService Docker image via the command below:
-Then run the command `docker images`, you will have the following Docker Images:

-- `opea/llm-textgen:latest`
-- `opea/codegen:latest`
-- `opea/codegen-ui:latest`
-- `opea/codegen-react-ui:latest` (optional)

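As a quick sanity check after a local build, the image list above can be compared against what Docker reports. The following is only a sketch: `missing_images` is a hypothetical helper (it is not part of the repository), and it reads `repository:tag` names on stdin so it can be tried without Docker installed.

```bash
# Hypothetical helper: print every required image (given as arguments)
# that does not appear in the "repository:tag" list read from stdin.
missing_images() {
  available="$(cat)"
  for required in "$@"; do
    # -x matches the whole line, -F treats the name as a fixed string.
    if ! printf '%s\n' "$available" | grep -qxF -- "$required"; then
      printf '%s\n' "$required"
    fi
  done
}

# Example usage on a machine with Docker installed
# (image names are the required ones from the list above):
#   docker images --format '{{.Repository}}:{{.Tag}}' \
#     | missing_images opea/llm-textgen:latest opea/codegen:latest opea/codegen-ui:latest
```

An empty result means every required image is present; `opea/codegen-react-ui:latest` is left out of the check because the list marks it as optional.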
## 🚀 Start Microservices and MegaService
The CodeGen megaservice manages a single microservice called LLM within a Directed Acyclic Graph (DAG). In the diagram above, the LLM microservice is a language model microservice that generates code snippets based on the user's input query. The TGI service serves as a text generation interface, providing a RESTful API for the LLM microservice. The CodeGen Gateway acts as the entry point for the CodeGen application, invoking the Megaservice to generate code snippets in response to the user's input query.
@@ -89,42 +41,55 @@ flowchart LR
Since the `compose.yaml` consumes some environment variables, you need to set them up in advance as below.

-**Append the value of the public IP address to the no_proxy list**
+1. Set the `host_ip` and Hugging Face token

+> Note:
+> Please replace `your_ip_address` with your external IP address; do not use `localhost`.
Note: Please replace `host_ip` with your external IP address; do not use `localhost`.

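The note about not using `localhost` can be enforced before any container starts. The sketch below is an illustration under stated assumptions: `is_external_host` is a hypothetical helper, `192.0.2.10` is a documentation-only placeholder address, and the token variable name in the final comment is an assumption about what the compose file reads.

```bash
# Hypothetical helper: reject empty or loopback values for host_ip.
is_external_host() {
  case "$1" in
    ""|localhost|127.*|::1) return 1 ;;
    *) return 0 ;;
  esac
}

# 192.0.2.10 is a placeholder; replace it with your real external IP.
host_ip="${host_ip:-192.0.2.10}"

if is_external_host "$host_ip"; then
  export host_ip
  # Append the address to no_proxy, keeping any existing entries.
  export no_proxy="${no_proxy:+${no_proxy},}${host_ip}"
else
  echo "host_ip must be an external address, not '${host_ip}'" >&2
fi

# Assumption: the token is read from a variable with this name.
# export HUGGINGFACEHUB_API_TOKEN="your_hf_token"
```

Failing early here is cheaper than debugging containers that cannot reach each other through a loopback address.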
### Start the Docker Containers for All Services

+CodeGen supports both the TGI service and the vLLM service; you can choose to start either one.

+Start CodeGen based on the TGI service:
```bash
-cd GenAIExamples/CodeGen/docker_compose/intel/cpu/xeon
-docker compose up -d
+cd GenAIExamples/CodeGen/docker_compose
+source set_env.sh
+cd intel/cpu/xeon
+docker compose --profile codegen-xeon-tgi up -d
+```
+
+Start CodeGen based on vLLM service:
+```bash
+cd GenAIExamples/CodeGen/docker_compose
+source set_env.sh
+cd intel/cpu/xeon
+docker compose --profile codegen-xeon-vllm up -d
```

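Because the two start commands differ only in the profile name, the choice can be reduced to a single variable. The following is a minimal sketch, assuming a hypothetical `compose_profile` helper and an illustrative `LLM_BACKEND` variable; neither exists in the repository. It prints the command rather than running it.

```bash
# Hypothetical helper: map a backend name to the Compose profile shown above.
compose_profile() {
  case "$1" in
    tgi)  echo "codegen-xeon-tgi" ;;
    vllm) echo "codegen-xeon-vllm" ;;
    *)    echo "unknown backend: $1 (expected tgi or vllm)" >&2; return 1 ;;
  esac
}

# LLM_BACKEND is an illustrative name; vLLM is the default, as in the pipeline.
backend="${LLM_BACKEND:-vllm}"
if profile="$(compose_profile "$backend")"; then
  # Dry run: print the command so the sketch is safe to try anywhere.
  echo "docker compose --profile ${profile} up -d"
fi
```

Rejecting unknown backend names up front avoids starting Compose with a profile that matches no service.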
### Validate the MicroServices and MegaService

-1. TGI Service
+1. LLM Service (for TGI, vLLM)

```bash
-curl http://${host_ip}:8028/generate \
-  -X POST \
-  -d '{"inputs":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","parameters":{"max_new_tokens":256, "do_sample": true}}' \
-  -H 'Content-Type: application/json'
+curl http://${host_ip}:8028/v1/chat/completions \
+  -X POST \
+  -d '{"model": "Qwen/Qwen2.5-Coder-7B-Instruct", "messages": [{"role": "user", "content": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception."}], "max_tokens":32}' \
+  -H 'Content-Type: application/json'
```

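The new validation request uses a chat-style body with `model`, `messages`, and `max_tokens` fields. To try different prompts without hand-editing the JSON, the body can be assembled by a small function. This is only a sketch: `chat_payload` is a hypothetical helper, and it performs no JSON escaping, so the prompt must not contain double quotes or backslashes.

```bash
# Hypothetical helper: build a chat completion request body.
# Arguments: model name, prompt text, max_tokens.
# Limitation: no JSON escaping is done on the prompt.
chat_payload() {
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}], "max_tokens": %s}' "$1" "$2" "$3"
}

payload="$(chat_payload "Qwen/Qwen2.5-Coder-7B-Instruct" "Write a function that reverses a string." 32)"
echo "$payload"

# To send it to a running service (host and port as in the command above):
#   curl "http://${host_ip}:8028/v1/chat/completions" \
#     -X POST -H 'Content-Type: application/json' -d "$payload"
```

For prompts with arbitrary characters, a JSON-aware tool would be a safer way to build the body than `printf`.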
2. LLM Microservices
@@ -257,3 +222,54 @@ For example:
- Ask question and get answer

+## 🚀 Download or Build Docker Images

+Should the Docker image you seek not yet be available on Docker Hub, you can build the Docker image locally.
+To construct the Mega Service, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `codegen.py` Python script. Build MegaService Docker image via the command below: