Commit 3d99bc0

Update README.md for Deepseek support and numbers of required gaudi cards
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
1 parent ecb7f7b commit 3d99bc0

1 file changed

Lines changed: 29 additions & 12 deletions

comps/llms/src/text-generation/README.md

@@ -8,14 +8,29 @@ Overall, this microservice offers a streamlined way to integrate large language
 
 ## Validated LLM Models
 
-| Model | TGI-Gaudi | vLLM-CPU | vLLM-Gaudi |
-| --------------------------- | --------- | -------- | ---------- |
-| [Intel/neural-chat-7b-v3-3] ||||
-| [Llama-2-7b-chat-hf] ||||
-| [Llama-2-70b-chat-hf] || - ||
-| [Meta-Llama-3-8B-Instruct] ||||
-| [Meta-Llama-3-70B-Instruct] || - ||
-| [Phi-3] | x | Limit 4K | Limit 4K |
+| Model | TGI-Gaudi | vLLM-CPU | vLLM-Gaudi |
+| ------------------------------------------- | --------- | -------- | ---------- |
+| [Intel/neural-chat-7b-v3-3] ||||
+| [meta-llama/Llama-2-7b-chat-hf] ||||
+| [meta-llama/Llama-2-70b-chat-hf] || - ||
+| [meta-llama/Meta-Llama-3-8B-Instruct] ||||
+| [meta-llama/Meta-Llama-3-70B-Instruct] || - ||
+| [Phi-3] | x | Limit 4K | Limit 4K |
+| [deepseek-ai/DeepSeek-R1-Distill-Llama-70B] || - ||
+| [deepseek-ai/DeepSeek-R1-Distill-Qwen-32B] || - ||
+
+### System Requirements for LLM Models
+
+| Model | Minimum number of Gaudi cards |
+| ------------------------------------------- | ----------------------------- |
+| [Intel/neural-chat-7b-v3-3] | 1 |
+| [meta-llama/Llama-2-7b-chat-hf] | 1 |
+| [meta-llama/Llama-2-70b-chat-hf] | 2 |
+| [meta-llama/Meta-Llama-3-8B-Instruct] | 1 |
+| [meta-llama/Meta-Llama-3-70B-Instruct] | 2 |
+| [Phi-3] | x |
+| [deepseek-ai/DeepSeek-R1-Distill-Llama-70B] | 8 |
+| [deepseek-ai/DeepSeek-R1-Distill-Qwen-32B] | 4 |
 
 ## Support integrations
 
@@ -166,9 +181,11 @@ curl http://${host_ip}:${TEXTGEN_PORT}/v1/chat/completions \
 <!--Below are links used in these document. They are not rendered: -->
 
 [Intel/neural-chat-7b-v3-3]: https://huggingface.co/Intel/neural-chat-7b-v3-3
-[Llama-2-7b-chat-hf]: https://huggingface.co/meta-llama/Llama-2-7b-chat-hf
-[Llama-2-70b-chat-hf]: https://huggingface.co/meta-llama/Llama-2-70b-chat-hf
-[Meta-Llama-3-8B-Instruct]: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
-[Meta-Llama-3-70B-Instruct]: https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct
+[meta-llama/Llama-2-7b-chat-hf]: https://huggingface.co/meta-llama/Llama-2-7b-chat-hf
+[meta-llama/Llama-2-70b-chat-hf]: https://huggingface.co/meta-llama/Llama-2-70b-chat-hf
+[meta-llama/Meta-Llama-3-8B-Instruct]: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
+[meta-llama/Meta-Llama-3-70B-Instruct]: https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct
 [Phi-3]: https://huggingface.co/collections/microsoft/phi-3-6626e15e9585a200d2d761e3
 [HuggingFace]: https://huggingface.co/
+[deepseek-ai/DeepSeek-R1-Distill-Llama-70B]: https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B
+[deepseek-ai/DeepSeek-R1-Distill-Qwen-32B]: https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
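
As a reading aid for the "System Requirements for LLM Models" table this commit adds, the minimum card counts could be checked before deployment with a small lookup. This is only a sketch and not part of the commit: the `MIN_GAUDI_CARDS` name and the `has_enough_cards` helper are hypothetical, and the numbers are copied from the table as-is. Phi-3 is left out because the table lists `x` rather than a count.

```python
# Minimum Gaudi card counts, copied from the "System Requirements for LLM Models"
# table in this commit. Phi-3 is omitted because the table gives "x", not a number.
MIN_GAUDI_CARDS = {
    "Intel/neural-chat-7b-v3-3": 1,
    "meta-llama/Llama-2-7b-chat-hf": 1,
    "meta-llama/Llama-2-70b-chat-hf": 2,
    "meta-llama/Meta-Llama-3-8B-Instruct": 1,
    "meta-llama/Meta-Llama-3-70B-Instruct": 2,
    "deepseek-ai/DeepSeek-R1-Distill-Llama-70B": 8,
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B": 4,
}


def has_enough_cards(model_id: str, available_cards: int) -> bool:
    """Hypothetical helper: True if available_cards meets the table's minimum."""
    if model_id not in MIN_GAUDI_CARDS:
        # The table has no numeric entry for this model, so refuse to guess.
        raise ValueError(f"no minimum card count listed for {model_id!r}")
    return available_cards >= MIN_GAUDI_CARDS[model_id]


if __name__ == "__main__":
    model = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
    for cards in (2, 4):
        verdict = "ok" if has_enough_cards(model, cards) else "too few cards"
        print(f"{model} with {cards} card(s): {verdict}")
```

Keeping the table in a single dictionary means a future README update only needs one matching change in code, and raising on unlisted models avoids silently assuming a count the README does not state.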
