Adding files to deploy DocSum application on ROCm vLLM #1572
Merged: chensuyue merged 149 commits into opea-project:main from chyundunovDatamonsters:feature/DocSum_vLLM on Apr 3, 2025.
Commits (149, all changes shown)
cf60682
DocSum - add files for deploy app with ROCm vLLM
1fd1de1
DocSum - fix main
bd2d47e
DocSum - add files for deploy app with ROCm vLLM
2459ecb
DocSum - fix main
4d35065
Merge remote-tracking branch 'origin/main'
5b441e8
DocSum - fix files for deploy with ROCm vLLM
52c15cf
DocSum - fix files for deploy with ROCm vLLM
e578d3d
DocSum - fix files for deploy with ROCm vLLM
32075f0
DocSum - fix files for deploy with ROCm vLLM
ab627e5
DocSum - fix files for deploy with ROCm vLLM
7a9c041
DocSum - fix files for deploy with ROCm vLLM
4fb10b7
DocSum - fix files for deploy with ROCm vLLM
4652d88
DocSum - fix files for deploy with ROCm vLLM
75e9f02
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] ba1f2b1
DocSum - fix files for deploy with ROCm vLLM
ba56d73
DocSum - fix files for deploy with ROCm vLLM
5415478
DocSum - fix files for deploy with ROCm vLLM
25aa0d4
DocSum - fix files for deploy with ROCm vLLM
c1958bd
DocSum - fix files for deploy with ROCm vLLM
208c9f9
DocSum - fix files for deploy with ROCm vLLM
4958c39
DocSum - fix files for deploy with ROCm vLLM
605b332
DocSum - fix files for deploy with ROCm vLLM
02aaca3
DocSum - fix files for deploy with ROCm vLLM
ac678a2
DocSum - fix files for deploy with ROCm vLLM
14402dc
DocSum - fix files for deploy with ROCm vLLM
03bf8cb
DocSum - fix files for deploy on ROCm
ae708d6
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] ac60bd4
DocSum - fix files for deploy on ROCm
0787a6a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 65c76af
DocSum - fix files for deploy on ROCm
614c6ce
DocSum - fix files for deploy on ROCm
2e92248
DocSum - fix files for deploy on ROCm
f3faa9d
DocSum - fix files for deploy on ROCm
172dda6
DocSum - fix files for deploy on ROCm
9a82629
DocSum - fix files for deploy on ROCm
bb42e69
DocSum - fix files for deploy on ROCm
e702cf1
DocSum - fix files for deploy on ROCm
d2d4725
Fix minor typo in README (#1559)
jotpalch 9c49538
Remove perf test code from test scripts. (#1510)
ZePan110 c6a0746
DocSum - add files for deploy app with ROCm vLLM
ef4182a
DocSum - fix main
135f912
DocSum - add files for deploy app with ROCm vLLM
b0eb7b8
DocSum - fix main
7750111
DocSum - fix files for deploy with ROCm vLLM
d8d3d2f
DocSum - fix files for deploy with ROCm vLLM
8cc16e3
DocSum - fix files for deploy with ROCm vLLM
6a97033
DocSum - fix files for deploy with ROCm vLLM
da022a8
DocSum - fix files for deploy with ROCm vLLM
795c0e9
DocSum - fix files for deploy with ROCm vLLM
c13c216
DocSum - fix files for deploy with ROCm vLLM
4c3d300
DocSum - fix files for deploy with ROCm vLLM
674ce6a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] f8b2887
DocSum - fix files for deploy with ROCm vLLM
ea5002d
DocSum - fix files for deploy with ROCm vLLM
190f8de
DocSum - fix files for deploy with ROCm vLLM
7eb1ae9
DocSum - fix files for deploy with ROCm vLLM
5e24e8f
DocSum - fix files for deploy with ROCm vLLM
57f8c0c
DocSum - fix files for deploy with ROCm vLLM
a4f04ce
DocSum - fix files for deploy with ROCm vLLM
d25f642
DocSum - fix files for deploy with ROCm vLLM
07849cd
DocSum - fix files for deploy with ROCm vLLM
3736848
DocSum - fix files for deploy with ROCm vLLM
aff88f2
DocSum - fix files for deploy with ROCm vLLM
968d98f
DocSum - fix files for deploy on ROCm
2a814ad
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 1bd405d
DocSum - fix files for deploy on ROCm
3c8a2aa
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 0ad9f15
Bump gradio from 5.5.0 to 5.11.0 in /MultimodalQnA/ui/gradio (#1391)
dependabot[bot] 48f7d78
Simplify ChatQnA AIPC user setting (#1573)
xiguiw 2e08a5d
Fix mismatched environment variable (#1575)
xiguiw f7b0d31
Fix trivy issue (#1569)
ZePan110 ebc997e
Update AgentQnA and DocIndexRetriever (#1564)
minmin-intel 44e98a0
Update README.md of AIPC quick start (#1578)
yinghu5 bc84ddc
Fix "OpenAI" & "response" spelling (#1561)
eero-t af17b74
Bump gradio from 5.5.0 to 5.11.0 in /DocSum/ui/gradio (#1576)
dependabot[bot] 6b9e472
Align mongo related image names with comps (#1543)
Spycsh e860515
Fix ChatQnA ROCm compose Readme file and absolute path for ROCM CI te…
artem-astafev 6301d02
Fix async in chatqna bug (#1589)
XinyaoWa 5328fc2
Fix benchmark scripts (#1517)
chensuyue 4a528fd
Top level README: add link to github.io documentation (#1584)
alexsin368 1f42b35
fix click example button issue (#1586)
WenjiaoYue 136903b
ChatQnA Docker compose file for Milvus as vdb (#1548)
ezelanza 3f80f1b
Fix cd workflow condition (#1588)
chensuyue a7f269b
Update DBQnA tgi docker image to latest tgi 2.4.0 (#1593)
yinghu5 f190d02
Revert chatqna async and enhance tests (#1598)
Spycsh a62e9ec
Use model cache for docker compose test (#1582)
ZePan110 4d2a35c
open chatqna frontend test (#1594)
chensuyue 6a20d83
Enable CodeGen,CodeTrans and DocSum model cache for docker compose te…
ZePan110 25fcc53
bugfix GraphRAG updated docker compose and env settings to fix issues…
rbrugaro d70b4d7
Enable ChatQnA model cache for docker compose test. (#1605)
ZePan110 07d4c89
Enable SearchQnA model cache for docker compose test. (#1606)
ZePan110 849df16
Fix docker image opea/edgecraftrag security issue #1577 (#1617)
Yongbozzz 3673398
[AudioQnA] Fix the LLM model field for inputs alignment (#1611)
wangkl2 4d16ea3
Update compose.yaml for SearchQnA (#1622)
ZePan110 b374417
Update compose.yaml for ChatQnA (#1621)
ZePan110 0b1186b
Update compose.yaml (#1620)
ZePan110 cb86be2
Update compose.yaml (#1619)
ZePan110 c1a56f7
Enable vllm for CodeTrans (#1626)
letonghan f2d94ea
Update model cache for AgentQnA (#1627)
ZePan110 587e708
Use GenAIComp base image to simplify Dockerfiles (#1612)
eero-t 3763c94
[Bug: 112] Fix introduction in GenAIExamples main README (#1631)
srajabos 8311e9e
Fix corner CI issue when the example path deleted (#1634)
chensuyue 43432ea
[ChatQnA] Show spinner after query to improve user experience (#1003)…
wangleflex 8402fff
Use the latest HabanaAI/vllm-fork release tag to build vllm-gaudi ima…
chensuyue 3698827
Set vLLM as default model for FaqGen (#1580)
XinyaoWa 304d835
Fix vllm model cache directory (#1642)
wangkl2 f6c3f7b
Enhance ChatQnA test scripts (#1643)
chensuyue 9407fe2
Add GitHub Action to check and close stale issues and PRs (#1646)
XuehaoSun 6c3000e
Use GenAIComp base image to simplify Dockerfiles & reduce image sizes…
eero-t 88b1364
Enable inject_commit to docker image feature. (#1653)
ZePan110 c7c85d9
Enable CodeGen vLLM (#1636)
xiguiw 02fd196
[ChatQnA][docker]Check healthy of redis to avoid dataprep failure (#1…
gavinlichn 383a67e
Enable GraphRAG and ProductivitySuite model cache for docker compose …
ZePan110 666e7af
Enable Gaudi3, Rocm and Arc on manually release test. (#1615)
ZePan110 2a9dfcd
Refine README with highlighted examples and updated support info (#1006)
CharleneHu-42 3c95214
[AudioQnA] Enable vLLM and set it as default LLM serving (#1657)
wangkl2 398005c
[ChatQnA] Enable Prometheus and Grafana with telemetry docker compos…
louie-tsai cb41f8a
Update stale issue and PR settings to 30 days for inactivity (#1661)
XuehaoSun 8854d2c
Add final README.md and set_env.sh script for quickstart review. Prev…
jedwards-habana e2b4f20
Fix input issue for manual-image-build.yml (#1666)
chensuyue 6c67245
Set vLLM as default model for VisualQnA (#1644)
Spycsh e01fade
Fix workflow issues. (#1691)
ZePan110 b9cb0c6
Enable base image build in CI/CD (#1669)
chensuyue a6e3b54
fix errors for running AgentQnA on xeon with openai and update readme…
minmin-intel 0fec748
Add new UI/new features for EC-RAG (#1665)
Yongbozzz 71a8791
Merge FaqGen into ChatQnA (#1654)
XinyaoWa d18f4c4
DocSum - fix files for deploy on ROCm vLLM
84932a7
DocSum - fix files for deploy on ROCm vLLM
d85d3a9
DocSum - fix files for deploy on ROCm vLLM
0cb164b
DocSum - fix files for deploy on ROCm vLLM
28e68ae
Merge branch 'main' of https://github.com/opea-project/GenAIExamples …
9d3595b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 2392346
DocSum - fix files for deploy on ROCm vLLM
b4527e4
Merge remote-tracking branch 'origin/feature/DocSum_vLLM' into featur…
42fdcbf
DocSum - fix files for deploy on ROCm vLLM
f5e0196
DocSum - fix files for deploy on ROCm vLLM
4aef21b
Merge branch 'main' into feature/DocSum_vLLM
artem-astafev 3ad714a
Merge branch 'feature/Docsum_vLLM' of https://github.com/chyundunovDa…
aa15e3e
Merge branch 'main' of https://github.com/opea-project/GenAIExamples …
08a4540
DocSum - fix files for deploy on ROCm vLLM
183bb79
DocSum - fix files for deploy on ROCm vLLM
d4f8070
Merge branch 'main' of https://github.com/opea-project/GenAIExamples …
3d64cb4
Merge branch 'main' into feature/DocSum_vLLM
chyundunovDatamonsters 5f65d79
DocSum - fix files for deploy on ROCm vLLM
2232e4f
Merge remote-tracking branch 'origin/feature/DocSum_vLLM' into featur…
f8f6baa
DocSum - fix files for deploy on ROCm vLLM
3a1bf7f
Merge branch 'main' of https://github.com/opea-project/GenAIExamples …
16320de
DocSum - fix files for deploy on ROCm vLLM
d056c51
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot]
First added file: a Docker Compose definition for the DocSum stack on ROCm vLLM (new file, 111 lines):

```yaml
# Copyright (C) 2024 Advanced Micro Devices, Inc.
# SPDX-License-Identifier: Apache-2.0

services:
  docsum-vllm-service:
    image: ${REGISTRY:-opea}/vllm-rocm:${TAG:-latest}
    container_name: docsum-vllm-service
    ports:
      - "${DOCSUM_VLLM_SERVICE_PORT:-8081}:8011"
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      HUGGINGFACEHUB_API_TOKEN: ${DOCSUM_HUGGINGFACEHUB_API_TOKEN}
      HF_TOKEN: ${DOCSUM_HUGGINGFACEHUB_API_TOKEN}
      HF_HUB_DISABLE_PROGRESS_BARS: 1
      HF_HUB_ENABLE_HF_TRANSFER: 0
      VLLM_USE_TRITON_FLASH_ATTENTION: 0
      PYTORCH_JIT: 0
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://${HOST_IP}:${DOCSUM_VLLM_SERVICE_PORT:-8081}/health || exit 1"]
      interval: 10s
      timeout: 10s
      retries: 100
    volumes:
      - "${MODEL_CACHE:-./data}:/data"
    shm_size: 20G
    devices:
      - /dev/kfd:/dev/kfd
      - /dev/dri/:/dev/dri/
    cap_add:
      - SYS_PTRACE
    group_add:
      - video
    security_opt:
      - seccomp:unconfined
      - apparmor=unconfined
    command: "--model ${DOCSUM_LLM_MODEL_ID} --swap-space 16 --disable-log-requests --dtype float16 --tensor-parallel-size 4 --host 0.0.0.0 --port 8011 --num-scheduler-steps 1 --distributed-executor-backend \"mp\""
    ipc: host

  docsum-llm-server:
    image: ${REGISTRY:-opea}/llm-docsum:${TAG:-latest}
    container_name: docsum-llm-server
    depends_on:
      docsum-vllm-service:
        condition: service_healthy
    ports:
      - "${DOCSUM_LLM_SERVER_PORT}:9000"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      LLM_ENDPOINT: ${DOCSUM_LLM_ENDPOINT}
      HUGGINGFACEHUB_API_TOKEN: ${DOCSUM_HUGGINGFACEHUB_API_TOKEN}
      MAX_INPUT_TOKENS: ${DOCSUM_MAX_INPUT_TOKENS}
      MAX_TOTAL_TOKENS: ${DOCSUM_MAX_TOTAL_TOKENS}
      LLM_MODEL_ID: ${DOCSUM_LLM_MODEL_ID}
      DocSum_COMPONENT_NAME: "OpeaDocSumvLLM"
      LOGFLAG: ${LOGFLAG:-False}
    restart: unless-stopped

  whisper:
    image: ${REGISTRY:-opea}/whisper:${TAG:-latest}
    container_name: whisper-service
    ports:
      - "${DOCSUM_WHISPER_PORT:-7066}:7066"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
    restart: unless-stopped

  docsum-backend-server:
    image: ${REGISTRY:-opea}/docsum:${TAG:-latest}
    container_name: docsum-backend-server
    depends_on:
      - docsum-vllm-service
      - docsum-llm-server
    ports:
      - "${DOCSUM_BACKEND_SERVER_PORT}:8888"
    environment:
      no_proxy: ${no_proxy}
      https_proxy: ${https_proxy}
      http_proxy: ${http_proxy}
      MEGA_SERVICE_HOST_IP: ${HOST_IP}
      LLM_SERVICE_HOST_IP: ${HOST_IP}
      ASR_SERVICE_HOST_IP: ${ASR_SERVICE_HOST_IP}
    ipc: host
    restart: always

  docsum-gradio-ui:
    image: ${REGISTRY:-opea}/docsum-gradio-ui:${TAG:-latest}
    container_name: docsum-ui-server
    depends_on:
      - docsum-backend-server
    ports:
      - "${DOCSUM_FRONTEND_PORT:-5173}:5173"
    environment:
      no_proxy: ${no_proxy}
      https_proxy: ${https_proxy}
      http_proxy: ${http_proxy}
      BACKEND_SERVICE_ENDPOINT: ${BACKEND_SERVICE_ENDPOINT}
      DOC_BASE_URL: ${BACKEND_SERVICE_ENDPOINT}
    ipc: host
    restart: always

networks:
  default:
    driver: bridge
```
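A minimal sketch of how the healthcheck URL above is assembled from the environment (the values below are hypothetical, not from the PR). Note that the probe targets the host-mapped port (default 8081) rather than port 8011, which vLLM listens on inside the container:

```shell
# Hypothetical values for illustration; the real set_env.sh supplies them.
HOST_IP="192.0.2.10"
DOCSUM_VLLM_SERVICE_PORT="8081"

# Same interpolation the compose healthcheck performs:
#   curl -f http://${HOST_IP}:${DOCSUM_VLLM_SERVICE_PORT:-8081}/health || exit 1
HEALTH_URL="http://${HOST_IP}:${DOCSUM_VLLM_SERVICE_PORT:-8081}/health"
echo "${HEALTH_URL}"
```

This check gates docsum-llm-server via `condition: service_healthy`, so the LLM microservice only starts once vLLM answers on /health.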
Second added file: the environment setup script (new file, 18 lines):

```bash
#!/usr/bin/env bash

# Copyright (C) 2024 Advanced Micro Devices, Inc.
# SPDX-License-Identifier: Apache-2.0

export HOST_IP=''
export DOCSUM_HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export DOCSUM_MAX_INPUT_TOKENS=2048
export DOCSUM_MAX_TOTAL_TOKENS=4096
export DOCSUM_LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
export DOCSUM_VLLM_SERVICE_PORT="8008"
export DOCSUM_LLM_ENDPOINT="http://${HOST_IP}:${DOCSUM_VLLM_SERVICE_PORT}"
export DOCSUM_WHISPER_PORT="7066"
export ASR_SERVICE_HOST_IP="${HOST_IP}"
export DOCSUM_LLM_SERVER_PORT="9000"
export DOCSUM_BACKEND_SERVER_PORT="18072"
export DOCSUM_FRONTEND_PORT="18073"
export BACKEND_SERVICE_ENDPOINT="http://${HOST_IP}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum"
```
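A minimal usage sketch, assuming HOST_IP has been filled in (the committed script deliberately exports it empty): the script derives the vLLM endpoint and the DocSum gateway URL from HOST_IP and the port variables, so HOST_IP must be set before the derivations run. The values below are hypothetical:

```shell
# Hypothetical address; the committed script ships HOST_IP empty.
export HOST_IP="192.0.2.10"
export DOCSUM_VLLM_SERVICE_PORT="8008"
export DOCSUM_BACKEND_SERVER_PORT="18072"

# Same derivations set_env performs:
export DOCSUM_LLM_ENDPOINT="http://${HOST_IP}:${DOCSUM_VLLM_SERVICE_PORT}"
export BACKEND_SERVICE_ENDPOINT="http://${HOST_IP}:${DOCSUM_BACKEND_SERVER_PORT}/v1/docsum"

echo "${DOCSUM_LLM_ENDPOINT}"
echo "${BACKEND_SERVICE_ENDPOINT}"
```

Because the derived URLs capture HOST_IP at export time, changing HOST_IP afterward has no effect until the script is sourced again.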