Merged
14 changes: 11 additions & 3 deletions MultimodalQnA/README.md
Collaborator Author

Please view this README in a separate tab. Let me know if I should fix the screenshot sizes.

mhbuehler marked this conversation as resolved.
@@ -172,8 +172,16 @@ docker compose -f compose.yaml up -d

## MultimodalQnA Demo on Gaudi2

![MultimodalQnA-upload-waiting-screenshot](./assets/img/upload-gen-trans.png)
![MultimodalQnA-ui-screenshot](./assets/img/mmqna-ui.png)

![MultimodalQnA-upload-done-screenshot](./assets/img/upload-gen-captions.png)
![MultimodalQnA-ingest-video-screenshot](./assets/img/video-ingestion.png)

![MultimodalQnA-query-example-screenshot](./assets/img/example_query.png)
![MultimodalQnA-video-query-screenshot](./assets/img/video-query.png)

![MultimodalQnA-audio-ingestion-screenshot](./assets/img/audio-ingestion.png)

![MultimodalQnA-audio-query-screenshot](./assets/img/audio-query.png)

![MultimodalQnA-upload-pdf-screenshot](./assets/img/ingest_pdf.png)

![MultimodalQnA-pdf-query-example-screenshot](./assets/img/pdf-query.png)
Binary file added MultimodalQnA/assets/img/audio-ingestion.png
Binary file added MultimodalQnA/assets/img/audio-query.png
Binary file added MultimodalQnA/assets/img/ingest_pdf.png
Binary file added MultimodalQnA/assets/img/mmqna-ui.png
Binary file added MultimodalQnA/assets/img/pdf-query.png
Binary file added MultimodalQnA/assets/img/video-ingestion.png
Binary file added MultimodalQnA/assets/img/video-query.png
14 changes: 8 additions & 6 deletions MultimodalQnA/docker_compose/intel/cpu/xeon/README.md
@@ -40,6 +40,10 @@ lvm
===
Port 9399 - Open to 0.0.0.0/0

whisper
===
Port 7066 - Open to 0.0.0.0/0

dataprep-multimodal-redis
===
Port 6007 - Open to 0.0.0.0/0
@@ -83,8 +87,6 @@ export WHISPER_PORT=7066
export WHISPER_SERVER_ENDPOINT="http://${host_ip}:${WHISPER_PORT}/v1/asr"
export WHISPER_MODEL="base"
export MAX_IMAGES=1
export ASR_ENDPOINT=http://$host_ip:$WHISPER_PORT
export ASR_PORT=9099
export ASR_SERVICE_PORT=3001
Collaborator

Are ASR_SERVICE_PORT and ASR_SERVICE_ENDPOINT still needed?

Collaborator Author

I left them in just in case, but they may not be needed.

Collaborator

I don't see these vars on main, so we'd be adding them. If they aren't used, we should remove them
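
Whether a variable is still referenced can be checked mechanically. The sketch below is one possible first pass, not the reviewers' actual method: `find_var_uses` is a hypothetical helper that searches a directory for a variable name and ignores the `export NAME=` line that defines it. A variable that is only referenced by another unused variable's definition would still show up as used, so the result needs a manual look.

```bash
# Hypothetical helper (not part of the repository): print every line under a
# directory that mentions a variable, excluding the line that exports it.
# Returns non-zero when no use is found.
find_var_uses() {
  var_name="$1"
  search_dir="${2:-.}"
  grep -rn -- "$var_name" "$search_dir" | grep -v -- "export ${var_name}="
}

# Example usage from the repository root (the directory name is taken from this PR):
#   for v in ASR_SERVICE_PORT ASR_SERVICE_ENDPOINT; do
#     find_var_uses "$v" MultimodalQnA >/dev/null 2>&1 || echo "$v appears unused"
#   done
```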

export ASR_SERVICE_ENDPOINT="http://${host_ip}:${ASR_SERVICE_PORT}/v1/audio/transcriptions"
export REDIS_DB_PORT=6379
okhleif-10 marked this conversation as resolved.
@@ -164,7 +166,7 @@ docker build --no-cache -t opea/lvm:latest --build-arg https_proxy=$https_proxy
docker build --no-cache -t opea/dataprep-multimodal-redis:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/multimodal/redis/langchain/Dockerfile .
```

### 5. Build asr images
### 5. Build Whisper Server Image

Build whisper server image

@@ -270,7 +272,7 @@ curl http://${host_ip}:${REDIS_RETRIEVER_PORT}/v1/multimodal_retrieval \
-d "{\"text\":\"test\",\"embedding\":${your_embedding}}"
```

4. asr
4. whisper

```bash
curl ${WHISPER_SERVER_ENDPOINT} \
@@ -406,15 +408,15 @@ curl http://${host_ip}:${MEGA_SERVICE_PORT}/v1/multimodalqna \
Test the MegaService with an audio query:

```bash
curl http://${host_ip}:8888/v1/multimodalqna \
curl http://${host_ip}:${MEGA_SERVICE_PORT}/v1/multimodalqna \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": [{"type": "audio", "audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}]}]}'
```

Test the MegaService with a text and image query:

```bash
curl http://${host_ip}:8888/v1/multimodalqna \
curl http://${host_ip}:${MEGA_SERVICE_PORT}/v1/multimodalqna \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": [{"type": "text", "text": "Green bananas in a tree"}, {"type": "image_url", "image_url": {"url": "http://images.cocodataset.org/test-stuff2017/000000004248.jpg"}}]}]}'
```
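
The `audio` value in the audio query above looks like a base64-encoded WAV file (it begins with the base64 form of a RIFF header). A minimal sketch of producing the same request body from a local file follows; `build_audio_payload` and the file name `sample.wav` are hypothetical and not part of the README.

```bash
# Hypothetical helper: print a MultimodalQnA audio-query request body for a WAV file.
build_audio_payload() {
  wav_file="$1"
  # Some base64 implementations wrap long output, so strip newlines to keep
  # the JSON string on a single line.
  audio_b64=$(base64 < "$wav_file" | tr -d '\n')
  printf '{"messages": [{"role": "user", "content": [{"type": "audio", "audio": "%s"}]}]}\n' "$audio_b64"
}

# Example usage (sample.wav is a placeholder file name):
#   build_audio_payload sample.wav | \
#     curl "http://${host_ip}:${MEGA_SERVICE_PORT}/v1/multimodalqna" \
#       -H "Content-Type: application/json" -d @-
```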
13 changes: 0 additions & 13 deletions MultimodalQnA/docker_compose/intel/cpu/xeon/compose.yaml
@@ -14,19 +14,6 @@ services:
https_proxy: ${https_proxy}
WHISPER_PORT: ${WHISPER_PORT}
restart: unless-stopped
asr:
image: ${REGISTRY:-opea}/asr:${TAG:-latest}
container_name: asr-service
ports:
- "${ASR_SERVICE_PORT}:${ASR_PORT}"
ipc: host
environment:
WHISPER_PORT: ${WHISPER_PORT}
MAX_IMAGES: ${MAX_IMAGES}
ASR_PORT: ${ASR_PORT}
ASR_ENDPOINT: ${ASR_ENDPOINT}
ASR_SERVICE_PORT: ${ASR_SERVICE_PORT}
ASR_SERVICE_ENDPOINT: ${ASR_SERVICE_ENDPOINT}
redis-vector-db:
image: redis/redis-stack:7.2.0-v9
container_name: redis-vector-db
2 changes: 0 additions & 2 deletions MultimodalQnA/docker_compose/intel/cpu/xeon/set_env.sh
@@ -21,8 +21,6 @@ export WHISPER_PORT=7066
export WHISPER_SERVER_ENDPOINT="http://${host_ip}:${WHISPER_PORT}/v1/asr"
export WHISPER_MODEL="base"
export MAX_IMAGES=1
export ASR_ENDPOINT=http://$host_ip:$WHISPER_PORT
export ASR_PORT=9099
export ASR_SERVICE_PORT=3001
export ASR_SERVICE_ENDPOINT="http://${host_ip}:${ASR_SERVICE_PORT}/v1/audio/transcriptions"
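
Because this PR trims the set of exported variables, it can help to confirm that the ones the Whisper service still relies on are set after sourcing `set_env.sh`. The sketch below assumes a hypothetical helper, `check_required_vars`, that is not part of the repository; the variable names passed to it come from the exports shown in this diff.

```bash
# Hypothetical helper: report which of the named variables are unset or empty.
# Returns 0 when all are set, 1 otherwise.
check_required_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage:
#   . ./set_env.sh
#   check_required_vars host_ip WHISPER_PORT WHISPER_SERVER_ENDPOINT WHISPER_MODEL MAX_IMAGES
```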

10 changes: 4 additions & 6 deletions MultimodalQnA/docker_compose/intel/hpu/gaudi/README.md
@@ -37,8 +37,6 @@ export WHISPER_PORT=7066
export WHISPER_SERVER_ENDPOINT="http://${host_ip}:${WHISPER_PORT}/v1/asr"
export MAX_IMAGES=1
export WHISPER_MODEL="base"
export ASR_ENDPOINT=http://$host_ip:$WHISPER_PORT
export ASR_PORT=9099
export ASR_SERVICE_PORT=3001
dmsuehir marked this conversation as resolved.
Outdated
export ASR_SERVICE_ENDPOINT="http://${host_ip}:${ASR_SERVICE_PORT}/v1/audio/transcriptions"
export DATAPREP_MMR_PORT=6007
okhleif-10 marked this conversation as resolved.
@@ -116,7 +114,7 @@ docker build --no-cache -t opea/lvm:latest --build-arg https_proxy=$https_proxy
docker build --no-cache -t opea/dataprep-multimodal-redis:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/multimodal/redis/langchain/Dockerfile .
```

### 5. Build asr images
### 5. Build Whisper Server Image

Build whisper server image

@@ -220,7 +218,7 @@ curl http://${host_ip}:7000/v1/multimodal_retrieval \
-d "{\"text\":\"test\",\"embedding\":${your_embedding}}"
```

4. asr
4. whisper

```bash
curl ${WHISPER_SERVER_ENDPOINT} \
@@ -356,15 +354,15 @@ curl http://${host_ip}:${MEGA_SERVICE_PORT}/v1/multimodalqna \
Test the MegaService with an audio query:

```bash
curl http://${host_ip}:8888/v1/multimodalqna \
curl http://${host_ip}:${MEGA_SERVICE_PORT}/v1/multimodalqna \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": [{"type": "audio", "audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}]}]}'
```

Test the MegaService with a text and image query:

```bash
curl http://${host_ip}:8888/v1/multimodalqna \
curl http://${host_ip}:${MEGA_SERVICE_PORT}/v1/multimodalqna \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": [{"type": "text", "text": "Green bananas in a tree"}, {"type": "image_url", "image_url": {"url": "http://images.cocodataset.org/test-stuff2017/000000004248.jpg"}}]}]}'
```
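
The commands in this hunk appear to swap the hard-coded port `8888` for `${MEGA_SERVICE_PORT}`, which only works if that variable is set in the calling shell. One way to keep the old behaviour as a fallback is sketched below; `megaservice_url` is a hypothetical helper, and both defaults (`8888` and `localhost`) are assumptions rather than values the README prescribes.

```bash
# Hypothetical helper: build the MegaService URL from the environment.
# The 8888 fallback mirrors the port the old commands hard-coded; the
# localhost fallback is an assumption for single-machine testing.
megaservice_url() {
  printf 'http://%s:%s/v1/multimodalqna\n' "${host_ip:-localhost}" "${MEGA_SERVICE_PORT:-8888}"
}

# Example usage:
#   curl "$(megaservice_url)" -H "Content-Type: application/json" -d '...'
```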
13 changes: 0 additions & 13 deletions MultimodalQnA/docker_compose/intel/hpu/gaudi/compose.yaml
@@ -21,19 +21,6 @@ services:
WHISPER_PORT: ${WHISPER_PORT}
WHISPER_SERVER_ENDPOINT: ${WHISPER_SERVER_ENDPOINT}
restart: unless-stopped
asr:
image: ${REGISTRY:-opea}/asr:${TAG:-latest}
container_name: asr-service
ports:
- "${ASR_SERVICE_PORT}:${ASR_PORT}"
ipc: host
environment:
WHISPER_PORT: ${WHISPER_PORT}
MAX_IMAGES: ${MAX_IMAGES}
ASR_PORT: ${ASR_PORT}
ASR_ENDPOINT: ${ASR_ENDPOINT}
ASR_SERVICE_PORT: ${ASR_SERVICE_PORT}
ASR_SERVICE_ENDPOINT: ${ASR_SERVICE_ENDPOINT}
dataprep-multimodal-redis:
image: ${REGISTRY:-opea}/dataprep-multimodal-redis:${TAG:-latest}
container_name: dataprep-multimodal-redis
4 changes: 1 addition & 3 deletions MultimodalQnA/docker_compose/intel/hpu/gaudi/set_env.sh
@@ -23,12 +23,10 @@ export REDIS_URL="redis://${host_ip}:${REDIS_DB_PORT}"
export REDIS_HOST=${host_ip}
export INDEX_NAME="mm-rag-redis"

export WHISPER_MODEL="base"
export WHISPER_PORT=7066
export WHISPER_SERVER_ENDPOINT="http://${host_ip}:${WHISPER_PORT}/v1/asr"
export MAX_IMAGES=1
export WHISPER_MODEL="base"
export ASR_ENDPOINT=http://$host_ip:$WHISPER_PORT
export ASR_PORT=9099
export ASR_SERVICE_PORT=3001
export ASR_SERVICE_ENDPOINT="http://${host_ip}:${ASR_SERVICE_PORT}/v1/audio/transcriptions"

6 changes: 0 additions & 6 deletions MultimodalQnA/docker_image_build/build.yaml
@@ -59,9 +59,3 @@ services:
dockerfile: comps/asr/src/integrations/dependency/whisper/Dockerfile
extends: multimodalqna
image: ${REGISTRY:-opea}/whisper:${TAG:-latest}
asr:
build:
context: GenAIComps
dockerfile: comps/asr/src/Dockerfile
extends: multimodalqna
image: ${REGISTRY:-opea}/asr:${TAG:-latest}
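
With the `asr` entry gone from `build.yaml`, any `service_list` in the test scripts that still names `asr` would ask `docker compose build` for a service that is no longer defined. A small consistency check is sketched below; `missing_services` is a hypothetical helper that reads the defined service names on stdin, one per line, as printed by `docker compose config --services`.

```bash
# Hypothetical helper: print every name in a space-separated service list that
# is absent from the newline-separated list of defined services on stdin.
missing_services() {
  wanted="$1"
  defined=$(cat)
  for svc in $wanted; do
    # -x matches whole lines only and -F disables regex interpretation.
    printf '%s\n' "$defined" | grep -qxF -- "$svc" || echo "$svc"
  done
}

# Example usage from MultimodalQnA/docker_image_build:
#   docker compose -f build.yaml config --services | missing_services "$service_list"
```

An empty result means every requested service is defined in the build file.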
4 changes: 1 addition & 3 deletions MultimodalQnA/tests/test_compose_on_gaudi.sh
@@ -46,7 +46,7 @@ function build_docker_images() {
cd $WORKPATH/docker_image_build
git clone https://github.com/mhbuehler/GenAIComps.git && cd GenAIComps && git checkout "${opea_branch:-"mmqna-image-query"}" && cd ../
echo "Build all the images with --no-cache, check docker_image_build.log for details..."
service_list="multimodalqna multimodalqna-ui embedding-multimodal-bridgetower embedding retriever-redis lvm dataprep-multimodal-redis asr whisper"
service_list="multimodalqna multimodalqna-ui embedding-multimodal-bridgetower embedding retriever-redis lvm dataprep-multimodal-redis whisper"
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log

docker pull ghcr.io/huggingface/tgi-gaudi:2.0.6
@@ -68,8 +68,6 @@ function setup_env() {
export WHISPER_PORT=7066
export MAX_IMAGES=1
export WHISPER_MODEL="base"
export ASR_ENDPOINT=http://$host_ip:$WHISPER_PORT
export ASR_PORT=9099
export ASR_SERVICE_PORT=3001
export ASR_SERVICE_ENDPOINT="http://${host_ip}:${ASR_SERVICE_PORT}/v1/audio/transcriptions"
export DATAPREP_MMR_PORT=6007
2 changes: 1 addition & 1 deletion MultimodalQnA/tests/test_compose_on_rocm.sh
@@ -23,7 +23,7 @@ function build_docker_images() {
git clone https://github.com/opea-project/GenAIComps.git && cd GenAIComps && git checkout "${opea_branch:-"main"}" && cd ../

echo "Build all the images with --no-cache, check docker_image_build.log for details..."
service_list="multimodalqna multimodalqna-ui embedding-multimodal-bridgetower embedding retriever-redis lvm dataprep-multimodal-redis asr whisper"
service_list="multimodalqna multimodalqna-ui embedding-multimodal-bridgetower embedding retriever-redis lvm dataprep-multimodal-redis whisper"
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log

docker images && sleep 1m
4 changes: 1 addition & 3 deletions MultimodalQnA/tests/test_compose_on_xeon.sh
@@ -46,7 +46,7 @@ function build_docker_images() {
cd $WORKPATH/docker_image_build
git clone https://github.com/mhbuehler/GenAIComps.git && cd GenAIComps && git checkout "${opea_branch:-"mmqna-image-query"}" && cd ../
echo "Build all the images with --no-cache, check docker_image_build.log for details..."
service_list="multimodalqna multimodalqna-ui embedding-multimodal-bridgetower embedding retriever-redis lvm-llava lvm dataprep-multimodal-redis asr whisper"
service_list="multimodalqna multimodalqna-ui embedding-multimodal-bridgetower embedding retriever-redis lvm-llava lvm dataprep-multimodal-redis whisper"
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log

docker images && sleep 1m
@@ -61,8 +61,6 @@ function setup_env() {
export WHISPER_PORT=7066
export MAX_IMAGES=1
export WHISPER_MODEL="base"
export ASR_ENDPOINT=http://$host_ip:$WHISPER_PORT
export ASR_PORT=9099
export ASR_SERVICE_PORT=3001
export ASR_SERVICE_ENDPOINT="http://${host_ip}:${ASR_SERVICE_PORT}/v1/audio/transcriptions"
export REDIS_DB_PORT=6379