Add image2video microservice (Stable Video Diffusion) #465
Merged
Commits
18 commits
a9d1c47 added image2video microservice. (XinyuYe-Intel)
93264ec [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
72974e1 addition changes. (XinyuYe-Intel)
b9b9833 minor changes (XinyuYe-Intel)
46ba650 Merge branch 'main' into xinyuye/sd (XinyuYe-Intel)
0919145 Merge branch 'main' into xinyuye/sd (chensuyue)
209aea7 added ut test (XinyuYe-Intel)
7019af1 Merge branch 'main' into xinyuye/sd (XinyuYe-Intel)
0144daf Merge branch 'main' into xinyuye/sd (XinyuYe-Intel)
ded30f2 Merge branch 'main' into xinyuye/sd (lvliang-intel)
897877c Merge branch 'main' into xinyuye/sd (kevinintel)
4018668 unified path. (XinyuYe-Intel)
105b2d8 added gaudi support for svd. (XinyuYe-Intel)
5ed0acf [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
bdee37e fix bug (XinyuYe-Intel)
5800be4 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
f81d9e0 fix ut (XinyuYe-Intel)
abc4bea add docker image release file (XinyuYe-Intel)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -18,6 +18,9 @@ | |
| RAGASScores, | ||
| GraphDoc, | ||
| LVMDoc, | ||
| ImagePath, | ||
| ImagesPath, | ||
| VideoPath, | ||
| ) | ||
|
|
||
| # Constants | ||
|
|
||
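The definitions of the three new exports are not part of the diff shown here. Judging only from the JSON body in the README's curl example and the `VideoPath(video_path=...)` call in `image2video.py`, their shapes are roughly as sketched below. This is an illustrative stand-in built on standard-library dataclasses; the field types, and the use of dataclasses at all, are assumptions, and the real classes in `comps` may differ.

```python
from dataclasses import asdict, dataclass, field
from typing import List


# Hypothetical stand-ins: the real classes live in comps and are not shown in this diff.
@dataclass
class ImagePath:
    image_path: str  # local path or URL of one conditioning image


@dataclass
class ImagesPath:
    images_path: List[ImagePath] = field(default_factory=list)


@dataclass
class VideoPath:
    video_path: str  # location of the generated video


def parse_images_path(payload: dict) -> ImagesPath:
    """Build an ImagesPath from the JSON shape used in the README curl example."""
    return ImagesPath(images_path=[ImagePath(**item) for item in payload["images_path"]])


request = parse_images_path({"images_path": [{"image_path": "rocket.png"}]})
# Same flattening step that image2video.py applies before calling the SVD server.
flat = [img.image_path for img in request.images_path]
```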
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,18 @@ | ||
| # Copyright (C) 2024 Intel Corporation | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
|
|
||
| FROM python:3.11-slim | ||
|
|
||
| # Set environment variables | ||
| ENV LANG=en_US.UTF-8 | ||
|
|
||
| COPY comps /home/comps | ||
|
|
||
| RUN pip install --no-cache-dir --upgrade pip && \ | ||
| pip install --no-cache-dir -r /home/comps/image2video/requirements.txt | ||
|
|
||
| ENV PYTHONPATH=$PYTHONPATH:/home | ||
|
|
||
| WORKDIR /home/comps/image2video | ||
|
|
||
| ENTRYPOINT ["python", "image2video.py"] | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,66 @@ | ||
| # Image-to-Video Microservice | ||
|
|
||
| Image-to-Video is a task that generates a video conditioned on the provided image(s). This microservice supports the image-to-video task by using the Stable Video Diffusion (SVD) model. | ||
|
|
||
| # 🚀1. Start Microservice with Python (Option 1) | ||
|
|
||
| ## 1.1 Install Requirements | ||
|
|
||
| ```bash | ||
| pip install -r requirements.txt | ||
| pip install -r svd/requirements.txt | ||
| ``` | ||
|
|
||
| ## 1.2 Start SVD Service | ||
|
|
||
| ```bash | ||
| # Start SVD service | ||
| cd svd/ | ||
| python svd_server.py | ||
| ``` | ||
|
|
||
| ## 1.3 Start Image-to-Video Microservice | ||
|
|
||
| ```bash | ||
| cd .. | ||
| # Start the OPEA Microservice | ||
| python image2video.py | ||
| ``` | ||
|
|
||
| # 🚀2. Start Microservice with Docker (Option 2) | ||
|
|
||
| ## 2.1 Build Images | ||
|
|
||
| ### 2.1.1 SVD Server Image | ||
|
|
||
| ```bash | ||
| cd ../.. | ||
| docker build -t opea/svd:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/image2video/svd/Dockerfile . | ||
| ``` | ||
|
|
||
| ### 2.1.2 Image-to-Video Service Image | ||
|
|
||
| ```bash | ||
| docker build -t opea/image2video:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/image2video/Dockerfile . | ||
| ``` | ||
|
|
||
| ## 2.2 Start SVD and Image-to-Video Service | ||
|
|
||
| ### 2.2.1 Start SVD server | ||
|
|
||
| ```bash | ||
| docker run --ipc=host -p 9368:9368 -e http_proxy=$http_proxy -e https_proxy=$https_proxy opea/svd:latest | ||
| ``` | ||
|
|
||
| ### 2.2.2 Start Image-to-Video service | ||
|
|
||
| ```bash | ||
| ip_address=$(hostname -I | awk '{print $1}') | ||
| docker run -p 9369:9369 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e SVD_ENDPOINT=http://$ip_address:9368 opea/image2video:latest | ||
| ``` | ||
|
|
||
| ### 2.2.3 Test | ||
|
|
||
| ```bash | ||
| http_proxy="" curl http://localhost:9369/v1/image2video -XPOST -d '{"images_path":[{"image_path":"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/rocket.png"}]}' -H 'Content-Type: application/json' | ||
| ``` |
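For callers that prefer Python over curl, the same request could be issued roughly as follows. This is a minimal sketch using only the standard library; `build_request` and `generate_video` are hypothetical helper names, and the 600-second timeout is an arbitrary assumption, since the README does not state how long generation takes. The endpoint, port, and JSON body follow the curl example above.

```python
import json
import urllib.request

ENDPOINT = "http://localhost:9369/v1/image2video"  # port and route from the README


def build_request(image_urls, endpoint=ENDPOINT):
    """Return a POST request carrying the same JSON body as the curl example."""
    body = {"images_path": [{"image_path": url} for url in image_urls]}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_video(image_urls, endpoint=ENDPOINT, timeout=600):
    """Send the request and return the reported video path (needs a running service)."""
    # An empty ProxyHandler bypasses any configured proxy, like http_proxy="" in the curl call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(build_request(image_urls, endpoint), timeout=timeout) as resp:
        return json.load(resp)["video_path"]
```

Note that the returned `video_path` is a path on the SVD server's filesystem, so a client on another host or outside the container would still need some way to fetch the file.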
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| # Copyright (C) 2024 Intel Corporation | ||
| # SPDX-License-Identifier: Apache-2.0 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,47 @@ | ||
| # Copyright (C) 2024 Intel Corporation | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
|
|
||
|
|
||
| import json | ||
| import os | ||
| import time | ||
|
|
||
| import requests | ||
|
|
||
| from comps import ( | ||
| ImagesPath, | ||
| ServiceType, | ||
| VideoPath, | ||
| opea_microservices, | ||
| register_microservice, | ||
| register_statistics, | ||
| statistics_dict, | ||
| ) | ||
|
|
||
|
|
||
| @register_microservice( | ||
| name="opea_service@image2video", | ||
| service_type=ServiceType.IMAGE2VIDEO, | ||
| endpoint="/v1/image2video", | ||
| host="0.0.0.0", | ||
| port=9369, | ||
| input_datatype=ImagesPath, | ||
| output_datatype=VideoPath, | ||
| ) | ||
| @register_statistics(names=["opea_service@image2video"]) | ||
| async def image2video(input: ImagesPath): | ||
| start = time.time() | ||
| images_path = [img.image_path for img in input.images_path] | ||
| inputs = {"images_path": images_path} | ||
| video_path = requests.post(url=f"{svd_endpoint}/generate", data=json.dumps(inputs), proxies={"http": None}).json()[ | ||
| "video_path" | ||
| ] | ||
|
|
||
| statistics_dict["opea_service@image2video"].append_latency(time.time() - start, None) | ||
| return VideoPath(video_path=video_path) | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| svd_endpoint = os.getenv("SVD_ENDPOINT", "http://localhost:9368") | ||
| print("Image2video server started.") | ||
| opea_microservices["opea_service@image2video"].start() |
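One thing worth noting about the handler: it is declared `async` but calls the blocking `requests.post`, so the event loop is stalled for as long as SVD takes to generate the video. A possible alternative, sketched below with a stub in place of the real HTTP call, is to push the blocking call onto a worker thread with `asyncio.to_thread` (Python 3.9+). The sketch also reads `SVD_ENDPOINT` at import time rather than under `__main__`, so the handler would not depend on a global that exists only when the file is run as a script. `post_json`, `forward_to_svd`, and the stub's return value are hypothetical and not part of the PR.

```python
import asyncio
import os
import time

# Read once at import time so the handler also works when the module is imported.
SVD_ENDPOINT = os.getenv("SVD_ENDPOINT", "http://localhost:9368")


def post_json(url, payload):
    """Stub for a blocking HTTP POST (requests.post in the PR); sleeps to mimic latency."""
    time.sleep(0.2)
    return {"video_path": "/tmp/generated.mp4"}


async def forward_to_svd(images_path, post=post_json):
    """Run the blocking POST in a worker thread so the event loop stays responsive."""
    inputs = {"images_path": images_path}
    result = await asyncio.to_thread(post, f"{SVD_ENDPOINT}/generate", inputs)
    return result["video_path"]


async def demo():
    ticks = 0

    async def heartbeat():
        # Counts how often the event loop gets to run while the POST is in flight.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.02)
            ticks += 1

    beat = asyncio.create_task(heartbeat())
    video_path = await forward_to_svd(["rocket.png"])
    beat.cancel()
    return video_path, ticks


video_path, ticks = asyncio.run(demo())
```

If the heartbeat counter advances while the stubbed call sleeps, other requests would likewise be served during a real generation.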
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,11 @@ | ||
| datasets | ||
| docarray[full] | ||
| fastapi | ||
| opentelemetry-api | ||
| opentelemetry-exporter-otlp | ||
| opentelemetry-sdk | ||
| prometheus-fastapi-instrumentator | ||
| pydantic==2.7.2 | ||
| pydub | ||
| shortuuid | ||
| uvicorn |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| # Copyright (C) 2024 Intel Corporation | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
|
|
||
| FROM python:3.11-slim | ||
|
|
||
| # Set environment variables | ||
| ENV LANG=en_US.UTF-8 | ||
|
|
||
| ARG ARCH="cpu" | ||
|
|
||
| COPY comps /home/comps | ||
|
|
||
| RUN apt-get update && apt-get install python3-opencv -y && \ | ||
| pip install --no-cache-dir --upgrade pip && \ | ||
| if [ "${ARCH}" = "cpu" ]; then pip install --no-cache-dir torch torchvision --index-url https://download.pytorch.org/whl/cpu; fi && \ | ||
| pip install --no-cache-dir -r /home/comps/image2video/svd/requirements.txt | ||
|
|
||
| ENV PYTHONPATH=$PYTHONPATH:/home | ||
|
|
||
| WORKDIR /home/comps/image2video/svd | ||
|
|
||
| ENTRYPOINT ["python", "svd_server.py"] |
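Because the entry point runs `svd_server.py`, which loads the model before it calls `uvicorn.run`, port 9368 is not open immediately after `docker run`. A script that starts both containers may want to wait for the SVD server first. The probe below is a minimal sketch; `wait_for_port` is a hypothetical helper and the default timeout and interval are arbitrary assumptions.

```python
import socket
import time


def wait_for_port(host, port, timeout=300.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means the server is listening; close it right away.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("localhost", 9368)` could be called before starting the Image-to-Video container. An open port only shows that the server is listening, not that a generation request will succeed.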
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| accelerate | ||
| diffusers | ||
| fastapi | ||
| opencv-python | ||
| torch | ||
| transformers | ||
| uvicorn |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,55 @@ | ||
| # Copyright (C) 2024 Intel Corporation | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| """Stand-alone Stable Video Diffusion FastAPI Server.""" | ||
|
|
||
| import argparse | ||
| import os | ||
| import time | ||
|
|
||
| import torch | ||
| import uvicorn | ||
| from diffusers import StableVideoDiffusionPipeline | ||
| from diffusers.utils import export_to_video, load_image | ||
| from fastapi import FastAPI, Request | ||
| from fastapi.responses import JSONResponse, Response | ||
|
|
||
| app = FastAPI() | ||
|
|
||
|
|
||
| @app.post("/generate") | ||
|
||
| async def generate(request: Request) -> Response: | ||
| print("SVD generation begin.") | ||
| request_dict = await request.json() | ||
| images_path = request_dict.pop("images_path") | ||
|
|
||
| start = time.time() | ||
| images = [load_image(img) for img in images_path] | ||
| images = [image.resize((1024, 576)) for image in images] | ||
|
|
||
| generator = torch.manual_seed(args.seed) | ||
| frames = pipe(images, decode_chunk_size=8, generator=generator).frames[0] | ||
| video_path = os.path.join(os.getcwd(), args.video_path) | ||
| export_to_video(frames, video_path, fps=7) | ||
| end = time.time() | ||
| print(f"SVD video output in {video_path}, time = {end-start}s") | ||
| return JSONResponse({"video_path": video_path}) | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| parser = argparse.ArgumentParser() | ||
| parser.add_argument("--host", type=str, default="0.0.0.0") | ||
| parser.add_argument("--port", type=int, default=9368) | ||
| parser.add_argument("--model_name_or_path", type=str, default="stabilityai/stable-video-diffusion-img2vid-xt") | ||
| parser.add_argument("--video_path", type=str, default="generated.mp4") | ||
| parser.add_argument("--seed", type=int, default=42) | ||
|
|
||
| args = parser.parse_args() | ||
| pipe = StableVideoDiffusionPipeline.from_pretrained(args.model_name_or_path) | ||
| print("Stable Video Diffusion model initialized.") | ||
|
|
||
| uvicorn.run( | ||
| app, | ||
| host=args.host, | ||
| port=args.port, | ||
| log_level="debug", | ||
| ) | ||
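The handler resizes every input straight to 1024x576, which stretches any image that is not already 16:9. If that distortion matters, one option (not part of this PR) is to center-crop to the target aspect ratio before resizing. The helper below only computes the crop box, which could then be passed to PIL's `Image.crop` ahead of `resize`; `center_crop_box` is a hypothetical name.

```python
def center_crop_box(width, height, target_w=1024, target_h=576):
    """Return (left, top, right, bottom) of the largest centered box with the target aspect ratio."""
    if width * target_h > height * target_w:
        # Source is wider than the target ratio: keep full height, trim the sides.
        new_w = round(height * target_w / target_h)
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)
    # Source is taller than (or equal to) the target ratio: keep full width, trim top and bottom.
    new_h = round(width * target_h / target_w)
    top = (height - new_h) // 2
    return (0, top, width, top + new_h)
```

With a PIL image, usage would look like `image.crop(center_crop_box(*image.size)).resize((1024, 576))`.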