# Semantic Video Composition via Pre-trained Diffusion Model

paper

This repository is a brief introduction to and case study of the paper "Training-Free Semantic Video Composition via Pre-trained Diffusion Model", published at ICME 2024 (Oral).

## 😁Introduction

The video composition task aims to integrate specified foregrounds and backgrounds from different videos into a harmonious composite. Current approaches, predominantly trained on videos with adjusted foreground color and lighting, struggle to address deep semantic disparities beyond superficial adjustments, such as domain gaps. Therefore, we propose a training-free pipeline employing a pre-trained diffusion model imbued with semantic prior knowledge, which can process composite videos with broader semantic disparities.
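To make the task concrete, the pipeline's input is a naive "copy-paste" composite: a foreground region masked out of one video and pasted onto frames of another. The sketch below illustrates only this input construction step, not the paper's harmonization method; the function name, shapes, and mask convention are illustrative assumptions.

```python
import numpy as np

def composite_frame(fg: np.ndarray, bg: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Paste a foreground frame onto a background frame with a binary mask.

    fg, bg: (H, W, 3) uint8 frames; mask: (H, W) in {0, 1}, 1 = foreground.
    This is the unharmonized composite that a semantic harmonization
    pipeline would then refine (illustrative helper, not from the paper).
    """
    mask3 = mask[..., None].astype(fg.dtype)  # broadcast mask over RGB channels
    return fg * mask3 + bg * (1 - mask3)

if __name__ == "__main__":
    H, W = 4, 4
    fg = np.full((H, W, 3), 200, dtype=np.uint8)  # bright foreground frame
    bg = np.zeros((H, W, 3), dtype=np.uint8)      # dark background frame
    mask = np.zeros((H, W), dtype=np.uint8)
    mask[1:3, 1:3] = 1                            # foreground occupies the center
    out = composite_frame(fg, bg, mask)
    print(out[2, 2], out[0, 0])  # center pixel from fg, corner pixel from bg
```

Applied per frame, this yields the composite video whose semantic disparities (e.g. domain gaps between foreground and background) the proposed pipeline addresses.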

## 🤯Method

Using the pre-trained Stable Diffusion V2-1 as our backbone, we leverage its robust semantic understanding to build a training-free video composition pipeline.

## 🔆Comparison

| Input | TF-ICON | Ours |
| --- | --- | --- |
| ori_bear_autumn.gif | tficon_bear_autumn.gif | ours_bear_autumn.gif |
| ori_blackswan_willow.gif | tficon_blackswan_willow.gif | ours_blackswan_willow.gif |
| ori_camel_desert.gif | tficon_camel_desert.gif | ours_camel_desert.gif |
| ori_car_street.gif | tficon_car_street.gif | ours_car_street.gif |

## 🎞More Cases

morecases.png
