SALAD: Skeleton-Aware Latent Diffusion Model for Text-driven Motion Generation and Editing (CVPR 2025)




Seokhyeon Hong, Chaelin Kim, Serin Yoon, Junghyun Nam, Sihun Cha, Junyong Noh

1. Preparation


Environment

conda create -n salad python=3.9 -y
conda activate salad
pip install torch==1.13.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
pip install -r requirements.txt

We tested our code on Python 3.9.19 and PyTorch 1.13.1+cu117.
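After installation, an optional sanity check confirms that the expected PyTorch version and CUDA support are active:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"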

Please note that requirements.txt does not include PyTorch, since its installation depends on your specific hardware and system configuration. To install PyTorch, follow the official installation instructions for your environment at https://pytorch.org/get-started/locally/.
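As an illustrative alternative (not an official recommendation of this repository), a CPU-only install of the same PyTorch version uses the corresponding wheel index:

pip install torch==1.13.1 --extra-index-url https://download.pytorch.org/whl/cpu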

Dataset

We used the HumanML3D and KIT-ML datasets; both can be obtained by following the instructions in the HumanML3D repository: https://github.com/EricGuo5513/HumanML3D.

After downloading the datasets, copy or symlink them into the following structure:

salad
└─ dataset
   ├─ humanml3d
   └─ kit-ml
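For example, assuming the datasets were extracted to ~/data/HumanML3D and ~/data/KIT-ML (illustrative paths), they could be linked with:

mkdir -p dataset
ln -s ~/data/HumanML3D dataset/humanml3d
ln -s ~/data/KIT-ML dataset/kit-ml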

Evaluation & Pre-trained Weights

We provide pre-trained weights for both the HumanML3D and KIT-ML datasets. To download them, run the following commands:

bash prepare/download_t2m.sh
bash prepare/download_kit.sh

These scripts will download the pre-trained weights for the SALAD model and evaluation models trained on each dataset.

Additionally, for evaluation, you will need to download the GloVe word embeddings as well:

bash prepare/download_glove.sh

2. Playground

Generation & Editing

Follow playground.ipynb to try out text-driven motion generation and editing in a Jupyter notebook.

Attention Map Visualization

Follow visualize_attn.ipynb if you want to see the attention maps produced during the generation process.
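Both notebooks run in a standard Jupyter session (assuming Jupyter is installed in the salad environment):

pip install notebook
jupyter notebook playground.ipynb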

3. Training and Evaluation

Training from Scratch

To train the VAE, run:

python train_vae.py --name vae_example --vae_type vae --lambda_kl 2e-2 --activation silu --dataset_name t2m

To train the denoiser, run:

python train_denoiser.py --name denoiser_example --vae_name vae_example --latent_dim 256 --n_heads 8 --ff_dim 1024 --n_layers 5 --num_inference_timesteps 20
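To train on KIT-ML instead, the dataset is presumably selected via the same --dataset_name flag; the value kit below is an assumption and has not been verified against the argument parser:

python train_vae.py --name vae_kit_example --vae_type vae --lambda_kl 2e-2 --activation silu --dataset_name kit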

Evaluation

python test_vae.py --name vae_example
python test_denoiser.py --name denoiser_example --num_inference_timesteps 50

Evaluation results will be saved to checkpoints/{vae_name | denoiser_name}/eval/eval.log.
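For example, to view the denoiser results from the run above:

cat checkpoints/denoiser_example/eval/eval.log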
