Diffusion Sprite Generator

A PyTorch implementation of a Denoising Diffusion Probabilistic Model (DDPM) that generates 16x16 sprite images. It uses a U-Net-based architecture with context embeddings, so generation can be conditioned on a label.

Overview

The model is trained to reverse a gradual noising process: given a noised sprite and its timestep, it predicts the noise to remove. The network, ContextUnet, is a U-Net with residual connections plus timestep and context embeddings, which make conditional generation possible.

Features

DDPM Implementation: Full implementation of the Denoising Diffusion Probabilistic Model
U-Net Architecture: Context-aware U-Net with residual blocks for noise prediction
Sprite Generation: Optimized for 16x16 RGB sprite images
Context Embeddings: Support for conditional generation with context labels
Training & Inference: Complete training pipeline and generation script

Project Structure

diffusion-sprite/
├── config.py          # Hyperparameters and configuration
├── dataset.py         # Custom PyTorch dataset for sprite data
├── model.py           # ContextUnet architecture and components
├── train.py           # Training script
├── generate.py        # Generation/sampling script
├── utils.py           # Utility functions for visualization
└── requirements.txt   # Python dependencies

Installation

Clone the repository:

git clone <repository-url>
cd diffusion-sprite

Install dependencies:

pip install -r requirements.txt

Configuration

The config.py file contains all hyperparameters:

Network: N_FEAT (64), N_CFEAT (5), HEIGHT (16)
Diffusion: TIMESTEPS (500), BETA1 (1e-4), BETA2 (0.02)
Training: BATCH_SIZE (100), N_EPOCH (32), LRATE (1e-3)

You can modify these values to adjust model capacity, training duration, and diffusion schedule.
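For orientation, the values above could be laid out in config.py roughly as follows. The constant names and numbers are taken from the list above; the comments describing what each one controls are assumptions based on the names, not a copy of the actual file.

```python
# config.py (sketch): names and values mirror the list above.

# Network
N_FEAT = 64      # base number of feature channels (interpretation assumed)
N_CFEAT = 5      # size of the context vector (interpretation assumed)
HEIGHT = 16      # sprite height and width in pixels

# Diffusion
TIMESTEPS = 500  # number of noising steps T
BETA1 = 1e-4     # noise-schedule value at the first step
BETA2 = 0.02     # noise-schedule value at the last step

# Training
BATCH_SIZE = 100
N_EPOCH = 32
LRATE = 1e-3     # initial learning rate
```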

Data Format

The dataset expects two NumPy files:

Sprite images: Shape (N, 16, 16, 3) - RGB sprite images
Labels: Shape (N,) - Context labels for each sprite

Example dataset paths (as used in train.py):

../datasets/sprites/sprites_1788_16x16.npy
../datasets/sprites/sprite_labels_nc_1788_16x16.npy
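Before training on your own data, it can help to confirm that the arrays match the shapes listed above. The sketch below is one way to do that; check_sprite_arrays is a hypothetical helper that is not part of this repository, and the arrays are synthetic stand-ins so the snippet runs without the dataset files.

```python
import numpy as np

def check_sprite_arrays(sprites, labels, height=16):
    """Raise ValueError unless the arrays match the documented shapes."""
    if sprites.ndim != 4 or sprites.shape[1:] != (height, height, 3):
        raise ValueError(
            f"expected sprites of shape (N, {height}, {height}, 3), got {sprites.shape}"
        )
    if labels.shape[0] != sprites.shape[0]:
        raise ValueError(
            f"{sprites.shape[0]} sprites but {labels.shape[0]} labels"
        )
    return sprites.shape[0]

# Synthetic stand-ins; with real data, call np.load(...) on each of the two paths.
rng = np.random.default_rng(0)
sprites = rng.integers(0, 256, size=(8, 16, 16, 3)).astype(np.uint8)
labels = rng.integers(0, 5, size=(8,))

n_sprites = check_sprite_arrays(sprites, labels)
print(f"{n_sprites} sprites, dtype {sprites.dtype}")
```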

Usage

Training

Train the diffusion model on your sprite dataset:

python train.py

The training script will:

Load the dataset from the specified paths
Train the model for the configured number of epochs
Save model checkpoints every 4 epochs to ./weights/
Use linear learning rate decay

Note: Update the dataset paths in train.py (line 18) to point to your data files.
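As a rough illustration of the last two points, the sketch below computes a linearly decaying learning rate and marks every fourth epoch for a checkpoint. The exact decay formula, the 0-based epoch numbering, and which epochs are saved are assumptions; train.py may differ in these details.

```python
def linear_lr(epoch, n_epoch=32, lrate=1e-3):
    """Learning rate for a 0-based epoch, decaying linearly toward zero."""
    return lrate * (1 - epoch / n_epoch)

def is_checkpoint_epoch(epoch, every=4):
    """True for epochs on which a checkpoint would be written (assumed rule)."""
    return epoch % every == 0

for ep in range(32):
    marker = "  <- save checkpoint" if is_checkpoint_epoch(ep) else ""
    print(f"epoch {ep:2d}  lr {linear_lr(ep):.6f}{marker}")
```

In a PyTorch training loop, a rate computed this way is typically applied at the start of each epoch with optimizer.param_groups[0]["lr"] = linear_lr(ep).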

Generation

Generate new sprites from a trained model:

python generate.py

The generation script will:

Load the trained model from ./weights/model_trained.pth
Generate 20 samples using the DDPM sampling process
Save intermediate generation steps to intermediate_images/ directory
Display progress for each timestep

Note: generate.py loads ./weights/model_trained.pth, while training saves checkpoints as model_{epoch}.pth (see Output). Copy or rename the checkpoint you want to use to model_trained.pth, or update the path in generate.py.
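The DDPM sampling process mentioned above can be sketched in NumPy as follows. This is the standard ancestral sampling update from the DDPM paper with a linear beta schedule, not a copy of generate.py. The predict_noise callable stands in for the trained model and its signature is hypothetical; context labels are left out for brevity.

```python
import numpy as np

def ddpm_sample(predict_noise, n_sample=20, height=16,
                timesteps=500, beta1=1e-4, beta2=0.02, seed=0):
    """Run the reverse diffusion loop from pure noise down to t = 1."""
    rng = np.random.default_rng(seed)

    # Linear schedule; index t holds the value for timestep t.
    b_t = np.linspace(beta1, beta2, timesteps + 1)
    a_t = 1.0 - b_t
    ab_t = np.cumprod(a_t)

    # Start from Gaussian noise in channels-first layout.
    x = rng.standard_normal((n_sample, 3, height, height))

    for t in range(timesteps, 0, -1):
        # No fresh noise is added on the final step.
        z = rng.standard_normal(x.shape) if t > 1 else 0.0
        eps = predict_noise(x, t / timesteps)
        # Remove the predicted noise, then rescale.
        mean = (x - eps * (1.0 - a_t[t]) / np.sqrt(1.0 - ab_t[t])) / np.sqrt(a_t[t])
        x = mean + np.sqrt(b_t[t]) * z
    return x

# Dummy predictor that always returns zeros, only to exercise the loop.
samples = ddpm_sample(lambda x, t: np.zeros_like(x), timesteps=50)
print(samples.shape)
```

With a trained network in place of the dummy predictor, saving x every few iterations would give the kind of intermediate grids described above.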

Model Architecture

The ContextUnet model consists of:

Encoder (Downsampling):
- Initial residual convolution block
- Two downsampling blocks with residual connections
- Average pooling to bottleneck
Context & Time Embeddings:
- Separate embeddings for timestep and context labels
- Multi-layer feedforward networks
Decoder (Upsampling):
- Transposed convolutions for upsampling
- Skip connections from encoder
- Residual blocks for feature refinement
- Final convolution to output channels
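To make the encoder and decoder symmetry concrete, the sketch below traces tensor shapes through one plausible layout of such a network. The stage names, channel counts, and the assumption that each downsampling block halves the resolution are illustrative guesses, not read from model.py.

```python
def trace_shapes(n_feat=64, height=16, in_channels=3):
    """Return (stage, channels, height, width) tuples for one assumed layout."""
    stages = [("input", in_channels, height, height)]
    stages.append(("initial residual conv", n_feat, height, height))

    h = height // 2
    stages.append(("down 1", n_feat, h, h))
    h //= 2
    stages.append(("down 2", 2 * n_feat, h, h))

    # Average pooling collapses the remaining spatial extent.
    stages.append(("bottleneck (avg pool)", 2 * n_feat, 1, 1))

    # Transposed convolutions restore resolution; skips reuse encoder features.
    stages.append(("up 0", 2 * n_feat, h, h))
    h *= 2
    stages.append(("up 1 (+ skip from down 2)", n_feat, h, h))
    h *= 2
    stages.append(("up 2 (+ skip from down 1)", n_feat, h, h))
    stages.append(("final conv", in_channels, h, h))
    return stages

for name, channels, h, w in trace_shapes():
    print(f"{name:28s} {channels:4d} x {h:2d} x {w:2d}")
```

The point of the trace is that the output has the same shape as the input, which is what a noise-prediction network needs.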

How It Works

Forward Process (Training): Images are gradually corrupted with Gaussian noise over T timesteps (TIMESTEPS in config.py), following the schedule set by BETA1 and BETA2
Reverse Process (Generation): The model learns to predict the noise present at each timestep so that it can be removed
Sampling: Starting from pure Gaussian noise, the model denoises iteratively from timestep T down to 1 to generate new images
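The forward process has a closed form, so a training example at any timestep can be produced in one step instead of by adding noise repeatedly. The NumPy sketch below uses the standard DDPM formulation with a linear beta schedule built from the configured values; the function name perturb and the random stand-in images are assumptions for illustration.

```python
import numpy as np

TIMESTEPS, BETA1, BETA2 = 500, 1e-4, 0.02

# Linear schedule; index t holds the value for timestep t.
b_t = np.linspace(BETA1, BETA2, TIMESTEPS + 1)
ab_t = np.cumprod(1.0 - b_t)  # cumulative product: fraction of signal kept

def perturb(x0, t, noise):
    """Noise a clean batch x0 directly to timestep t."""
    return np.sqrt(ab_t[t]) * x0 + np.sqrt(1.0 - ab_t[t]) * noise

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(4, 3, 16, 16))  # stand-in for normalized sprites
noise = rng.standard_normal(x0.shape)

x_early = perturb(x0, 1, noise)         # almost identical to x0
x_late = perturb(x0, TIMESTEPS, noise)  # almost pure noise

# Training regresses the model output onto `noise`, typically with a
# mean-squared-error loss between predicted and true noise.
print(f"signal kept at t=1: {ab_t[1]:.4f}, at t=T: {ab_t[TIMESTEPS]:.4f}")
```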

Output

Training: Model checkpoints saved to ./weights/model_{epoch}.pth
Generation:
- Final generated sprites
- Intermediate generation grids saved to intermediate_images/ showing the denoising process
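Because training writes one file per saved epoch, a small helper can pick the most recent one. The sketch below assumes filenames follow the model_{epoch}.pth pattern above with a plain integer epoch; latest_checkpoint is a hypothetical helper that does not exist in this repository, and the demo uses empty placeholder files in a temporary directory.

```python
import re
import tempfile
from pathlib import Path

def latest_checkpoint(weights_dir):
    """Return the model_{epoch}.pth path with the highest epoch, or None."""
    pattern = re.compile(r"model_(\d+)\.pth$")
    found = []
    for path in Path(weights_dir).glob("model_*.pth"):
        match = pattern.match(path.name)
        if match:  # skips names such as model_trained.pth
            found.append((int(match.group(1)), path))
    if not found:
        return None
    return max(found, key=lambda item: item[0])[1]

# Demo with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("model_0.pth", "model_4.pth", "model_28.pth", "model_trained.pth"):
        (Path(tmp) / name).touch()
    newest_name = latest_checkpoint(tmp).name

print(newest_name)
```

Epochs are compared as integers, so model_28.pth ranks above model_4.pth even though it sorts earlier as a string.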

Requirements

Python 3.8+ (required by PyTorch 2.0)
PyTorch >= 2.0.0
torchvision >= 0.15.0
numpy >= 1.21.0
matplotlib >= 3.4.0
tqdm >= 4.65.0

Acknowledgments

This implementation is based on the Denoising Diffusion Probabilistic Models (DDPM) paper by Ho et al. (2020).
