ComfyUI-multiGPU-upscaler

Multi-GPU batch-parallel upscaling nodes for ComfyUI.

Features

This extension is designed to:

- Use 1–10 GPUs efficiently.
- By default, auto-detect and use up to 2 GPUs.
- Split batched images across GPUs and upscale them in parallel.
- Use robust tiled upscaling with OOM-safe fallback.
- Work well with RealESRGAN / ESRGAN style models (e.g. RealESRGAN_4xplus).

Tested with:

- Dual RTX 3060 setup
- SDXL generation + 4x RealESRGAN upscaling, batch 4–8
- Achieved measurable speedups vs single-GPU upscaling.

Requirements

NVIDIA GPUs only. This extension relies on CUDA for device management and communication.

Performance

Here are some sample benchmarks comparing a standard Upscale Image (using Model) node against the multiGPU_upscaler node.

- Test Setup: Dual RTX 3060
- Workflow: SDXL Generation + 4x RealESRGAN Upscaling
- Resolution: 1024x1024 upscaled to 4K (4096x4096)
- Batch Size: 8 (split as 4 images per GPU in the multi-GPU test)

| Run Type | Standard Upscaler (1 GPU) | multiGPU Upscaler (2 GPUs) | Speedup |
| --- | --- | --- | --- |
| Cold Run (gen0) | 261.81s | 233.53s | ~10.8% |
| Run 1 (gen1) | 259.57s | 223.22s | ~14.0% |
| Run 2 (gen2) | 251.65s | 226.82s | ~9.9% |

The multi-GPU node took roughly 10 to 14% less time in every run, with the largest gain on Run 1 (gen1), by parallelizing the upscale task across both GPUs.
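The Speedup figures correspond to the relative reduction in run time versus the single-GPU run. A small sketch that reproduces them from the timings above (the function name `time_saved_percent` is only illustrative):

```python
# Timings in seconds, copied from the benchmark table: (1 GPU, 2 GPUs).
runs = {
    "Cold Run (gen0)": (261.81, 233.53),
    "Run 1 (gen1)": (259.57, 223.22),
    "Run 2 (gen2)": (251.65, 226.82),
}


def time_saved_percent(single_gpu_s, multi_gpu_s):
    """Percent reduction in wall-clock time relative to the single-GPU run."""
    return (single_gpu_s - multi_gpu_s) / single_gpu_s * 100.0


speedups = {
    name: round(time_saved_percent(single, multi), 1)
    for name, (single, multi) in runs.items()
}

for name, pct in speedups.items():
    print(f"{name}: ~{pct}% less time")
```

Note that this is a time reduction, not a throughput ratio: a 14% reduction means the two-GPU run took about 86% of the single-GPU time.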

Installation

Go to your ComfyUI custom_nodes directory.

Example:
```
cd ComfyUI/custom_nodes
```

Clone this repository:

git clone https://github.com/alludus/ComfyUI-multiGPU-upscaler.git

Or download the ZIP from GitHub and extract it to:

ComfyUI/custom_nodes/ComfyUI-multiGPU-upscaler/

Ensure the structure looks like this:

ComfyUI/custom_nodes/ComfyUI-multiGPU-upscaler/__init__.py
ComfyUI/custom_nodes/ComfyUI-multiGPU-upscaler/multiGPU_upscaler.py

Restart ComfyUI.

All nodes will appear under the category:

multiGPU_upscaler

Nodes

multiGPU_upscaler: Multi-GPU Batch Parallel

Main node. Splits the batch across multiple GPUs and upscales in parallel.

Best for:

- Batch size ≥ 4
- 2 or more GPUs
- Post-generation upscaling (e.g. SDXL → 8 images → 4x RealESRGAN upscale)

Inputs:
- upscale_model: The upscale model to run. Load it with a standard Load Upscale Model node.
- image: Batched input tensor from ComfyUI.
- device_list:
  - How to select GPUs.
  - auto (default): Uses up to auto_max_devices GPUs with the most free VRAM.
  - Custom list (Examples: cuda:0,cuda:1 or 0,1,2).
  - Up to 10 GPUs supported.
- auto_max_devices:
  - Default: 2
  - Used only when device_list = "auto".
  - Limits the number of GPUs auto mode uses.
- primary_share:
  - Default: 0.5
  - Approximate fraction of the batch assigned to the first (best) GPU.
  - If one GPU is stronger or has more free VRAM, increase (e.g. 0.7).
- tile_size:
  - Default: 512
  - Starting tile size on all GPUs. Automatically reduced on OOM.
- min_tile_size:
  - Default: 128
  - Smallest allowed tile size before failing.
- overlap:
  - Default: 32
  - Tile overlap in pixels.
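To illustrate the two device_list formats described above, here is a minimal sketch of how such a string could be turned into device names. `parse_device_list` and its `available_count` parameter are hypothetical and not taken from the extension's code; in particular, the real auto mode ranks GPUs by free VRAM, which this sketch does not do.

```python
def parse_device_list(spec, available_count, auto_max_devices=2):
    """Turn a device_list string into CUDA device names.

    Hypothetical helper for illustration, not the extension's actual code.
    `available_count` is the number of GPUs visible to the process.
    """
    spec = spec.strip().lower()
    if spec == "auto":
        # Assumption: take the first N GPUs. The real node picks the GPUs
        # with the most free VRAM instead.
        indices = list(range(min(auto_max_devices, available_count)))
    else:
        indices = []
        for token in spec.split(","):
            token = token.strip()
            if not token:
                continue
            # Accept both "cuda:1" and "1".
            if token.startswith("cuda:"):
                token = token[len("cuda:"):]
            index = int(token)
            # Filter by availability and drop duplicates.
            if 0 <= index < available_count and index not in indices:
                indices.append(index)
    # The README states that up to 10 GPUs are supported.
    return [f"cuda:{i}" for i in indices[:10]]


print(parse_device_list("cuda:0,cuda:1", available_count=2))
print(parse_device_list("0,1,2", available_count=2))
print(parse_device_list("auto", available_count=4))
```

With only two GPUs present, the `0,1,2` example drops index 2, which mirrors the "filtered by availability" behavior.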
Behavior:
- Determines GPUs:
  - If device_list is a custom list: Uses exactly those devices (filtered by availability).
  - If device_list = "auto": Uses up to auto_max_devices GPUs with the most free VRAM.
- Splits Batch:
  - First GPU receives about primary_share of the images.
  - Remaining GPUs share the rest.
- Executes:
  - Spawns a worker thread for each GPU.
  - Each worker instantiates its own copy of the model on that GPU.
  - Each worker runs tiled upscaling on its subset with OOM-safe tiling.
- Finishes:
  - Outputs are concatenated in the original batch order.
  - If any worker raises an error or runs out of memory, the node falls back to a single-GPU tiled upscale on the best available GPU.
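The split, execute, and finish steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the extension's implementation: `split_batch`, `upscale_parallel`, and `fake_upscale` are hypothetical names, images are stand-in strings rather than tensors, and the rounding rule for primary_share is a guess.

```python
from concurrent.futures import ThreadPoolExecutor


def split_batch(batch_size, num_devices, primary_share=0.5):
    """Return how many images each device gets; index 0 is the primary GPU."""
    if num_devices <= 1:
        return [batch_size]
    # Assumed rounding rule: nearest integer, at least one image.
    primary = min(batch_size, max(1, round(batch_size * primary_share)))
    base, extra = divmod(batch_size - primary, num_devices - 1)
    return [primary] + [base + (1 if i < extra else 0) for i in range(num_devices - 1)]


def upscale_parallel(images, devices, upscale_fn, primary_share=0.5):
    """Upscale `images` across `devices` (non-empty), keeping batch order.

    `upscale_fn(chunk, device)` stands in for tiled upscaling on one GPU
    and must return a list of results for that chunk.
    """
    counts = split_batch(len(images), len(devices), primary_share)
    chunks, start = [], 0
    for count in counts:
        chunks.append(images[start:start + count])
        start += count
    try:
        with ThreadPoolExecutor(max_workers=len(devices)) as pool:
            # map() yields results in submission order, so concatenating
            # them restores the original batch order.
            results = list(pool.map(upscale_fn, chunks, devices))
    except Exception:
        # Any worker failure: redo the whole batch on the first device.
        return upscale_fn(images, devices[0])
    return [img for chunk in results for img in chunk]


def fake_upscale(chunk, device):
    # Stand-in worker that just tags each image with its device.
    return [f"{img}@{device}" for img in chunk]


print(split_batch(8, 2, 0.5))
print(upscale_parallel(list("abcdefgh"), ["cuda:0", "cuda:1"], fake_upscale))
```

Threads rather than processes are a reasonable fit here because GPU work typically releases the Python GIL while kernels run; the README confirms only that one worker thread is spawned per GPU.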

Recommended Settings

For a setup like:

- 2x RTX 3060
- SDXL generation
- RealESRGAN_4xplus 4x upscaling
- Batch size 4–8

Recommended Node: multiGPU_upscaler: Multi-GPU Batch Parallel

Settings:
- device_list: auto
- auto_max_devices: 2
- primary_share: 0.5
- tile_size: 512
- min_tile_size: 128
- overlap: 32

This configuration lets the extension pick the two best GPUs, splits work evenly, and uses robust tiling.
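To give a feel for what tile_size 512 and overlap 32 mean for a 1024x1024 input, here is a sketch of one common way to lay out overlapping tiles. The layout rule (fixed stride of tile_size minus overlap, with the last tile pushed flush to the edge) is an assumption; the extension's actual tiling may differ. `tile_origins` is a hypothetical helper and expects overlap to be smaller than tile_size.

```python
def tile_origins(length, tile_size=512, overlap=32):
    """Start offsets of tiles covering `length` pixels along one axis.

    Assumed layout: step by (tile_size - overlap), then place a final
    tile flush with the far edge so every pixel is covered.
    """
    if length <= tile_size:
        return [0]
    stride = tile_size - overlap
    origins = list(range(0, length - tile_size, stride))
    origins.append(length - tile_size)
    return origins


xs = tile_origins(1024)
ys = tile_origins(1024)
tiles = [(x, y) for y in ys for x in xs]
print(xs, len(tiles))
```

Under this layout, lowering tile_size increases the number of tiles per image but reduces the memory needed for each one.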

Tips & Debugging

If you encounter OOM:
- Lower tile_size (e.g. to 256).
- Optionally increase min_tile_size to reduce retries.
If one GPU is stronger:
- Increase primary_share (e.g. 0.6–0.8) so it does more work.
Debugging:
- Watch the ComfyUI console/log for [multiGPU] messages.
- Use nvidia-smi to confirm multiple GPUs are active during upscaling.
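The OOM tips above make more sense with a picture of the retry loop in mind. The following is a minimal sketch, assuming the tile size is halved after each out-of-memory error until it would drop below min_tile_size; the README says only that the size is "automatically reduced", so the halving policy is an assumption. `upscale_with_oom_fallback` and `run_tiled` are hypothetical names, and `MemoryError` stands in for a CUDA out-of-memory error.

```python
def upscale_with_oom_fallback(run_tiled, tile_size=512, min_tile_size=128):
    """Call `run_tiled(size)`, shrinking the tile size after each OOM.

    Returns (result, sizes_tried). `run_tiled` is a hypothetical callable;
    MemoryError is a stand-in for the CUDA OOM error that real PyTorch
    code would catch (torch.cuda.OutOfMemoryError).
    """
    size = max(tile_size, min_tile_size)
    attempts = []
    while size >= min_tile_size:
        attempts.append(size)
        try:
            return run_tiled(size), attempts
        except MemoryError:
            size //= 2  # assumed reduction policy: halve and retry
    raise MemoryError(f"out of memory even at tile size {attempts[-1]}")


def fake_run(size):
    # Pretend the GPU can only handle tiles of 256 pixels or smaller.
    if size > 256:
        raise MemoryError
    return f"upscaled with {size}px tiles"


print(upscale_with_oom_fallback(fake_run, tile_size=512))
print(upscale_with_oom_fallback(fake_run, tile_size=256))
```

Starting at a lower tile_size skips the failed first attempt, and a higher min_tile_size makes the loop give up sooner instead of retrying with ever smaller tiles.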

License

This project is released under the Apache 2.0 License.

See the LICENSE file for details.
