AMD ROCm
To use AMD ROCm with SD.Next you need to:
- Install the ROCm libraries first.
- Run SD.Next with the `--use-rocm` flag to force it to install the appropriate version of `torch`.
Important
AMD ROCm is officially supported for specific AMD GPUs.
Important
Currently, PyTorch support on Windows is not officially maintained by the PyTorch team. See AMD's announcement for more information.
Warning
Unofficial support for other platforms is provided by the community and SD.Next does not guarantee it will work.
Use of any third-party libraries is at your own risk.
- For preview support on Windows platform, see ROCm on Windows section.
- For unofficial support for Windows platform, see ZLUDA page.
Install ROCm:

```shell
sudo apt update
wget https://repo.radeon.com/amdgpu-install/6.4.3/ubuntu/noble/amdgpu-install_6.4.60403-1_all.deb
sudo apt install ./amdgpu-install_6.4.60403-1_all.deb
sudo amdgpu-install --usecase=rocm
sudo usermod -a -G render,video $LOGNAME
```
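The group change from `usermod` only applies to new login sessions, so log out and back in before checking. The sketch below shows how to verify membership; the `groups_output` value is hard-coded for illustration and would come from running `groups "$LOGNAME"` on your system:

```shell
# Illustrative output; on a real system use: groups_output="$(groups "$LOGNAME")"
groups_output="user adm render video"

missing=""
for g in render video; do
  case " $groups_output " in
    *" $g "*) echo "$g: ok" ;;
    *)        missing="$missing $g"; echo "$g: missing (re-run usermod and re-login)" ;;
  esac
done
[ -z "$missing" ] && echo "group membership ok"
```

If either group is reported missing after a fresh login, re-run the `usermod` line above.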
Install git and python:

```shell
sudo apt install git python3 python3-dev python3-venv python3-pip
```

Simply change the wget line from "noble" to "jammy" if using Ubuntu 22.04.
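The release-to-codename substitution can be sketched as a small helper. This assumes the repo.radeon.com URL layout shown in the wget line above, and only covers the two releases mentioned (24.04 "noble" and 22.04 "jammy"):

```shell
# Build the amdgpu-install package URL for a given Ubuntu release.
# Mapping assumed from the instructions above: 24.04 -> noble, 22.04 -> jammy.
ubuntu_version="22.04"   # on a real system: . /etc/os-release; ubuntu_version="$VERSION_ID"
case "$ubuntu_version" in
  24.04) codename=noble ;;
  22.04) codename=jammy ;;
  *) echo "unsupported Ubuntu release: $ubuntu_version" >&2; exit 1 ;;
esac
url="https://repo.radeon.com/amdgpu-install/6.4.3/ubuntu/${codename}/amdgpu-install_6.4.60403-1_all.deb"
echo "$url"
```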
Install prerequisites:

```shell
sudo zypper in python312-devel python312-virtualenv python312-pip patterns-devel-base-devel_basis
```

Add the ROCm repository (not official, but maintained by AMD employees):

```shell
sudo zypper ar obs://science:GPU:ROCm/openSUSE_Factory ROCm
sudo zypper ref # Answer "ultimately trust"
```

Install the relevant packages (there is no pattern, so you must install them all manually):

```shell
sudo zypper in rocm-runtime \
    miopen rccl rocblas amdsmi \
    hipblaslt hiprand hipcub \
    hipsolver hipfft rocm-cmake \
    rocm-compilersupport \
    rocm-llvm-filesystem \
    rocm-clang-runtime-devel \
    hipcub-devel rocm-hip-devel \
    libhipfft0-devel libhipsolver0-devel \
    libhipsparse1-devel rocthrust-devel \
    librocfft0 rocm-core rocrand rocsolver
```

This procedure should also work for Leap-based distributions and Slowroll (change the relevant lines for the distribution; check the repository page for details), but it has not been tested there.

NOTE: This also installs the build dependencies for flash-attention.
Install ROCm and git:

```shell
sudo pacman -S rocm-hip-runtime git
```

Install Python 3.12 (or anything between 3.10 and 3.13):

```shell
sudo pacman -S base-devel python-pip python-virtualenv
git clone https://aur.archlinux.org/python312.git
cd python312
makepkg -si
cd ..
export PYTHON=python3.12
# remove the package builder residuals:
# rm -rf python312
```

Install ROCm SDK:
Note
ROCm SDK is optional. It is only required for building flash attention or similar custom kernels.
ROCm SDK uses 26 GB of disk space.

```shell
sudo pacman -S rocm-hip-sdk libxml2-legacy gcc14 gcc14-libs
```

Open the terminal in the folder where you want to install SD.Next and install SD.Next from GitHub with this command:
```shell
git clone https://github.com/vladmandic/sdnext
```

Then enter the sdnext folder:

```shell
cd sdnext
```

Then run SD.Next with this command:

```shell
./webui.sh --use-rocm
```

Note
It will install the necessary libraries on the first run, so it will take a while depending on your internet speed.
Check out the Docker wiki if you want to build a custom Docker image.
Note
Installing ROCm on the host system is not required when using Docker, as the container cannot access the host installation anyway.
Using Docker with a prebuilt image:

```shell
export SDNEXT_DOCKER_ROOT_FOLDER=~/sdnext
sudo docker run -it \
  --name sdnext-rocm \
  --device /dev/dri \
  --device /dev/kfd \
  -p 7860:7860 \
  -v $SDNEXT_DOCKER_ROOT_FOLDER/app:/app \
  -v $SDNEXT_DOCKER_ROOT_FOLDER/python:/mnt/python \
  -v $SDNEXT_DOCKER_ROOT_FOLDER/data:/mnt/data \
  -v $SDNEXT_DOCKER_ROOT_FOLDER/models:/mnt/models \
  -v $SDNEXT_DOCKER_ROOT_FOLDER/huggingface:/root/.cache/huggingface \
  disty0/sdnext-rocm:latest
```
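Docker normally creates missing bind-mount source directories as root-owned, so you may want to pre-create them as your own user first. A minimal sketch; it uses a temporary directory purely for illustration, in practice set `SDNEXT_DOCKER_ROOT_FOLDER` to e.g. `~/sdnext` as above:

```shell
# Pre-create the host folders that the docker run command bind-mounts.
SDNEXT_DOCKER_ROOT_FOLDER="$(mktemp -d)"   # illustration only; use ~/sdnext in practice
for d in app python data models huggingface; do
  mkdir -p "$SDNEXT_DOCKER_ROOT_FOLDER/$d"
done
ls "$SDNEXT_DOCKER_ROOT_FOLDER"
```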
Note
It will install the necessary libraries on the first run, so it will take a while depending on your internet speed.
The resulting Docker image uses 3.2 GB of disk space (uncompressed), plus 20 GB for the venv.
On first use, when using a resolution for the first time, or when upgrading PyTorch versions, ROCm runs a series of benchmarks to select the most efficient approach. This can lead to slow startup times (up to 5-8 minutes), in particular if you use a refine pass at high resolution, but it only happens once per resolution (subsequent runs are much faster). If this is undesirable, you can set the environment variable MIOPEN_FIND_MODE to FAST. This greatly reduces the startup time on first use, at the price of worse performance when generating.
On the other hand, for best performance during generation (but slower startup on first use), you can set the variable MIOPEN_FIND_ENFORCE to SEARCH.
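Both tuning modes are plain environment variables, so the usual way to apply one is to export it before launching SD.Next. A sketch (the launch line is commented out, since the path depends on where you installed SD.Next):

```shell
# Pick one of the two MIOpen tuning modes described above:
export MIOPEN_FIND_MODE=FAST         # fast first-run startup, somewhat worse generation speed
# export MIOPEN_FIND_ENFORCE=SEARCH  # slow first-run startup, best generation speed
echo "MIOPEN_FIND_MODE=$MIOPEN_FIND_MODE"
# ./webui.sh --use-rocm   # then launch SD.Next from the sdnext folder
```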
If you use the bf16 data type (Settings > Compute Settings > Execution Precision > Device precision type), which is autodetected on RDNA3 and newer cards, VRAM usage may be very high (16+ GB) when decoding the final image and when upscaling with non-latent upscalers. To work around the problem, set Device precision type to fp16 and disable VAE upcasting in Variational Auto Encoder > VAE upcasting.
Setting fp16 also noticeably improves performance.
On RDNA3 hardware (RX 7000 series), you can enable CK Flash Attention for improved performance in Compute Settings > Cross Attention > SDP Options by toggling CK Flash attention and restarting SD.Next. Note that enabling this option requires rocm-hip-sdk to be installed, as it will download and compile an additional Python package from source on startup.
In case you want to install it manually, activate the virtual environment, then run pip:

```shell
pip install --no-build-isolation git+https://github.com/Disty0/flash-attention@navi_rotary_fix
```
- Install Git and Python 3.12.
- Open the terminal in the folder where you want to install SD.Next and install SD.Next from GitHub with this command:

```shell
git clone https://github.com/vladmandic/sdnext
```

- Enter the sdnext folder:

```shell
cd sdnext
```

- Switch to the dev branch:

```shell
git switch dev
```

- Make sure that you are up to date:

```shell
git pull
```

- Run SD.Next with this command:

```shell
./webui.bat --use-rocm
```