This repository contains the implementation of Watermarking Diffusion Language Models, the first watermark tailored for Diffusion Language Models. Our watermark extends Red-Green watermarks, originally designed for Autoregressive Language Models, by applying them in expectation over the context hashes and, leveraging the capabilities of Diffusion Language Models, biasing tokens that lead to hashes making other tokens green.
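As background for the Red-Green family our watermark builds on, here is a minimal, illustrative sketch of the classic detection idea (not this repository's implementation): each token's "color" is derived from a hash of its preceding context, and the detector counts how many tokens land in the green list. The hash function, vocabulary size, and green fraction below are illustrative choices.

```python
import hashlib

VOCAB_SIZE = 50  # toy vocabulary size; real models use the tokenizer vocab
GAMMA = 0.5      # fraction of the vocabulary marked green per context

def is_green(context_token: int, token: int) -> bool:
    """Pseudo-randomly partition the vocabulary based on a context hash."""
    digest = hashlib.sha256(f"{context_token}:{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % VOCAB_SIZE < GAMMA * VOCAB_SIZE

def green_fraction(tokens: list[int]) -> float:
    """Fraction of tokens that are green under their predecessor's hash."""
    hits = sum(is_green(prev, tok) for prev, tok in zip(tokens, tokens[1:]))
    return hits / max(len(tokens) - 1, 1)
```

For detection, the green fraction is compared against the null expectation `GAMMA` via a z-score; watermarked text is biased toward green tokens and thus scores significantly above `GAMMA`. Our watermark applies this bias in expectation over context hashes rather than over a single fixed context ordering.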
- CUDA-compatible GPU with CUDA 12.9
We recommend using uv to install the environment.
- Create a virtual environment:

```bash
uv venv --python 3.12 --seed
source .venv/bin/activate
```

- Install the dependencies:

```bash
uv pip install -r requirements.txt --torch-backend="auto"
```

- Install the main package:

```bash
uv pip install -e .
```

For the KTH watermark baseline, we rely on a custom Rust-based implementation of the detector. It can be installed with:

```bash
uv pip install additional/levenshtein_rust-0.1.0-cp311-cp311-manylinux_2_28_x86_64.whl
```

For more information, refer to `additional/README.md`.
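The KTH detector relies on edit-distance alignment between the generated token sequence and the watermark key sequence, which is why a fast Levenshtein implementation matters. As a rough illustration of the core primitive only (not the Rust implementation shipped in the wheel), a textbook Levenshtein distance over token sequences looks like:

```python
def levenshtein(a: list[int], b: list[int]) -> int:
    """Edit distance between two token sequences (classic dynamic program)."""
    prev = list(range(len(b) + 1))  # distances for the empty prefix of `a`
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (x != y)))   # substitution
        prev = curr
    return prev[-1]
```

The quadratic dynamic program above is the bottleneck at detection time, which motivates offloading it to compiled Rust code.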
While this repository contains all the code needed to reproduce our experiments, the watermark-specific implementation lives in `src/dlm_watermark/watermarks/diffusion_watermark.py`.
To quickly evaluate our watermark, run:

```bash
python scripts/run_config.py --config configs/main/Llada/ourWatermark_llada8b_instruct.yaml
```

We configure the model and watermark through `.yaml` configuration files.
You can find examples of such configurations in `configs/`.
For more information, please refer to src/dlm_watermark/configs.py.
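The configuration schema is defined as Python dataclasses in `src/dlm_watermark/configs.py`. To illustrate the general pattern only, here is a hypothetical sketch; the class and field names below are made up and will differ from the actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class WatermarkConfig:   # hypothetical name and fields, for illustration only
    delta: float = 2.0   # logit bias strength (illustrative)
    gamma: float = 0.5   # green-list fraction (illustrative)

@dataclass
class RunConfig:         # hypothetical name and fields, for illustration only
    model_name: str = "llada-8b-instruct"
    watermark: WatermarkConfig = field(default_factory=WatermarkConfig)

# A parsed YAML file is just a nested dict; building the config from it:
raw = {"model_name": "llada-8b-instruct", "watermark": {"delta": 4.0}}
cfg = RunConfig(model_name=raw["model_name"],
                watermark=WatermarkConfig(**raw["watermark"]))
```

Fields omitted from the YAML fall back to the dataclass defaults, which is what makes the ready-to-run configs in `configs/` compact.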
- `src/dlm_watermark/`: Python package with the watermark implementations, model wrappers, and helpers powering the experiments.
  - `watermarks/`: all watermark algorithms, including the diffusion watermark at `diffusion_watermark.py` and baselines for comparison.
  - `models/`: lightweight adapters around diffusion language models (e.g., Llada, Dream, DreamOn) and shared generation utilities.
  - `quality_evaluations/`: judges and metrics used to measure watermark impact (perplexity, quality scores, etc.).
  - `configs.py`: dataclasses describing the YAML configuration schema used throughout the project.
- `configs/`: ready-to-run YAML configs covering main experiments and ablations.
- `scripts/`: entrypoints for launching experiments, ablations, and evaluation pipelines; bash wrappers in `scripts/bash/` reproduce the paper's main results.
- `data/`: small reference datasets (e.g., WaterBench, infilling prompts) needed for evaluating the watermark.
- `additional/`: optional Rust-based Levenshtein detector for the KTH baseline plus build instructions.
We provide bash scripts in `scripts/bash` to reproduce our main experiments; all other needed scripts are in the `scripts` folder.
@inproceedings{gloaguen2026watermarking,
title={Watermarking Diffusion Language Models},
author={Thibaud Gloaguen and Robin Staab and Nikola Jovanovi{\'c} and Martin Vechev},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=3aBWTYGcaT}
}