Official PyTorch implementation for the paper:

**Thinking Inside the Mask: In-Place Prompting in Diffusion LLMs**

Xiangqi Jin, Yuxuan Wang, Yifeng Gao, Zichen Wen, Biqing Qi, Dongrui Liu, Linfeng Zhang

[arXiv:2508.10736](https://arxiv.org/abs/2508.10736)
ICE transforms prefix-only prompting into in-place prompting for diffusion large language models (dLLMs). By leveraging the bidirectional attention mechanisms and iterative refinement processes of dLLMs, ICE integrates in-place prompts directly within masked token positions during generation and employs a confidence-aware early exit mechanism to significantly reduce computational overhead.
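The mechanism described above can be sketched in a few lines of Python. This is a deliberately toy illustration, not the repository's implementation: `ice_decode`, `propose`, and the `MASK` sentinel are hypothetical stand-ins, and a real dLLM denoiser predicts all masked tokens jointly rather than per position.

```python
# Toy sketch of ICE's two ideas (hypothetical code, not this repo's):
# (1) in-place prompting writes prompt tokens directly into masked slots
#     of the response template instead of prepending them as a prefix;
# (2) confidence-aware early exit stops the refinement loop as soon as
#     every position has been committed above a confidence threshold.

MASK = "<mask>"

def ice_decode(template, in_place_prompt, propose, threshold=0.9, max_steps=10):
    """Iteratively fill masked slots, committing only confident proposals.

    `propose(step, pos)` stands in for one denoising step of a dLLM and
    returns a (token, confidence) pair for a masked position.
    """
    seq = list(template)
    for pos, tok in in_place_prompt.items():   # in-place prompting
        seq[pos] = tok
    steps_used = 0
    for step in range(max_steps):
        masked = [i for i, t in enumerate(seq) if t == MASK]
        if not masked:                         # confidence-aware early exit
            break
        steps_used = step + 1
        for i in masked:
            tok, conf = propose(step, i)
            if conf >= threshold:              # commit only confident tokens
                seq[i] = tok
    return seq, steps_used

# Usage: a toy proposer whose confidence grows with each refinement step,
# so decoding exits after 3 of the 10 allowed steps.
propose = lambda step, i: (f"tok{i}", 0.5 + 0.3 * step)
out, n_steps = ice_decode([MASK] * 5, {2: "The answer is"}, propose)
```

The early exit is what saves compute: once confidences clear the threshold, the remaining refinement steps are skipped entirely.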
- Clone the repository:

```bash
git clone https://github.com/Lueci4er/ICE.git
cd ICE
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

- Configure `run_eval.sh`: set the model path, task, and generation parameters.
- Run the script:

```bash
bash run_eval.sh
```

Step 1: Edit `run_eval.sh` to set the model path:

```bash
MODEL_PATH="${MODEL_PATH:-/path/to/LLaDA-8B-Instruct}"
```

Step 2: Edit `run_eval.sh` to configure the ICE parameters:

```bash
CONFIDENCE_THRESHOLD=0.9
THINKING_STEPS=5
THINKING_PATTERN="uniform"
```

Step 3: In `run_eval.sh`, uncomment a task example (e.g., GSM8K):

```bash
run_experiment \
    --task "gsm8k_cot" \
    --num_fewshot 4 \
    --block_length 256 \
    --gen_length 256 \
    --steps 256 \
    --answer_length 10 \
    --answer_prompt "The answer is" \
    --confidence_threshold 0.9 \
    --thinking_steps 5 \
    --thinking_pattern "uniform"
```

Step 4: Run the evaluation:

```bash
bash run_eval.sh
```

Step 5: Check the results:

```bash
ls results/
```

If you find this work useful, please cite:
```bibtex
@article{jin2025thinking,
  title={Thinking Inside the Mask: In-Place Prompting in Diffusion {LLMs}},
  author={Jin, Xiangqi and Wang, Yuxuan and Gao, Yifeng and Wen, Zichen and Qi, Biqing and Liu, Dongrui and Zhang, Linfeng},
  journal={arXiv preprint arXiv:2508.10736},
  year={2025}
}
```