Closed
Labels
bug (Something isn't working), upstream-bug (We can't do anything but wait.)
Description
[RANK 0] 2025-10-21 13:15:19,710 [INFO] (simpletuner.helpers.models.common) Moving AutoencoderKLWan to accelerator, converting from torch.float32 to torch.bfloat16
[RANK 0] 2025-10-21 13:15:21,156 [ERROR] (validation) Error generating validation image: device meta is invalid, Traceback (most recent call last):
  File "/notebooks/SimpleTuner/simpletuner/helpers/training/validation.py", line 1675, in validate_prompt
    pipeline_result = self.model.pipeline(**pipeline_kwargs)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/notebooks/SimpleTuner/.venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/notebooks/SimpleTuner/simpletuner/helpers/models/wan/pipeline.py", line 1007, in __call__
    return self._call_text_to_video(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/notebooks/SimpleTuner/simpletuner/helpers/models/wan/pipeline.py", line 643, in _call_text_to_video
    noise_pred_uncond = self.transformer(
                        ^^^^^^^^^^^^^^^^^
  File "/notebooks/SimpleTuner/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/notebooks/SimpleTuner/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/notebooks/SimpleTuner/.venv/lib/python3.11/site-packages/diffusers/hooks/hooks.py", line 188, in new_forward
    args, kwargs = function_reference.pre_forward(module, *args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/notebooks/SimpleTuner/.venv/lib/python3.11/site-packages/diffusers/hooks/group_offloading.py", line 304, in pre_forward
    self.group.onload_()
  File "/notebooks/SimpleTuner/.venv/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 1044, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/notebooks/SimpleTuner/.venv/lib/python3.11/site-packages/diffusers/hooks/group_offloading.py", line 260, in onload_
    self._onload_from_disk()
  File "/notebooks/SimpleTuner/.venv/lib/python3.11/site-packages/diffusers/hooks/group_offloading.py", line 189, in _onload_from_disk
    loaded_tensors = safetensors.torch.load_file(self.safetensors_file_path, device=device)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/notebooks/SimpleTuner/.venv/lib/python3.11/site-packages/safetensors/torch.py", line 381, in load_file
    with safe_open(filename, framework="pt", device=device) as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
safetensors_rust.SafetensorError: device meta is invalid
I hit this when trying to use group offload with Wan 2.1 14B T2V.
My understanding of it, especially having hit the same thing in HiDream before:
- the pipeline gets created with the denoiser module missing (for example, when it is only being used to create text embeds)
- the transformer is attached to the pipeline later, when it's loaded
- the pipeline still thinks `device` should be `meta` instead of the actual accelerator
In HiDream, 9d2cac2 fixed it by just overriding `device = self.transformer.device`.
🤔 So maybe the same fix will work here, but it'd be nicer to understand the root issue. I've traced through it before, and I'm not really looking forward to doing so again.
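For reference, a minimal sketch of the kind of override described above: if the pipeline reports `meta` as its device, fall back to the device the transformer actually lives on. The helper names `device_type` and `resolve_execution_device` are hypothetical and not part of SimpleTuner or diffusers; the stand-in objects only mimic the `.type` attribute of `torch.device` so the snippet runs without torch. This is not the actual change made in 9d2cac2, just an illustration of the idea.

```python
from types import SimpleNamespace


def device_type(device):
    """Return the device kind ("meta", "cuda", "cpu", ...).

    Accepts torch.device-like objects (which expose `.type`) as well as
    plain strings such as "cuda:0".
    """
    kind = getattr(device, "type", None)
    if kind is None:
        kind = str(device).split(":", 1)[0]
    return kind


def resolve_execution_device(pipeline_device, transformer_device):
    """Hypothetical helper: prefer the transformer's device when the
    pipeline still reports `meta`.

    Raises if the transformer is on `meta` too, since there is then no
    real device to fall back to.
    """
    if device_type(pipeline_device) != "meta":
        return pipeline_device
    if device_type(transformer_device) == "meta":
        raise ValueError("transformer is also on the meta device")
    return transformer_device


# Stand-ins for torch.device("meta") and torch.device("cuda", 0).
meta = SimpleNamespace(type="meta", index=None)
cuda0 = SimpleNamespace(type="cuda", index=0)

resolved = resolve_execution_device(meta, cuda0)
print(resolved.type)
```

In a pipeline this would amount to something like `device = resolve_execution_device(self._execution_device, self.transformer.device)` before the denoising loop, assuming the pipeline exposes its device that way. It works around the symptom only and does not explain why the pipeline ends up reporting `meta` in the first place.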