
Weights have been created, but training is interrupted by KeyError: 'test_mean_score' #158

@xingye666

Description

Validation epoch 5: 100%|█████████▉| 1685/1686 [02:51<00:00, 10.09it/s]
Error executing job with overrides: ['training.seed=42', 'training.device=cuda:0']
Traceback (most recent call last):
File "/home/diffusion_policy/train.py", line 41, in <module>
main()
File "/usr/local/lib/python3.9/site-packages/hydra/main.py", line 90, in decorated_main
_run_hydra(
File "/usr/local/lib/python3.9/site-packages/hydra/_internal/utils.py", line 389, in _run_hydra
_run_app(
File "/usr/local/lib/python3.9/site-packages/hydra/_internal/utils.py", line 452, in _run_app
run_and_report(
File "/usr/local/lib/python3.9/site-packages/hydra/_internal/utils.py", line 216, in run_and_report
raise ex
File "/usr/local/lib/python3.9/site-packages/hydra/_internal/utils.py", line 213, in run_and_report
return func()
File "/usr/local/lib/python3.9/site-packages/hydra/_internal/utils.py", line 453, in <lambda>
lambda: hydra.run(
File "/usr/local/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 132, in run
_ = ret.return_value
File "/usr/local/lib/python3.9/site-packages/hydra/core/utils.py", line 260, in return_value
raise self._return_value
File "/usr/local/lib/python3.9/site-packages/hydra/core/utils.py", line 186, in run_job
ret.return_value = task_function(task_cfg)
File "/home/diffusion_policy/train.py", line 38, in main
workspace.run()
File "/home/diffusion_policy/diffusion_policy/workspace/train_diffusion_unet_hybrid_workspace.py", line 281, in run
topk_ckpt_path = topk_manager.get_ckpt_path(metric_dict)
File "/home/diffusion_policy/diffusion_policy/common/checkpoint_util.py", line 26, in get_ckpt_path
value = data[self.monitor_key]
KeyError: 'test_mean_score'
...
wandb: train_action_mse_error ▁
wandb: train_action_mse_error 5533.39844
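
The traceback shows `get_ckpt_path` indexing the metric dict directly with `self.monitor_key`, so any step whose metrics lack `test_mean_score` raises the `KeyError`. One plausible cause, which the log alone does not confirm, is that the checkpoint step ran on an epoch where no evaluation rollout had logged that score. The sketch below only illustrates the failure mode and a defensive lookup. The helper `lookup_monitor_value` and both metric dicts are hypothetical and are not part of the diffusion_policy code.

```python
from typing import Dict, Optional


def lookup_monitor_value(metrics: Dict[str, float], monitor_key: str) -> Optional[float]:
    """Hypothetical helper: return the monitored metric, or None if it was not logged."""
    return metrics.get(monitor_key)


# Hypothetical metric dicts: only the step that ran an evaluation rollout
# contains 'test_mean_score'.
with_rollout = {"train_loss": 0.12, "val_loss": 0.15, "test_mean_score": 0.80}
without_rollout = {"train_loss": 0.10, "val_loss": 0.14}

# Plain indexing reproduces the error from the traceback.
try:
    without_rollout["test_mean_score"]
    raised_key_error = False
except KeyError:
    raised_key_error = True

# The guarded lookup lets the caller skip top-k checkpointing for that step.
for name, metrics in [("with_rollout", with_rollout), ("without_rollout", without_rollout)]:
    value = lookup_monitor_value(metrics, "test_mean_score")
    if value is None:
        print(f"{name}: 'test_mean_score' missing, skipping top-k checkpoint")
    else:
        print(f"{name}: test_mean_score={value}")
```

If this is the cause, the alternative to a guard is to make sure the monitored key is produced on every step that saves a top-k checkpoint, for example by aligning the checkpoint and evaluation schedules in the training config.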
