This repository was archived by the owner on Dec 3, 2025. It is now read-only.

CUDA Out of Memory Issue #170

@asriaws

Hi, I'm running Cosmos-Predict2 on an Amazon EC2 instance with L40S Tensor Core GPUs and getting a CUDA out-of-memory error. I tried adding --resolution 480, but no luck. Any help would be appreciated. The nvidia-smi output is at the bottom.

root@ea16e6e83adf:/workspace# python -m examples.video2world --model_size 2B --input_path assets/video2world/input0.jpg --num_conditional_frames 1 --prompt "${PROMPT}" --save_path output/video2world_2b.mp4 --fps 16 --resolution 480
fatal: detected dubious ownership in repository at '/workspace'
To add an exception for this directory, call:

git config --global --add safe.directory /workspace

[09-25 20:05:11|INFO|imaginaire/constants.py:39:print_environment_info] imaginaire.constants: Namespace(checkpoints='checkpoints', text_encoder=<TextEncoderClass.T5: 't5'>)
[09-25 20:05:11|INFO|imaginaire/constants.py:40:print_environment_info] sys.argv: ['/workspace/examples/video2world.py', '--model_size', '2B', '--input_path', 'assets/video2world/input0.jpg', '--num_conditional_frames', '1', '--prompt', "A nighttime city bus terminal gradually shifts from stillness to subtle movement.\nAt first, multiple double-decker buses are parked under the glow of overhead lights, with\na central bus labeled '87D' facing forward and stationary. As the video progresses, the\nbus in the middle moves ahead slowly, its headlights brightening the surrounding area\nand casting reflections onto adjacent vehicles. The motion creates space in the lineup,\nsignaling activity within the otherwise quiet station. It then comes to a smooth stop\nresuming its position in line. Overhead signage in Chinese characters remains illuminated,\nenhancing the vibrant, urban night scene.", '--save_path', 'output/video2world_2b.mp4', '--fps', '16', '--resolution', '480']
[09-25 20:05:11|INFO|imaginaire/constants.py:41:print_environment_info] args: Namespace(model_size='2B', resolution='480', fps=16, dit_path='', load_ema=False, prompt="A nighttime city bus terminal gradually shifts from stillness to subtle movement.\nAt first, multiple double-decker buses are parked under the glow of overhead lights, with\na central bus labeled '87D' facing forward and stationary. As the video progresses, the\nbus in the middle moves ahead slowly, its headlights brightening the surrounding area\nand casting reflections onto adjacent vehicles. The motion creates space in the lineup,\nsignaling activity within the otherwise quiet station. It then comes to a smooth stop\nresuming its position in line. Overhead signage in Chinese characters remains illuminated,\nenhancing the vibrant, urban night scene.", input_path='assets/video2world/input0.jpg', negative_prompt='The video captures a series of frames showing ugly scenes, static with no motion, motion blur, over-saturation, shaky footage, low resolution, grainy texture, pixelated images, poorly lit areas, underexposed and overexposed scenes, poor color balance, washed out colors, choppy sequences, jerky movements, low frame rate, artifacting, color banding, unnatural transitions, outdated special effects, fake elements, unconvincing visuals, poorly edited content, jump cuts, visual noise, and flickering. Overall, the video is of poor quality.', aspect_ratio='16:9', num_conditional_frames=1, batch_input_json=None, guidance=7, seed=0, save_path='output/video2world_2b.mp4', num_gpus=1, disable_guardrail=False, offload_guardrail=False, disable_prompt_refiner=False, offload_prompt_refiner=False, offload_text_encoder=False, downcast_text_encoder=False, benchmark=False, use_cuda_graphs=False, natten=False)
[09-25 20:05:11|INFO|examples/video2world.py:210:setup_pipeline] Using dit_path: checkpoints/nvidia/Cosmos-Predict2-2B-Video2World/model-480p-16fps.pt
[09-25 20:05:11|INFO|imaginaire/utils/misc.py:139:set_random_seed] Using random seed 0.
[09-25 20:05:11|WARNING|imaginaire/lazy_config/lazy.py:441:save_yaml] Config is saved using omegaconf at output/video2world_2b.yaml.
[09-25 20:05:11|INFO|examples/video2world.py:259:setup_pipeline] Initializing Video2WorldPipeline with model size: 2B
[09-25 20:05:11|WARNING|cosmos_predict2/pipelines/video2world.py:292:from_config] precision torch.bfloat16
[09-25 20:05:11|INFO|cosmos_predict2/tokenizers/tokenizer.py:599:_video_vae] Loading checkpoints/nvidia/Cosmos-Predict2-2B-Video2World/tokenizer/tokenizer.pth
[09-25 20:05:11|SUCCESS|cosmos_predict2/tokenizers/tokenizer.py:601:_video_vae] Successfully loaded checkpoints/nvidia/Cosmos-Predict2-2B-Video2World/tokenizer/tokenizer.pth
[09-25 20:06:14|INFO|imaginaire/auxiliary/text_encoder.py:345:init] T5 Text encoder model instantiated
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 24.29it/s]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:01<00:00, 2.33it/s]
Traceback (most recent call last):
  File "/root/.local/share/uv/python/cpython-3.10.18-linux-x86_64-gnu/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/.local/share/uv/python/cpython-3.10.18-linux-x86_64-gnu/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/workspace/examples/video2world.py", line 417, in <module>
    pipe = setup_pipeline(args)
  File "/workspace/examples/video2world.py", line 260, in setup_pipeline
    pipe = Video2WorldPipeline.from_config(
  File "/workspace/cosmos_predict2/pipelines/video2world.py", line 342, in from_config
    pipe.text_guardrail_runner = guardrail_presets.create_text_guardrail_runner(
  File "/workspace/cosmos_predict2/auxiliary/guardrail/common/presets.py", line 33, in create_text_guardrail_runner
    LlamaGuard3(checkpoint_dir=checkpoint_dir, offload_model_to_cpu=offload_model_to_cpu),
  File "/workspace/cosmos_predict2/auxiliary/guardrail/llamaGuard3/llamaGuard3.py", line 55, in __init__
    self.model = self.model.to("cuda", dtype=self.dtype).eval()
  File "/workspace/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3698, in to
    return super().to(*args, **kwargs)
  File "/workspace/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1343, in to
    return self._apply(convert)
  File "/workspace/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 903, in _apply
    module._apply(fn)
  File "/workspace/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 903, in _apply
    module._apply(fn)
  File "/workspace/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 903, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "/workspace/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 930, in _apply
    param_applied = fn(param)
  File "/workspace/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1329, in convert
    return t.to(
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 112.00 MiB. GPU 0 has a total capacity of 44.53 GiB of which 109.62 MiB is free. Process 9012 has 44.20 GiB memory in use. Of the allocated memory 43.44 GiB is allocated by PyTorch, and 287.90 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
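
To make the numbers in that error easier to read, here is a minimal sketch that pulls them out of the message and converts everything to MiB. It uses only the Python standard library; the function names (`to_mib`, `parse_oom`) and the dictionary keys are made up for illustration and are not part of PyTorch or Cosmos-Predict2.

```python
import re

# Numeric part of the error message, copied from the traceback above.
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 112.00 MiB. "
    "GPU 0 has a total capacity of 44.53 GiB of which 109.62 MiB is free. "
    "Process 9012 has 44.20 GiB memory in use. "
    "Of the allocated memory 43.44 GiB is allocated by PyTorch, "
    "and 287.90 MiB is reserved by PyTorch but unallocated."
)

UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}

# Hypothetical field names; each pattern captures a number and its unit.
PATTERNS = {
    "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
    "total": r"total capacity of ([\d.]+) (MiB|GiB)",
    "free": r"of which ([\d.]+) (MiB|GiB) is free",
    "allocated": r"([\d.]+) (MiB|GiB) is allocated by PyTorch",
    "reserved_unallocated": r"([\d.]+) (MiB|GiB) is reserved by PyTorch",
}


def to_mib(value: str, unit: str) -> float:
    """Convert a '<number> <unit>' pair to MiB."""
    return float(value) * UNIT_TO_MIB[unit]


def parse_oom(message: str) -> dict:
    """Extract the memory figures from a PyTorch CUDA OOM message."""
    stats = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"could not find '{name}' in the message")
        stats[name] = to_mib(*match.groups())
    return stats


stats = parse_oom(OOM_MESSAGE)
shortfall = stats["requested"] - stats["free"]
allocated_share = stats["allocated"] / stats["total"]
print(f"requested {stats['requested']:.2f} MiB, free {stats['free']:.2f} MiB")
print(f"shortfall {shortfall:.2f} MiB, PyTorch holds {allocated_share:.1%} of GPU 0")
```

Read this way, the GPU is already almost entirely allocated by PyTorch at the point where the traceback moves LlamaGuard3 to "cuda", and the final 112 MiB request misses by only a couple of MiB. The logged args also show offload_guardrail=False, offload_prompt_refiner=False and offload_text_encoder=False, so the auxiliary models appear to stay on the GPU; whether changing those settings resolves the error is untested here.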


ubuntu@ip-10-0-0-143:~/environment/cosmos-predict2$ nvidia-smi
Thu Sep 25 19:27:23 2025
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.261.03             Driver Version: 535.261.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA L40S                    On  | 00000000:38:00.0 Off |                    0 |
| N/A   32C    P8              25W / 350W |    182MiB / 46068MiB |     11%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA L40S                    On  | 00000000:3A:00.0 Off |                    0 |
| N/A   34C    P8              33W / 350W |     29MiB / 46068MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA L40S                    On  | 00000000:3C:00.0 Off |                    0 |
| N/A   34C    P8              32W / 350W |     29MiB / 46068MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   3  NVIDIA L40S                    On  | 00000000:3E:00.0 Off |                    0 |
| N/A   35C    P8              32W / 350W |     29MiB / 46068MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
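
As a rough cross-check, the Memory-Usage cells from the nvidia-smi output can be converted to GiB with a few lines of standard-library Python. This is only a sketch: the cell strings are copied by hand from the table, and `parse_cell` is a hypothetical helper, not an nvidia-smi or PyTorch API.

```python
import re

# Memory-Usage cells copied from the nvidia-smi output, one per GPU (0 to 3).
MEMORY_CELLS = [
    "182MiB / 46068MiB",
    "29MiB / 46068MiB",
    "29MiB / 46068MiB",
    "29MiB / 46068MiB",
]


def parse_cell(cell: str) -> tuple:
    """Return (used_mib, total_mib) from a '<used>MiB / <total>MiB' cell."""
    match = re.fullmatch(r"\s*(\d+)MiB\s*/\s*(\d+)MiB\s*", cell)
    if match is None:
        raise ValueError(f"unexpected cell format: {cell!r}")
    return int(match.group(1)), int(match.group(2))


usage = [parse_cell(cell) for cell in MEMORY_CELLS]
for index, (used, total) in enumerate(usage):
    free_gib = (total - used) / 1024
    total_gib = total / 1024
    print(f"GPU {index}: {used} MiB used, {free_gib:.2f} GiB free of {total_gib:.2f} GiB")
```

The snapshot is timestamped 19:27, before the 20:05 run in the log, and every GPU shows well under 1 GiB in use. That suggests the memory reported in the error was taken up during the run itself (which used num_gpus=1) rather than by something already resident on the GPUs, though the snapshot alone cannot confirm that.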
