CUDA Out of Memory Issue #170

Hi, I'm running Cosmos-Predict2 on an Amazon EC2 instance with L40S Tensor Core GPUs and getting a CUDA out-of-memory error. I tried adding `--resolution 480`, but no luck. Any help would be appreciated. The `nvidia-smi` output is at the bottom.

`root@ea16e6e83adf:/workspace# python -m examples.video2world     --model_size 2B     --input_path assets/video2world/input0.jpg     --num_conditional_frames 1     --prompt "${PROMPT}"     --save_path output/video2world_2b.mp4     --fps 16     --resolution 480`
fatal: detected dubious ownership in repository at '/workspace'
To add an exception for this directory, call:

	git config --global --add safe.directory /workspace
[09-25 20:05:11|INFO|imaginaire/constants.py:39:print_environment_info] imaginaire.constants: Namespace(checkpoints='checkpoints', text_encoder=<TextEncoderClass.T5: 't5'>)
[09-25 20:05:11|INFO|imaginaire/constants.py:40:print_environment_info] sys.argv: ['/workspace/examples/video2world.py', '--model_size', '2B', '--input_path', 'assets/video2world/input0.jpg', '--num_conditional_frames', '1', '--prompt', "A nighttime city bus terminal gradually shifts from stillness to subtle movement.\nAt first, multiple double-decker buses are parked under the glow of overhead lights, with\na central bus labeled '87D' facing forward and stationary. As the video progresses, the\nbus in the middle moves ahead slowly, its headlights brightening the surrounding area\nand casting reflections onto adjacent vehicles. The motion creates space in the lineup,\nsignaling activity within the otherwise quiet station. It then comes to a smooth stop\nresuming its position in line. Overhead signage in Chinese characters remains illuminated,\nenhancing the vibrant, urban night scene.", '--save_path', 'output/video2world_2b.mp4', '--fps', '16', '--resolution', '480']
[09-25 20:05:11|INFO|imaginaire/constants.py:41:print_environment_info] args: Namespace(model_size='2B', resolution='480', fps=16, dit_path='', load_ema=False, prompt="A nighttime city bus terminal gradually shifts from stillness to subtle movement.\nAt first, multiple double-decker buses are parked under the glow of overhead lights, with\na central bus labeled '87D' facing forward and stationary. As the video progresses, the\nbus in the middle moves ahead slowly, its headlights brightening the surrounding area\nand casting reflections onto adjacent vehicles. The motion creates space in the lineup,\nsignaling activity within the otherwise quiet station. It then comes to a smooth stop\nresuming its position in line. Overhead signage in Chinese characters remains illuminated,\nenhancing the vibrant, urban night scene.", input_path='assets/video2world/input0.jpg', negative_prompt='The video captures a series of frames showing ugly scenes, static with no motion, motion blur, over-saturation, shaky footage, low resolution, grainy texture, pixelated images, poorly lit areas, underexposed and overexposed scenes, poor color balance, washed out colors, choppy sequences, jerky movements, low frame rate, artifacting, color banding, unnatural transitions, outdated special effects, fake elements, unconvincing visuals, poorly edited content, jump cuts, visual noise, and flickering. Overall, the video is of poor quality.', aspect_ratio='16:9', num_conditional_frames=1, batch_input_json=None, guidance=7, seed=0, save_path='output/video2world_2b.mp4', num_gpus=1, disable_guardrail=False, offload_guardrail=False, disable_prompt_refiner=False, offload_prompt_refiner=False, offload_text_encoder=False, downcast_text_encoder=False, benchmark=False, use_cuda_graphs=False, natten=False)
[09-25 20:05:11|INFO|examples/video2world.py:210:setup_pipeline] Using dit_path: checkpoints/nvidia/Cosmos-Predict2-2B-Video2World/model-480p-16fps.pt
[09-25 20:05:11|INFO|imaginaire/utils/misc.py:139:set_random_seed] Using random seed 0.
[09-25 20:05:11|WARNING|imaginaire/lazy_config/lazy.py:441:save_yaml] Config is saved using omegaconf at output/video2world_2b.yaml.
[09-25 20:05:11|INFO|examples/video2world.py:259:setup_pipeline] Initializing Video2WorldPipeline with model size: 2B
[09-25 20:05:11|WARNING|cosmos_predict2/pipelines/video2world.py:292:from_config] precision torch.bfloat16
[09-25 20:05:11|INFO|cosmos_predict2/tokenizers/tokenizer.py:599:_video_vae] Loading checkpoints/nvidia/Cosmos-Predict2-2B-Video2World/tokenizer/tokenizer.pth
[09-25 20:05:11|SUCCESS|cosmos_predict2/tokenizers/tokenizer.py:601:_video_vae] Successfully loaded checkpoints/nvidia/Cosmos-Predict2-2B-Video2World/tokenizer/tokenizer.pth
[09-25 20:06:14|INFO|imaginaire/auxiliary/text_encoder.py:345:__init__] T5 Text encoder model instantiated
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 24.29it/s]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:01<00:00,  2.33it/s]
Traceback (most recent call last):
  File "/root/.local/share/uv/python/cpython-3.10.18-linux-x86_64-gnu/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/.local/share/uv/python/cpython-3.10.18-linux-x86_64-gnu/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/workspace/examples/video2world.py", line 417, in <module>
    pipe = setup_pipeline(args)
  File "/workspace/examples/video2world.py", line 260, in setup_pipeline
    pipe = Video2WorldPipeline.from_config(
  File "/workspace/cosmos_predict2/pipelines/video2world.py", line 342, in from_config
    pipe.text_guardrail_runner = guardrail_presets.create_text_guardrail_runner(
  File "/workspace/cosmos_predict2/auxiliary/guardrail/common/presets.py", line 33, in create_text_guardrail_runner
    LlamaGuard3(checkpoint_dir=checkpoint_dir, offload_model_to_cpu=offload_model_to_cpu),
  File "/workspace/cosmos_predict2/auxiliary/guardrail/llamaGuard3/llamaGuard3.py", line 55, in __init__
    self.model = self.model.to("cuda", dtype=self.dtype).eval()
  File "/workspace/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3698, in to
    return super().to(*args, **kwargs)
  File "/workspace/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1343, in to
    return self._apply(convert)
  File "/workspace/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 903, in _apply
    module._apply(fn)
  File "/workspace/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 903, in _apply
    module._apply(fn)
  File "/workspace/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 903, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "/workspace/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 930, in _apply
    param_applied = fn(param)
  File "/workspace/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1329, in convert
    return t.to(
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 112.00 MiB. GPU 0 has a total capacity of 44.53 GiB of which 109.62 MiB is free. Process 9012 has 44.20 GiB memory in use. Of the allocated memory 43.44 GiB is allocated by PyTorch, and 287.90 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
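
To make the figures in the error message easier to compare, here is a minimal sketch that pulls them out with regular expressions and converts everything to MiB. The `parse_oom` helper and its patterns are hypothetical, written against the exact wording of the message above; they are not part of PyTorch or Cosmos-Predict2.

```python
import re

# Error text copied from the traceback above (numeric part only).
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 112.00 MiB. "
    "GPU 0 has a total capacity of 44.53 GiB of which 109.62 MiB is free. "
    "Process 9012 has 44.20 GiB memory in use. "
    "Of the allocated memory 43.44 GiB is allocated by PyTorch, "
    "and 287.90 MiB is reserved by PyTorch but unallocated."
)

UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}

# Hypothetical patterns, tied to this specific message wording.
PATTERNS = {
    "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
    "total": r"total capacity of ([\d.]+) (MiB|GiB)",
    "free": r"of which ([\d.]+) (MiB|GiB) is free",
    "in_use": r"has ([\d.]+) (MiB|GiB) memory in use",
    "allocated": r"([\d.]+) (MiB|GiB) is allocated by PyTorch",
    "reserved_unallocated": r"([\d.]+) (MiB|GiB) is reserved by PyTorch",
}


def parse_oom(message: str) -> dict[str, float]:
    """Return each figure found in the OOM message, converted to MiB."""
    figures = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"could not find {name!r} in message")
        value, unit = match.groups()
        figures[name] = float(value) * UNIT_TO_MIB[unit]
    return figures


figures = parse_oom(OOM_MESSAGE)
for name, mib in figures.items():
    print(f"{name:>22}: {mib:10.2f} MiB")

# The allocation fails when the request is larger than the free memory.
print("request fits in free memory:", figures["requested"] <= figures["free"])
```

Read this way, the GPU was already almost full of PyTorch allocations by the time the LlamaGuard3 guardrail model was being moved to `cuda`, so the final 112 MiB request had nowhere to go.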

------------------------------------------------------------------------------------------------------------------------------------------

ubuntu@ip-10-0-0-143:~/environment/cosmos-predict2$ nvidia-smi
Thu Sep 25 19:27:23 2025       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.261.03             Driver Version: 535.261.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA L40S                    On  | 00000000:38:00.0 Off |                    0 |
| N/A   32C    P8              25W / 350W |    182MiB / 46068MiB |     11%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA L40S                    On  | 00000000:3A:00.0 Off |                    0 |
| N/A   34C    P8              33W / 350W |     29MiB / 46068MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA L40S                    On  | 00000000:3C:00.0 Off |                    0 |
| N/A   34C    P8              32W / 350W |     29MiB / 46068MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   3  NVIDIA L40S                    On  | 00000000:3E:00.0 Off |                    0 |
| N/A   35C    P8              32W / 350W |     29MiB / 46068MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
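
The args dump in the log lists several offload-related options that were all `False` in this run (`offload_guardrail`, `offload_prompt_refiner`, `offload_text_encoder`), and the traceback fails while the text guardrail is being moved to the GPU. Below is a minimal sketch of the same command with those options switched on, built in Python so the argument list is easy to inspect. It assumes each Namespace field maps to a boolean command-line flag of the same name (for example `--offload_guardrail`); that mapping is inferred from the log, not confirmed against the script, and whether it resolves the OOM is untested.

```python
import os
import shlex

# The prompt comes from the same PROMPT environment variable used in the
# original command; the fallback string is only a placeholder.
prompt = os.environ.get("PROMPT", "<your prompt>")

# Arguments copied from the failing run above.
base_cmd = [
    "python", "-m", "examples.video2world",
    "--model_size", "2B",
    "--input_path", "assets/video2world/input0.jpg",
    "--num_conditional_frames", "1",
    "--prompt", prompt,
    "--save_path", "output/video2world_2b.mp4",
    "--fps", "16",
    "--resolution", "480",
]

# ASSUMPTION: flag names are inferred from the Namespace fields printed in
# the log (offload_guardrail=False, ...). Check them against the script's
# --help output before relying on them.
offload_flags = [
    "--offload_guardrail",
    "--offload_prompt_refiner",
    "--offload_text_encoder",
]

cmd = base_cmd + offload_flags

# Print a shell-quoted version that can be pasted into a terminal.
print(shlex.join(cmd))
```

To run it directly instead of printing it, `subprocess.run(cmd, check=True)` from the `/workspace` directory would execute the same argument list.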

