
Conversation

@gyhandy (Contributor) commented Apr 18, 2025

Implementation of a spatial-temporal robot augmentation pipeline:

  1. Generates spatial-temporal weight matrices from robot segmentation masks.

  2. Provides two augmentation settings:

    • Preserving both the shape and appearance of robots
    • Preserving only the shape of robots
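A minimal sketch of step 1, under stated assumptions: the function name and tensor layout below are illustrative, not this PR's actual API. Per-frame robot segmentation masks become a spatial-temporal weight tensor the control branches can consume.

```python
# Illustrative sketch only (names and layout are assumptions, not the
# PR's real interface): turn per-frame robot masks into weights.
import numpy as np

def masks_to_weights(masks: np.ndarray) -> np.ndarray:
    """masks: (T, H, W) binary array, 1 where the robot is visible.

    Returns a (T, H, W) float32 tensor in [0, 1]: high weight on the
    robot region asks the model to preserve it; low weight elsewhere
    lets the background be re-synthesized.
    """
    return masks.astype(np.float32)

# Assumed split between the two settings listed above: setting 1
# ("shape + appearance") would apply these weights to all control maps
# (vis, edge, depth, seg); setting 2 ("shape only") would apply them
# only to shape-carrying maps (edge, seg), leaving appearance free.
T, H, W = 2, 4, 4
masks = np.zeros((T, H, W), dtype=np.uint8)
masks[:, 1:3, 1:3] = 1          # toy robot region
w = masks_to_weights(masks)
print(w.shape, float(w.max()))  # -> (2, 4, 4) 1.0
```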

@gyhandy gyhandy self-assigned this Apr 18, 2025
@pjannaty pjannaty requested a review from lynetcha-nv April 22, 2025 17:33
@pjannaty (Contributor)

@lynetcha-nv can we please review? @sophiahhuang for vis

@gyhandy (Contributor, Author) commented Apr 23, 2025

Hi @pjannaty, could you please help process or upload the new example videos here (assets/robot_augmentation_example/example1), which are used in the README?

@@ -0,0 +1,8 @@
+{
+  "prompt1": "a robotic grasp an apple from the table and move it to another place.",


Should the vis, edge, depth, and seg control_weight map `.pt` files be included in this JSON, similar to assets/robot_augmentation_example/example1/inference_cosmos_transfer1_robot_spatiotemporal_weights.json?

@gyhandy (Contributor, Author) replied:

This JSON provides examples of candidate prompts as input. To use them, substitute the prompt in assets/robot_augmentation_example/example1/inference_cosmos_transfer1_robot_spatiotemporal_weights.json with one of the candidate prompts from this example1_prompts.json.
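The substitution described above could be scripted roughly as follows; this is a hedged sketch (the function name is made up, and the `"prompt"` / `"prompt1"` key names are taken from this thread's diff, not verified against the repo):

```python
# Hypothetical helper: copy a candidate prompt from example1_prompts.json
# into the "prompt" field of the inference spec JSON.
import json
import os
import tempfile

def substitute_prompt(spec_path, prompts_path, key="prompt1"):
    with open(prompts_path) as f:
        prompts = json.load(f)
    with open(spec_path) as f:
        spec = json.load(f)
    spec["prompt"] = prompts[key]  # overwrite with the chosen candidate
    with open(spec_path, "w") as f:
        json.dump(spec, f, indent=2)
    return spec

# Toy demo with temporary stand-ins for the real asset files.
tmp = tempfile.mkdtemp()
spec_path = os.path.join(tmp, "spec.json")
prompts_path = os.path.join(tmp, "prompts.json")
with open(spec_path, "w") as f:
    json.dump({"prompt": "original prompt"}, f)
with open(prompts_path, "w") as f:
    json.dump({"prompt1": "a robot grasps an apple"}, f)
spec = substitute_prompt(spec_path, prompts_path)
print(spec["prompt"])  # -> a robot grasps an apple
```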


You can run inference multiple times with different prompts (e.g., from `assets/robot_augmentation_example/example1/example1_prompts.json`) to get different augmentation results:
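One way the "run multiple times" step could look on the command line; this is illustrative only, and the inference command itself is a placeholder (the real entry point and flags are documented in the repo README):

```shell
# Illustrative sketch: iterate over the candidate prompt keys in
# example1_prompts.json and run one augmentation per key.
PROMPTS=assets/robot_augmentation_example/example1/example1_prompts.json
for key in prompt1 prompt2 prompt3; do
  echo "augmenting with $key from $PROMPTS"
  # <inference command here>  # see the repo README for the real flags
done
```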


Please include the full inference command to illustrate how this file is used at inference time.

@gyhandy (Contributor, Author) replied:

As mentioned in the comments above, this file is a reference file to provide candidate prompts.

@pjannaty (Contributor)

Thank you @gyhandy. Can we please trigger the pipeline [internal]/nvidia-cosmos/cosmos-transfer1 following the instructions?

@pjannaty (Contributor)

LGTM, thank you @gyhandy!
Tests added; pipeline passes: cosmos-transfer1/-/pipelines/28439374.

@pjannaty pjannaty merged commit d7ce8b7 into main May 14, 2025
4 checks passed
atmguille pushed a commit to atmguille/cosmos-transfer1 that referenced this pull request Jul 16, 2025
* add spatial-temporal weight adding code

* add robot augmentation examples

* update spatial temporal processing code and example

* add readme

* recover

* recover

* add prompts and update readme

* fix comments

* visualization of the mp4

* update readme

* linting
@nanfangxiansheng

@gyhandy Could you please add to the tutorial how you prepare the segmentation folder and a label file like the following?

{
  "(29, 0, 0, 255)": {
    "class": "gripper0_right_r_palm_vis"
  },
  "(31, 0, 0, 255)": {
    "class": "gripper0_right_R_thumb_proximal_base_link_vis"
  },
  "(33, 0, 0, 255)": {
    "class": "gripper0_right_R_thumb_proximal_link_vis"
  }
}
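For reference, a label file in that shape could be generated as below. This is an assumed workflow, not the authors' documented one: the `link_colors` mapping is hypothetical and would come from however each robot link is flat-colored in the segmentation render (e.g. geom names from the simulator).

```python
# Assumed sketch: build the color-to-class label JSON asked about above.
# Keys are the RGBA colors each link is rendered with; values name the
# corresponding robot link. Both are placeholders here.
import json

link_colors = {
    (29, 0, 0, 255): "gripper0_right_r_palm_vis",
    (31, 0, 0, 255): "gripper0_right_R_thumb_proximal_base_link_vis",
}

# str((29, 0, 0, 255)) yields "(29, 0, 0, 255)", matching the key
# format shown in the comment above.
labels = {str(color): {"class": name} for color, name in link_colors.items()}
print(json.dumps(labels, indent=2))
```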
