-
Notifications
You must be signed in to change notification settings - Fork 630
Implement ciflow/rocm #2114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Implement ciflow/rocm #2114
Conversation
Can this label be automatically added? If not, what happens if it is not manually added -- is rocm test skipped? |
If the label is not added manually then the ROCm tests will not run. CUDA tests can still run normally. |
For the context,
So, let's be clear that adding the label automatically is doable, but only in the first case for privilege users. The mechanism to automatically add the label is independent from the change in this PR, so you can do it in a separate PR later. We need:
One note is that when auto label rule is in place, having both the |
| @@ -0,0 +1,2 @@ | |||
| ciflow_push_tags: | |||
| - ciflow/8gpu | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does ciflow_push_tags do?
why it's called 8gpu?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ciflow_push_tags tells the bot which labels should map to git tags that trigger CI workflows.
It's called 8gpu following Huy's suggestion on slack https://pytorch.slack.com/archives/C0A0E8ARXPZ/p1764843858514179
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems adding ciflow/8gpu label in a PR will automatically create a tag ciflow/8gpu/[PR_number]. I say so because I observed https://github.com/pytorch/torchtitan/releases/tag/ciflow%2F8gpu%2F2125
What is this ciflow_push_tags for? Is it for letting the bot know that the above happened and which tag to look at?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is correct. As per my understanding, when we add PR label, it'll automatically create & push tag. Because the ciflow_push_tags config tells the bot to create & push the matching tags so that tag-based CI workflows can run.
tianyu-l
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it true that
- for each PR a codeowner needs to explicitly add the
ciflow/8gpulabel, so that rocm test could be run? - if a codeowner doesn't do this, the PR can be merged without rocm test passing?
- after CUDA test is run / finished, if we add the label, we can run rocm test without rerun the CUDA test?
| @@ -0,0 +1,2 @@ | |||
| ciflow_push_tags: | |||
| - ciflow/8gpu | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems adding ciflow/8gpu label in a PR will automatically create a tag ciflow/8gpu/[PR_number]. I say so because I observed https://github.com/pytorch/torchtitan/releases/tag/ciflow%2F8gpu%2F2125
What is this ciflow_push_tags for? Is it for letting the bot know that the above happened and which tag to look at?
| tags: | ||
| - ciflow/8gpu/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's this for -- why do we need both pull_request and tags?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per my understanding the PR workflows and tag workflows are totally independent. Tags provides CI flow, meaning tags can be pushed to trigger CI runs on specific commits even after the PR is closed. They can also be used for versioning releases.
pull_request are used to run workflows when the PR is open.
…feature to add label automatically.
…RIGGERED_8GPU_LABEL.
| - .ci/docker/** | ||
| - .github/workflows/** | ||
| - scripts/** | ||
| - tests/** | ||
| - torchtitan/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what are these for?
| IS_SCHEDULE: ${{ github.event_name == 'schedule' }} | ||
| IS_PR: ${{ github.event_name == 'pull_request' }} | ||
| HAS_8GPU_LABEL: ${{ github.event_name == 'pull_request' && contains(github.event.pull_request.labels.*.name, 'ciflow/8gpu') }} | ||
| IS_8GPU_TAG: ${{ startsWith(github.ref, 'refs/tags/ciflow/8gpu/') }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need this, if we already have the one above?
In this PR, I implemented ciflow/rocm. The changes are part of integration_test_8gpu_features.yaml. The workflow still supports running on pull_request (without any PR label) for CUDA. However, along with push to main and cron schedule, with the ciflow/8gpu label added to PR, the workflow runs for both CUDA & ROCm.