Find available CPU for running ray AOI pipeline. #146

Open
tcnichol wants to merge 26 commits into main from ray-todd-scratch-2

@tcnichol
Collaborator

I added a few new features to the ray AOI pipeline.

Previously, the pipeline typically used only a small number of CPUs even when more were available. As a first draft, I added code to make sure the pipeline uses the available CPUs. Other changes check the current network configuration. A new method was added to utils.cuda.
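For readers who want a concrete picture of the CPU detection described above, here is a minimal sketch. It is not the code from this PR; the function name `available_cpu_count` and its `reserve` parameter are hypothetical. It assumes that "available" means the CPUs the current process is allowed to run on, which on Linux is the scheduler affinity mask.

```python
import os


def available_cpu_count(reserve: int = 0) -> int:
    """Return how many CPUs this process may use, minus `reserve` (hypothetical helper)."""
    try:
        # Linux only: size of the affinity mask, which reflects limits set by
        # tools such as taskset or a batch scheduler's CPU binding.
        usable = len(os.sched_getaffinity(0))
    except AttributeError:
        # Platforms without sched_getaffinity: fall back to the logical CPU count.
        usable = os.cpu_count() or 1
    # Always report at least one CPU so callers never request zero workers.
    return max(1, usable - reserve)


if __name__ == "__main__":
    print(f"CPUs usable by this process: {available_cpu_count()}")
```

The result could then be passed to Ray, for example as `ray.init(num_cpus=available_cpu_count())`, assuming the pipeline initializes Ray itself rather than attaching to an existing cluster.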

@tcnichol tcnichol requested a review from relativityhd July 23, 2025 17:41
@tcnichol tcnichol changed the title from "Find Proper CPU and network interface for running in ray" to "Find available CPU for running ray AOI pipeline." Jul 23, 2025
@tcnichol
Collaborator Author

Note: when running this, I used the cuda128 environment from the pixi.toml file, which matched our setup at NCSA.

@ray.remote
def init_worker():
    # Set critical CUDA variables before any imports
    os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # Or your device index
Collaborator

Same as above, shouldn't this be done automatically?
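As a sketch of one way to address this: Ray normally sets `CUDA_VISIBLE_DEVICES` itself for tasks and actors that request GPUs (for example with `num_gpus`), so a worker could keep any value that is already present and fall back to a default only when nothing is set. The helper name `ensure_cuda_visible_devices` is hypothetical and not part of this PR.

```python
import os


def ensure_cuda_visible_devices(default: str = "0") -> str:
    """Keep an existing CUDA_VISIBLE_DEVICES; use `default` only if it is unset.

    Hypothetical helper: it avoids overwriting a device assignment made by the
    scheduler, unlike an unconditional os.environ[...] = "0".
    """
    # setdefault returns the existing value when the key is present,
    # otherwise stores `default` and returns it.
    return os.environ.setdefault("CUDA_VISIBLE_DEVICES", default)
```

Whether a fallback is needed at all depends on how the worker is launched; if every GPU task declares its GPU requirement to Ray, the assignment may be unnecessary.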


2 participants