Open
Labels: api, enhancement, webui
Description
In a similar vein to #1815, we want a way to trigger a checkpoint export on demand, either in an emergency or from external tools for automated purposes.
The request can arrive while the training loop is running; the checkpoint itself would be written after the next gradient sync completes.
This allows any in-progress gradient accumulation to finish first, so the exported state always sits on a completed optimizer step.
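A rough sketch of how the deferral could work, not a proposal for the actual implementation. Every name here (`CheckpointRequest`, `train`, `save_checkpoint`, `on_micro_batch`) is hypothetical, and the loop is a stand-in with no real forward/backward pass. An API handler or external tool sets a thread-safe flag, and the loop only acts on it at a sync boundary:

```python
import threading


class CheckpointRequest:
    """Hypothetical thread-safe flag set by an API handler or external tool."""

    def __init__(self):
        self._event = threading.Event()

    def request(self):
        # Safe to call from another thread (e.g. a web request handler).
        self._event.set()

    def consume(self):
        # Called by the training loop only; several requests that arrive
        # before the next sync collapse into a single checkpoint.
        if self._event.is_set():
            self._event.clear()
            return True
        return False


def train(num_micro_batches, grad_accum_steps, request, save_checkpoint,
          on_micro_batch=None):
    """Stand-in training loop that honours checkpoint requests at sync boundaries."""
    global_step = 0
    for micro_batch in range(1, num_micro_batches + 1):
        # Forward/backward for this micro-batch would run here.
        if on_micro_batch is not None:
            on_micro_batch(micro_batch)

        if micro_batch % grad_accum_steps != 0:
            # Still accumulating gradients: leave any pending request untouched.
            continue

        # Gradient sync and optimizer step would run here.
        global_step += 1

        if request.consume():
            save_checkpoint(global_step)
    return global_step


# Simulate a request arriving in the middle of an accumulation window.
saved_steps = []
req = CheckpointRequest()


def simulate_external_trigger(micro_batch):
    if micro_batch == 2:
        req.request()


final_step = train(8, 4, req, saved_steps.append, simulate_external_trigger)
```

In this simulation the request arrives on micro-batch 2 of a 4-step accumulation window, and the checkpoint is deferred until the window closes at the first optimizer step. A flag that the loop polls keeps the save on the training thread, which avoids snapshotting weights partway through accumulation.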