Use-case
Very large data that should be uploaded (pushed) to the DataHUB as early as possible
Limited local storage; git by design duplicates data
E.g., if the raw data on the workstation is 5 TB and the workstation has 9 TB of storage in total, a git commit of all data in one go would not work, because the duplicate copy would bring the footprint to roughly 10 TB
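The storage argument can be checked before committing anything. The following is a minimal sketch, assuming that adding a file costs roughly one extra copy of it in the local object store, and reading the 9 TB in the example as total disk capacity (so about 4 TB remain free next to 5 TB of raw data). The function names are hypothetical.

```python
import shutil

TB = 10**12  # decimal terabyte, in bytes


def free_space(path: str = ".") -> int:
    """Free bytes on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def fits_in_one_commit(raw_bytes: int, free_bytes: int) -> bool:
    """True if a second full copy of the data fits into the free space.

    Assumption: git (or git-lfs) keeps about one extra copy of every added
    file under .git, next to the working copy.
    """
    return raw_bytes <= free_bytes


# Example from the text: 5 TB raw data on a 9 TB disk leaves about 4 TB free.
print(fits_in_one_commit(5 * TB, 4 * TB))
```

For a real check, `free_space("/path/to/arc")` can be passed as `free_bytes`; the path is a placeholder.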
Considerations
File processing location (local vs. remote)
Are the files still needed locally? Or are they already processed and it is enough to keep the derived data locally?
Possible solution
Add / commit / push each file individually (or in small batches)
After push, replace the local ARC with a fresh clone excluding the LFS objects
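One way to obtain a fresh clone without the LFS objects is to set the `GIT_LFS_SKIP_SMUDGE=1` environment variable, which makes git-lfs leave pointer files in the working tree instead of downloading the content. The sketch below only illustrates this idea; the function name, URL and target directory are placeholders, and `dry_run=True` returns the command without running it.

```python
import os
import subprocess


def clone_without_lfs_objects(repo_url: str, target_dir: str, dry_run: bool = True):
    """Clone a repository but keep LFS files as small pointer files."""
    # GIT_LFS_SKIP_SMUDGE=1 tells git-lfs not to download LFS content on checkout.
    env = dict(os.environ, GIT_LFS_SKIP_SMUDGE="1")
    cmd = ["git", "clone", repo_url, target_dir]
    if not dry_run:
        subprocess.run(cmd, env=env, check=True)
    return cmd, env["GIT_LFS_SKIP_SMUDGE"]


# Placeholder URL and directory, shown as a dry run only.
cmd, skip_flag = clone_without_lfs_objects("https://example.org/my-arc.git", "my-arc-light")
print(" ".join(cmd))
```

Individual files can later be fetched on demand with `git lfs pull --include <path>` if they are needed locally again.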
sequenceDiagram
participant LocalData as Local Data folder
participant LocalARC as Local ARC folder
participant DataHUB as DataHUB ARC repository
LocalARC->>DataHUB: Initialize empty ARC
Note over LocalARC: Loop file-by-file upload process
loop For each fileX from 1 to n
LocalData->>LocalARC: Move fileX to ARC
LocalARC->>LocalARC: git [lfs track + add + commit] fileX
LocalARC->>DataHUB: git push
DataHUB->>LocalARC: Option 1: Replace local ARC with LFS-filtered clone
DataHUB->>LocalARC: Option 2: Replace fileX with LFS pointer (lfs migrate export)
end
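The loop in the diagram could be scripted roughly as follows. This is a sketch under assumptions, not a tested ARC workflow: the local ARC is assumed to be an initialized git repository with git-lfs installed and an upstream branch configured, and `dest_subdir`, the commit message and the function names are hypothetical. With `dry_run=True` nothing is moved or executed; the planned commands are only collected.

```python
import shutil
import subprocess
from pathlib import Path


def upload_commands(rel_path: str) -> list:
    """git commands for one file: lfs track, add, commit, push."""
    return [
        ["git", "lfs", "track", rel_path],
        ["git", "add", ".gitattributes", rel_path],
        ["git", "commit", "-m", f"Add {rel_path}"],
        ["git", "push"],
    ]


def upload_file_by_file(data_dir, arc_dir, dest_subdir="dataset", dry_run=True) -> list:
    """Move each file into the ARC and push it before touching the next one."""
    data_dir, arc_dir = Path(data_dir), Path(arc_dir)
    planned = []
    for src in sorted(p for p in data_dir.iterdir() if p.is_file()):
        rel_path = f"{dest_subdir}/{src.name}"
        if not dry_run:
            (arc_dir / dest_subdir).mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(arc_dir / rel_path))
        for cmd in upload_commands(rel_path):
            if not dry_run:
                # check=True stops the loop on the first failing git command
                subprocess.run(cmd, cwd=arc_dir, check=True)
            planned.append(cmd)
    return planned
```

Reclaiming the local space after each push (Option 1 or Option 2 in the diagram) is deliberately left out of this sketch.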
@j-bauer @TetraW