Let libcurl downloads be managed via lockfiles #3735
We want to avoid race conditions on gmt_data_server.txt and gmt_hash_server.txt.
If for whatever reason the gmt_data_server.txt ends up blank, reading will fail. Rather than just complain about it, we now delete the file since we (a) know it is broken and (b) unless it is removed we will not attempt to refresh it for another cycle (by default 24 hours).
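A minimal sketch of the "delete it if it is broken" idea, assuming that an empty file can be detected by its size via POSIX `stat`. The helper name `remove_if_empty` is hypothetical and is not a GMT function; the actual change in gmt_remote.c reacts to a failed read rather than to a size check.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper, not part of GMT: remove `path` if it exists but has
 * zero size, so the next access finds no file and triggers a fresh download
 * instead of waiting for the next refresh cycle.
 * Returns 1 if the file was removed, 0 if it was left alone (missing or
 * non-empty), and -1 if the removal itself failed. */
static int remove_if_empty(const char *path) {
    struct stat st;
    assert(path != NULL);
    if (stat(path, &st) != 0) return 0;   /* No such file: nothing to clean up */
    if (st.st_size > 0) return 0;         /* Has contents: keep it */
    return (remove(path) == 0) ? 1 : -1;  /* Zero size: known to be broken */
}
```

Calling `remove_if_empty("gmt_data_server.txt")` before the age check would let a blank file be treated the same way as a missing one.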
The latest commit (delete gmt_data_server.txt if faulty) should also help, as it will allow a refresh the next time the file is accessed. Note that the gmtserver has apparently powered down due to the hurricane, so you cannot really test on oceania until we resurface.
This works well for me and I have not had any gmt_data_server.txt errors since adding the flock and remove.
seisman left a comment
Looks good to me.
* Let libcurl downloads be managed via lockfiles (#3735)
* Try to use filelock for downloads. We want to avoid race conditions on gmt_data_server.txt and gmt_hash_server.txt.
* Update gmt_remote.c
* Delete a failing gmt_data_server.txt file so it can be regenerated. If for whatever reason the gmt_data_server.txt ends up blank, reading will fail. Rather than just complain about it, we now delete the file since we (a) know it is broken and (b) unless it is removed we will not attempt to refresh it for another cycle (by default 24 hours).
* Exempt URL queries from having lock files (#3768). Since the URL of a query is not a good file name, we do not create an advisory lock file for such downloads. Addresses #3765 hopefully.
* Checked wrong string for URL (#3780). Hopefully fixes #3765.
* Forgot the other place where locking occurred (#3785). Now both places where lockfiles are used avoid URL queries. Closes #3765.
* Must close file before delete it (#3804)

Co-authored-by: Paul Wessel <[email protected]>
Co-authored-by: Joaquim <[email protected]>
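To illustrate the URL-query exemption from #3768, here is a rough sketch that assumes a query can be recognized by a `?` in the URL. Both function names are hypothetical, and the test GMT actually applies may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical check, not GMT's actual test: treat a URL as a query when it
 * contains a '?', since such a string makes a poor lock file name. */
static bool url_is_query(const char *url) {
    assert(url != NULL);
    return strchr(url, '?') != NULL;
}

/* Only plain file URLs get an advisory lock file in this sketch. */
static bool wants_lock_file(const char *url) {
    return !url_is_query(url);
}
```

With this, a download of `http://example.com/data.grd` would be locked, while `http://example.com/get?id=1` would proceed without a lock file.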
Description of proposed changes
When the time is up to refresh the gmt_data_server.txt file, and you run several commands, possibly simultaneously, that all need to access remote files (not necessarily the same file), they all realize that the local copy of gmt_data_server.txt is too old and must be refreshed. So then the mad dash of being the first to do this starts among competing processes. What seems to happen is that they clobber each other and sometimes we are left with a zero-size gmt_data_server.txt file. This file now has a fresh date, so it is no longer downloaded, but it has no contents, so any remote grids will fail to be acquired. This problem seems particularly bad when I run ctest with 20+ cores and that file is due for a refresh.
This PR tries to use the lockfile mechanism we have used for years for the gmt.history file (to protect it from being clobbered when you run process1 | process2, for instance). Except here, the file is acquired inside libcurl. The solution explored in this PR is to start the download sequence as outlined in #3723. So far so good. I did two experiments:
I hope @joa-quim and @seisman can give this branch a spin and see if they discover any issues. It already seems to me it works better than the no-lock release we have. I added WIP so that we don't merge until we test a bit more. Comments welcome.
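For readers unfamiliar with advisory lock files, the idea described above can be sketched roughly as follows. This assumes POSIX `fcntl` record locks and a lock file named after the target with a `.download` suffix. The helper names and the suffix are hypothetical; this is not the code in gmt_remote.c.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: create (if needed) a lock file next to `file` and
 * block until this process holds an exclusive advisory lock on it.
 * The lock file name is written to `lockpath` (capacity `n`).
 * Returns the lock file descriptor, or -1 on failure. */
static int lock_for_download(const char *file, char *lockpath, size_t n) {
    struct flock lk = {0};
    int fd, len;
    assert(file != NULL && lockpath != NULL);
    len = snprintf(lockpath, n, "%s.download", file);
    if (len < 0 || (size_t)len >= n) return -1;   /* Name did not fit */
    fd = open(lockpath, O_RDWR | O_CREAT, 0644);
    if (fd < 0) return -1;
    lk.l_type = F_WRLCK;     /* Exclusive lock */
    lk.l_whence = SEEK_SET;  /* l_start = 0 and l_len = 0 cover the whole file */
    if (fcntl(fd, F_SETLKW, &lk) == -1) {         /* Wait for other processes */
        close(fd);
        return -1;
    }
    return fd;
}

/* Release the lock. The lock file is left on disk in this sketch, because
 * deleting it safely while other processes may be waiting needs extra care. */
static void unlock_after_download(int fd) {
    struct flock lk = {0};
    lk.l_type = F_UNLCK;
    lk.l_whence = SEEK_SET;
    fcntl(fd, F_SETLK, &lk);
    close(fd);
}
```

A process would call `lock_for_download` before handing the URL to libcurl and `unlock_after_download` once the file is written. Re-checking the file's age after obtaining the lock lets the waiting processes skip a download that the first one has already completed.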