-
-
Notifications
You must be signed in to change notification settings - Fork 5.7k
All: Fixes #6517: Prevent Joplin sync ignoring changes when syncing with an external sync service concurrently for file system and WebDAV sync #13054
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…l sync service concurrently
Is thew new setting really necessary? What do you mean exactly by files "not readable by Joplin"? |
Any files present which are not created by Joplin. |
|
Honestly, instead of adding the setting, wouldn't it be better to just throw an error and ask those users delete such items? I mean, as far as I am aware, there is no valid reason to fool around the sync target folder and add unrelated things in there manually. I would assume only a tiny fraction of Joplin users would do such thing anyway... or at least I hope so. 😅 |
|
Didn't want to mess with the existing behaviour too much, as I wouldn't want this to generate more support tickets. To me it made sense to have no change in behaviour by default, which has a safeguard for rookie users. Then advanced users can enable this new setting, in the advanced section of sync settings, so they can have better support for Syncthing and other external sync tools. But we'll see what Laurent thinks. |
|
Thinking about this again, Syncthing also creates its own |
Additional directories in the root Joplin dir are ok. But additional files will be a problem. Possible options to exclude these external files would be to only check for files with names that are 32 chars not including the extension, or only checking files with the .md extension |
I think this would be ideal (because if you put your own MD files inside this particular folder, you really are asking for trouble 😅). |
Looks like there is an existing function which checks both that the file is 32 chars and has an md extension. That is pretty specific, so with that filtering in place I think now it does make the new setting redundant, so I've removed it. I have updated the PR description to reflect this |
…nores-files-old-timestamps
|
I have now found a way to also deal with fixing updates to server objects not getting synced, which involves a toggleable alteration to the basic delta algorithm which is also backwards compatible with the old one, and with existing Joplin clients. So if this gets merged, it will fix the issue in it's entirety 😃 |
|
Thank you for working on this but the fact that it's a rather technical setting means very few people will actually use this, yet the change brings a lot of extra complexity in terms of maintenance. I'm not sure what to do here. I guess it's not possible to enable this by default? |
I also wanted to ask the same question. For clarity, I have been running Joplin with this code for about 11 days now, and everything seems to be working as expected. @mrjo118 Is there any reason or circumstances where one would be advised not to enable this setting? Filesystem sync is broken without it, so it is basically required there. If that is indeed the case, is there any actual reason to even have an option to disable the new behaviour? |
|
@laurent22 @tomasz1986 Functionally I believe that the new algorithm is sound, regardless of whether using a mirrored sync target or a single source of truth. In the case of using a hosted webdav server, the new algorithm is as good as the existing one, as the limitation around the timestamp precision also affects the existing algorithm. My concern is that enabling the new algorithm by default for everyone using file system sync and webdav is going to cause disruption for users with large data sets, because a full rescan is required on initial enablement, in order to obtain the modified timestamps of each file on file system of the server. In particular, for users of file system sync on Android, this could be a very slow process due to severe performance limitations with Android SAF introduced in Android 11+. I do think this would be ok to be enabled by default for new instances of Joplin or new profiles though. @laurent22 what do you think about if I added some logic to default the new setting to true, but explicitly set it to false if file system / webdav sync is already setup on the initial migration of the new db schema? |
|
Just my opinion, but taking all that into consideration, I would still like to suggest to just enable it for everyone (and possibly even remove the setting) regardless of the slow scan after the upgrade (which, if I understand correctly, will be required only once). This is because if the setting is hidden under "advanced configuration", the majority of users won't even know it exists, and the issue will still affect them (and in many cases, without them even realising that something is wrong, leading to conflicted copies or accusing Joplin and its synchronisation of being "unreliable" 😅). |
|
Yes, the rescan is only required once. For file system sync, I would be open to forcing the new sync algorithm if Laurent is ok with that. However for webdav sync there is probably quite a large userbase, the majority of which will receive no gain from this change. Maybe I could enforce the new algorithm if the target is webdav and the host is localhost, otherwise use the existing algorithm? Then either make the new setting apply to file system sync only and default to enabled for new profiles only, or remove the setting entirely and enforce the new algorithm for everyone using file system sync, depending on what @laurent22 is comfortable with |
… file system sync
|
@laurent22 I have now updated the PR as follows:
If you would prefer to remove the setting altogether and force a full re-scan of items on upgrade for file system sync users, please let me know |
|
I would really like to see this merged, because as of now, file-based sync (using Syncthing) often does not manage to sync photos in note attachments. I guess this happens mostly with large attachments and generally not with plain-text notes. This IMHO is a pretty serious issue on an otherwise great note-taking app as sync cannot be trusted to synchronize everything - once a photo has been ignored, no clicking of the Sync button on either Android or desktop Joplin will help - only a full database rebuild causes it to recognize something is missing. |
…files-old-timestamps
…files-old-timestamps
@laurent22 The PR is much bigger now, but it's mostly test units in file-api.test. I've added test coverage for both the existing and new basic delta algorithm to increase confidence (there was zero test coverage for the existing algorithm, so this is an improvement). The logic for the new algorithm has not changed since the first version I implemented, which @tomasz1986 has been using for over a month now via the built binaries which I sent him |
|
I can confirm that I have been testing this PR on 4 desktop PCs and 1 Android device for about a month now, and everything seems to be working as expected (without the need for any scripts that change file timestamps into the future, etc. as is the case without this commit). |
|
How much of a problem is it to do a full sync for existing profiles? I wonder if we could enable it for them too since it's a one-off and anyway if someone is using filesystem sync they should probably enable the feature, even if it means having to wait a bit the first time |
|
(by the way there are a few conflicts) |
The only problem is them having to wait the first time, nothing else. Based on that the change is now targeted to only affect the users who will reap benefit from it, I think it makes sense to remove the configuration setting, so it is enabled for all file system sync users. I will change this and sort out the conflicts. |
|
Changes now done |
Background
When syncing to the file system sync in combination with an external sync service such as Syncthing, new files with a timestamp older than the last sync timestamp may get silently ignored (instead of being pulled into Joplin) if the external sync service syncs at the same time as the Joplin sync.
This is due to the basic delta algorithm's reliance on file timestamps, which presents a problem described well in this post https://discourse.joplinapp.org/t/photos-sometimes-not-syncing-syncthing-android-mac/47083/4:
This PR resolves the issue by introducing a modified sync algorithm for the basic delta. This is now utilised by WebDAV sync where the url is a localhost url, and by file system sync where a new 'Detect changes based on any timestamp change' setting is enabled. This setting is enabled by default for new profiles or Joplin installations, but if the user is already using file sync at the point of upgrading, it will be set to false to avoid a forced re-scan of all items.
When use for the first time, it will need to do a re-scan of all local notes in order to store the file timestamps of all the sync target items. However, once this is done, items will not be re-scanned unless the timestamp of a file in the sync target directory has changed. When using the new algorithm, it is backwards compatible if the setting is turned back off, and is compatible with other clients which do not have the setting enabled or are on older version of Joplin without the new setting. The requirement for a full or partial re-scan on enabling of the setting, also applies when re-enabling the setting if switching back and forth between enabled and disabled and syncing changes between toggling the setting.
This fixes the longstanding issue #6517
Changes
At a basic level it works like so:
Limitations
The only way a change can get ignored by the sync using this new algorithm is if 2 Joplin clients sync conflicting changes to the same file at exactly the same timestamp on the filesystem (this assumes that the sync service will sync the modified time of files across all clients, which is the case for Syncthing). However, the reliability of the new algorithm is constrained by the filesystem of the storage used for the sync target directory. At precision's of 1 ms - 10 ms this is very unlikely to happen, in comparison to the likelihood of changes getting missed when using the old algorithm. At a precision of seconds however (used on some older filesystems; for modification time, the precision is just 2 seconds for FAT32), this may be less reliable. On desktop, this could be easily solved by ensuring that the drive used for the sync target is formatted to a filesystem with a high timestamp precision, but for mobile, using onboard storage may be required to get the best reliability in some cases. However, newer versions of Android support exFAT for external storage, which can have up to 10 ms precision, but this could depend on the implementation of the file storage driver whether it is utilised.
On Windows syncing with a dir on an NTFS formatted drive, when checking the values in the remote_item_updated_time of the sqlite db I could see that the modified_date of the files have a 1 ms precision. On Android 16 on emulator syncing to on board storage, when checking the values in the remote_item_updated_time of the sqlite db I could see that the modified_date of the files have a 2 ms precision. The precision does seem to be OS dependent in addition to file system dependent however, as when I tested on an older Android 10 (real) device, the timestamp accuracy for file system sync to both internal and external (formatted to FAT32) storage was at least 1 second (I expect it was 2 seconds, at least in the case of the FAT32 storage).
I asked AI, and it looks like if you’re on Android 11 or above you can get millisecond precision for the device storage at least. So Android 10 is the last Android version to have that constraint at OS level (which is 1 second timestamp precision). If using WebDAV, the precision of the modified timestamp is also dependant on the WebDAV server being used. For example, when using RoundSync to serve a WebDAV server pointing to a Box.com account, the available precision was only 1 second, compared to 1 millisecond if using file system sync to internal storage on the same device. Further still, if using something like RoundSync or rclone to sync to a third party cloud service, the cloud service may also limit the precision of the modified timestamp.
In spite of the potential for the precision of the timestamp to be limited however, even with a 1 second precision, I suspect this is still likely to be significantly more reliable than using the current algorithm in combination with a mirrored sync target, because the risk of changes not being synced is only if 2 Joplin clients sync the same item at exactly the same second, rather than being dependent on whether a Joplin sync completes on one client before the external sync service has synced changes (not necessarily for the same items) originating from another client.
Testing
Synchronizer: BasicDelta: Report: {"timestamp":1756248482745,"older":27,"newer":0,"equal":1}
I have verified with the new code, the log entry returns a count of only the valid md files in the root directory (split between older, newer and equal). This is because any additional files, including the mandatory info.json, are now excluded from the output of the basic delta. When the new setting is enabled, the count will be distributed differently across older, newer and equal, but the total count is equivalent, and the log entry includes '(enhanced)' text in it
Videos:
These demonstrate syncing between the code in this PR as a dev build, and the Joplin 3.4.5 production build
Part 1:
electron_BGYcgejvt3-00.00.00.000-00.03.08.681-seg1.mp4
Part 2:
electron_BGYcgejvt3-00.03.08.681-00.06.15.446-seg2.mp4