Store hashes of media files, and allow quarantining by hash.#18277
Store hashes of media files, and allow quarantining by hash.#18277reivilibre merged 18 commits intoelement-hq:developfrom
Conversation
reivilibre
left a comment
There was a problem hiding this comment.
nothing structurally wrong with this; glad to take the hash in a streaming fashion so that's good.
Thanks!
reivilibre
left a comment
There was a problem hiding this comment.
seems pretty close now, thanks :)
| count = self._quarantine_local_media_txn(txn, local_mxcs, quarantined_by) | ||
| count += self._quarantine_remote_media_txn(txn, remote_mxcs, quarantined_by) |
There was a problem hiding this comment.
not urgent, but maybe surprising: painfully, this won't quarantine matching media if one is local and the other remote
There was a problem hiding this comment.
I think I'm not understanding this one, did the diff change from underneath you or am I confused?
There was a problem hiding this comment.
Let's say we have a local file L with hash 1234 and a remote file example.org/R with the same hash 1234.
If I request the quarantining of local file L, then you'd hope example.org/R to also be quarantined, but I don't think this happens currently?
There was a problem hiding this comment.
haaah, excellent point. This is the problem I suppose of breaking this up into two steps.
I suppose the ideal here is we break up hash collection into a function, and then quarantining into another function. Let's do that, as we don't often get a chance to touch this code and it's worth doing right.
reivilibre
left a comment
There was a problem hiding this comment.
extra db perf things I noticed. I know it's only an admin API, but still, I guess we may as well use the helpers we have if we can
| row = txn.fetchone() | ||
| if row: | ||
| if row[0]: | ||
| hashes.add(row[0]) | ||
| else: | ||
| non_hashed_media_ids.add((row[1], row[2])) |
There was a problem hiding this comment.
probably intended for row in txn: instead of the fetchone and if.
For readability, also consider unpacking the row e.g. for sha256, media_id, media_origin in txn:
(I think this would also reveal that this current implementation is swapping media_id and media_origin?)
There was a problem hiding this comment.
Using for x,y in txn now :)
| count = self._quarantine_local_media_txn(txn, local_mxcs, quarantined_by) | ||
| count += self._quarantine_remote_media_txn(txn, remote_mxcs, quarantined_by) |
There was a problem hiding this comment.
Let's say we have a local file L with hash 1234 and a remote file example.org/R with the same hash 1234.
If I request the quarantining of local file L, then you'd hope example.org/R to also be quarantined, but I don't think this happens currently?
|
( I think complement needs a 🦶 ) |
reivilibre
left a comment
There was a problem hiding this comment.
Looks good, thank you! :D
Essentially document the change in behaviour in #18277 ### Pull Request Checklist <!-- Please read https://element-hq.github.io/synapse/latest/development/contributing_guide.html before submitting your pull request --> * [x] Pull request is based on the develop branch * [x] Pull request includes a [changelog file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog). The entry should: - Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from `EventStore` to `EventWorkerStore`.". - Use markdown where necessary, mostly for `code blocks`. - End with either a period (.) or an exclamation mark (!). - Start with a capital letter. - Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry. * [x] [Code style](https://element-hq.github.io/synapse/latest/code_style.html) is correct (run the [linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
Builds on NetBSD 10 amd64, and builds/tests-ok on NetBSD 9 amd64 using dependencies from 2025Q2. NB: A security update to synapse is scheduled for July 22. Consult https://matrix.org/blog/2025/07/security-predisclosure/ for further details. Those running synapse in production may wish to update to 1.134.0 to reduce the magnitude of change when updating to the July 22 version (although that will be a big update regardless). Note that the usual pkgsrc pre-commit test is upgrading from the current pkgsrc version and briefly checking operation. Therefore, not upgrading has a theoretical risk of encountering a 1.127.1 to 1.135.0 update bug when 1.127.1 to 134.0 and 1.134.0 to 1.135.0 are ok. # Synapse 1.134.0 (2025-07-15) - Support for [MSC4235](matrix-org/matrix-spec-proposals#4235): `via` query param for hierarchy endpoint. Contributed by Krishan (@kfiven). ([\#18070](element-hq/synapse#18070)) - Add `forget_forced_upon_leave` capability as per [MSC4267](matrix-org/matrix-spec-proposals#4267). ([\#18196](element-hq/synapse#18196)) - Add `federated_user_may_invite` spam checker callback which receives the entire invite event. Contributed by @tulir @ Beeper. ([\#18241](element-hq/synapse#18241)) # Synapse 1.133.0 (2025-07-01) - Add support for the [MSC4260 user report API](matrix-org/matrix-spec-proposals#4260). ([\#18120](element-hq/synapse#18120)) # Synapse 1.132.0 (2025-06-17) - Add support for [MSC4155](matrix-org/matrix-spec-proposals#4155) Invite Filtering. ([\#18288](element-hq/synapse#18288)) - Add experimental `user_may_send_state_event` module API callback. ([\#18455](element-hq/synapse#18455)) - Add experimental `get_media_config_for_user` and `is_user_allowed_to_upload_media_of_size` module API callbacks that allow overriding of media repository maximum upload size. ([\#18457](element-hq/synapse#18457)) - Add experimental `get_ratelimit_override_for_user` module API callback that allows overriding of per-user ratelimits. ([\#18458](element-hq/synapse#18458)) - Pass `room_config` argument to `user_may_create_room` spam checker module callback. ([\#18486](element-hq/synapse#18486)) - Support configuration of default and extra user types. ([\#18456](element-hq/synapse#18456)) - Successful requests to `/_matrix/app/v1/ping` will now force Synapse to reattempt delivering transactions to appservices. ([\#18521](element-hq/synapse#18521)) - Support the import of the `RatelimitOverride` type from `synapse.module_api` in modules and rename `messages_per_second` to `per_second`. ([\#18513](element-hq/synapse#18513)) # Synapse 1.131.0 (2025-06-03) - Add `msc4263_limit_key_queries_to_users_who_share_rooms` config option as per [MSC4263](matrix-org/matrix-spec-proposals#4263). ([\#18180](element-hq/synapse#18180)) - Add option to allow registrations that begin with `_`. Contributed by `_` (@hex5f). ([\#18262](element-hq/synapse#18262)) - Include room ID in response to the [Room Deletion Status Admin API](https://element-hq.github.io/synapse/latest/admin_api/rooms.html#status-of-deleting-rooms). ([\#18318](element-hq/synapse#18318)) - Add support for calling Policy Servers ([MSC4284](matrix-org/matrix-spec-proposals#4284)) to mark events as spam. ([\#18387](element-hq/synapse#18387)) # Synapse 1.130.0 (2025-05-20) - Add an Admin API endpoint `GET /_synapse/admin/v1/scheduled_tasks` to fetch scheduled tasks. ([\#18214](element-hq/synapse#18214)) - Add config option `user_directory.exclude_remote_users` which, when enabled, excludes remote users from user directory search results. ([\#18300](element-hq/synapse#18300)) - Add support for handling `GET /devices/` on workers. ([\#18355](element-hq/synapse#18355)) # Synapse 1.129.0 (2025-05-06) - Add `passthrough_authorization_parameters` in OIDC configuration to allow passing parameters to the authorization grant URL. ([\#18232](element-hq/synapse#18232)) - Add `total_event_count`, `total_message_count`, and `total_e2ee_event_count` fields to the homeserver usage statistics. ([\#18260](element-hq/synapse#18260)) # Synapse 1.128.0 (2025-04-08) - Add an access token introspection cache to make Matrix Authentication Service integration ([MSC3861](matrix-org/matrix-spec-proposals#3861)) more efficient. ([\#18231](element-hq/synapse#18231)) - Add background job to clear unreferenced state groups. ([\#18254](element-hq/synapse#18254)) - Hashes of media files are now tracked by Synapse. Media quarantines will now apply to all files with the same hash. ([\#18277](element-hq/synapse#18277), [\#18302](element-hq/synapse#18302), [\#18296](element-hq/synapse#18296))
chat/matrix-synapse: Update package in anticipation of security fix
Revisions pulled up:
- chat/matrix-synapse/Makefile 1.112
- chat/matrix-synapse/PLIST 1.59
- chat/matrix-synapse/cargo-depends.mk 1.27
- chat/matrix-synapse/distinfo 1.80
---
Module Name: pkgsrc
Committed By: gdt
Date: Thu Jul 17 11:24:44 UTC 2025
Modified Files:
pkgsrc/chat/matrix-synapse: Makefile PLIST cargo-depends.mk distinfo
Log Message:
chat/matrix-synapse: Update to 1.134.0
Builds on NetBSD 10 amd64, and builds/tests-ok on NetBSD 9 amd64 using
dependencies from 2025Q2.
NB: A security update to synapse is scheduled for July 22. Consult
https://matrix.org/blog/2025/07/security-predisclosure/
for further details.
Those running synapse in production may wish to update to 1.134.0 to
reduce the magnitude of change when updating to the July 22 version
(although that will be a big update regardless). Note that the usual
pkgsrc pre-commit test is upgrading from the current pkgsrc version
and briefly checking operation. Therefore, not upgrading has a
theoretical risk of encountering a 1.127.1 to 1.135.0 update bug when
1.127.1 to 134.0 and 1.134.0 to 1.135.0 are ok.
# Synapse 1.134.0 (2025-07-15)
- Support for [MSC4235](matrix-org/matrix-spec-proposals#4235): `via` query param for hierarchy endpoint. Contributed by Krishan (@kfiven).
([\#18070](element-hq/synapse#18070))
- Add `forget_forced_upon_leave` capability as per [MSC4267](matrix-org/matrix-spec-proposals#4267). ([\#18196](element-hq/synapse#18196))
- Add `federated_user_may_invite` spam checker callback which receives the entire invite event. Contributed by @tulir @ Beeper. ([\#18241](element-hq/synapse#18241))
# Synapse 1.133.0 (2025-07-01)
- Add support for the [MSC4260 user report API](matrix-org/matrix-spec-proposals#4260). ([\#18120](element-hq/synapse#18120))
# Synapse 1.132.0 (2025-06-17)
- Add support for [MSC4155](matrix-org/matrix-spec-proposals#4155) Invite Filtering. ([\#18288](element-hq/synapse#18288))
- Add experimental `user_may_send_state_event` module API callback. ([\#18455](element-hq/synapse#18455))
- Add experimental `get_media_config_for_user` and `is_user_allowed_to_upload_media_of_size` module API callbacks that allow overriding of media repository maximum upload size.
([\#18457](element-hq/synapse#18457))
- Add experimental `get_ratelimit_override_for_user` module API callback that allows overriding of per-user ratelimits. ([\#18458](element-hq/synapse#18458))
- Pass `room_config` argument to `user_may_create_room` spam checker module callback. ([\#18486](element-hq/synapse#18486))
- Support configuration of default and extra user types. ([\#18456](element-hq/synapse#18456))
- Successful requests to `/_matrix/app/v1/ping` will now force Synapse to reattempt delivering transactions to appservices. ([\#18521](element-hq/synapse#18521))
- Support the import of the `RatelimitOverride` type from `synapse.module_api` in modules and rename `messages_per_second` to `per_second`.
([\#18513](element-hq/synapse#18513))
# Synapse 1.131.0 (2025-06-03)
- Add `msc4263_limit_key_queries_to_users_who_share_rooms` config option as per [MSC4263](matrix-org/matrix-spec-proposals#4263).
([\#18180](element-hq/synapse#18180))
- Add option to allow registrations that begin with `_`. Contributed by `_` (@hex5f). ([\#18262](element-hq/synapse#18262))
- Include room ID in response to the [Room Deletion Status Admin API](https://element-hq.github.io/synapse/latest/admin_api/rooms.html#status-of-deleting-rooms).
([\#18318](element-hq/synapse#18318))
- Add support for calling Policy Servers ([MSC4284](matrix-org/matrix-spec-proposals#4284)) to mark events as spam.
([\#18387](element-hq/synapse#18387))
# Synapse 1.130.0 (2025-05-20)
- Add an Admin API endpoint `GET /_synapse/admin/v1/scheduled_tasks` to fetch scheduled tasks. ([\#18214](element-hq/synapse#18214))
- Add config option `user_directory.exclude_remote_users` which, when enabled, excludes remote users from user directory search results. ([\#18300](element-hq/synapse#18300))
- Add support for handling `GET /devices/` on workers. ([\#18355](element-hq/synapse#18355))
# Synapse 1.129.0 (2025-05-06)
- Add `passthrough_authorization_parameters` in OIDC configuration to allow passing parameters to the authorization grant URL. ([\#18232](element-hq/synapse#18232))
- Add `total_event_count`, `total_message_count`, and `total_e2ee_event_count` fields to the homeserver usage statistics. ([\#18260](element-hq/synapse#18260))
# Synapse 1.128.0 (2025-04-08)
- Add an access token introspection cache to make Matrix Authentication Service integration ([MSC3861](matrix-org/matrix-spec-proposals#3861)) more efficient.
([\#18231](element-hq/synapse#18231))
- Add background job to clear unreferenced state groups. ([\#18254](element-hq/synapse#18254))
- Hashes of media files are now tracked by Synapse. Media quarantines will now apply to all files with the same hash. ([\#18277](element-hq/synapse#18277),
[\#18302](element-hq/synapse#18302), [\#18296](element-hq/synapse#18296))
chat/matrix-synapse: Update package in anticipation of security fix
Revisions pulled up:
- chat/matrix-synapse/Makefile 1.112
- chat/matrix-synapse/PLIST 1.59
- chat/matrix-synapse/cargo-depends.mk 1.27
- chat/matrix-synapse/distinfo 1.80
---
Module Name: pkgsrc
Committed By: gdt
Date: Thu Jul 17 11:24:44 UTC 2025
Modified Files:
pkgsrc/chat/matrix-synapse: Makefile PLIST cargo-depends.mk distinfo
Log Message:
chat/matrix-synapse: Update to 1.134.0
Builds on NetBSD 10 amd64, and builds/tests-ok on NetBSD 9 amd64 using
dependencies from 2025Q2.
NB: A security update to synapse is scheduled for July 22. Consult
https://matrix.org/blog/2025/07/security-predisclosure/
for further details.
Those running synapse in production may wish to update to 1.134.0 to
reduce the magnitude of change when updating to the July 22 version
(although that will be a big update regardless). Note that the usual
pkgsrc pre-commit test is upgrading from the current pkgsrc version
and briefly checking operation. Therefore, not upgrading has a
theoretical risk of encountering a 1.127.1 to 1.135.0 update bug when
1.127.1 to 134.0 and 1.134.0 to 1.135.0 are ok.
# Synapse 1.134.0 (2025-07-15)
- Support for [MSC4235](matrix-org/matrix-spec-proposals#4235): `via` query param for hierarchy endpoint. Contributed by Krishan (@kfiven).
([\#18070](element-hq/synapse#18070))
- Add `forget_forced_upon_leave` capability as per [MSC4267](matrix-org/matrix-spec-proposals#4267). ([\#18196](element-hq/synapse#18196))
- Add `federated_user_may_invite` spam checker callback which receives the entire invite event. Contributed by @tulir @ Beeper. ([\#18241](element-hq/synapse#18241))
# Synapse 1.133.0 (2025-07-01)
- Add support for the [MSC4260 user report API](matrix-org/matrix-spec-proposals#4260). ([\#18120](element-hq/synapse#18120))
# Synapse 1.132.0 (2025-06-17)
- Add support for [MSC4155](matrix-org/matrix-spec-proposals#4155) Invite Filtering. ([\#18288](element-hq/synapse#18288))
- Add experimental `user_may_send_state_event` module API callback. ([\#18455](element-hq/synapse#18455))
- Add experimental `get_media_config_for_user` and `is_user_allowed_to_upload_media_of_size` module API callbacks that allow overriding of media repository maximum upload size.
([\#18457](element-hq/synapse#18457))
- Add experimental `get_ratelimit_override_for_user` module API callback that allows overriding of per-user ratelimits. ([\#18458](element-hq/synapse#18458))
- Pass `room_config` argument to `user_may_create_room` spam checker module callback. ([\#18486](element-hq/synapse#18486))
- Support configuration of default and extra user types. ([\#18456](element-hq/synapse#18456))
- Successful requests to `/_matrix/app/v1/ping` will now force Synapse to reattempt delivering transactions to appservices. ([\#18521](element-hq/synapse#18521))
- Support the import of the `RatelimitOverride` type from `synapse.module_api` in modules and rename `messages_per_second` to `per_second`.
([\#18513](element-hq/synapse#18513))
# Synapse 1.131.0 (2025-06-03)
- Add `msc4263_limit_key_queries_to_users_who_share_rooms` config option as per [MSC4263](matrix-org/matrix-spec-proposals#4263).
([\#18180](element-hq/synapse#18180))
- Add option to allow registrations that begin with `_`. Contributed by `_` (@hex5f). ([\#18262](element-hq/synapse#18262))
- Include room ID in response to the [Room Deletion Status Admin API](https://element-hq.github.io/synapse/latest/admin_api/rooms.html#status-of-deleting-rooms).
([\#18318](element-hq/synapse#18318))
- Add support for calling Policy Servers ([MSC4284](matrix-org/matrix-spec-proposals#4284)) to mark events as spam.
([\#18387](element-hq/synapse#18387))
# Synapse 1.130.0 (2025-05-20)
- Add an Admin API endpoint `GET /_synapse/admin/v1/scheduled_tasks` to fetch scheduled tasks. ([\#18214](element-hq/synapse#18214))
- Add config option `user_directory.exclude_remote_users` which, when enabled, excludes remote users from user directory search results. ([\#18300](element-hq/synapse#18300))
- Add support for handling `GET /devices/` on workers. ([\#18355](element-hq/synapse#18355))
# Synapse 1.129.0 (2025-05-06)
- Add `passthrough_authorization_parameters` in OIDC configuration to allow passing parameters to the authorization grant URL. ([\#18232](element-hq/synapse#18232))
- Add `total_event_count`, `total_message_count`, and `total_e2ee_event_count` fields to the homeserver usage statistics. ([\#18260](element-hq/synapse#18260))
# Synapse 1.128.0 (2025-04-08)
- Add an access token introspection cache to make Matrix Authentication Service integration ([MSC3861](matrix-org/matrix-spec-proposals#3861)) more efficient.
([\#18231](element-hq/synapse#18231))
- Add background job to clear unreferenced state groups. ([\#18254](element-hq/synapse#18254))
- Hashes of media files are now tracked by Synapse. Media quarantines will now apply to all files with the same hash. ([\#18277](element-hq/synapse#18277),
[\#18302](element-hq/synapse#18302), [\#18296](element-hq/synapse#18296))
chat/matrix-synapse: Update package in anticipation of security fix
Revisions pulled up:
- chat/matrix-synapse/Makefile 1.112
- chat/matrix-synapse/PLIST 1.59
- chat/matrix-synapse/cargo-depends.mk 1.27
- chat/matrix-synapse/distinfo 1.80
---
Module Name: pkgsrc
Committed By: gdt
Date: Thu Jul 17 11:24:44 UTC 2025
Modified Files:
pkgsrc/chat/matrix-synapse: Makefile PLIST cargo-depends.mk distinfo
Log Message:
chat/matrix-synapse: Update to 1.134.0
Builds on NetBSD 10 amd64, and builds/tests-ok on NetBSD 9 amd64 using
dependencies from 2025Q2.
NB: A security update to synapse is scheduled for July 22. Consult
https://matrix.org/blog/2025/07/security-predisclosure/
for further details.
Those running synapse in production may wish to update to 1.134.0 to
reduce the magnitude of change when updating to the July 22 version
(although that will be a big update regardless). Note that the usual
pkgsrc pre-commit test is upgrading from the current pkgsrc version
and briefly checking operation. Therefore, not upgrading has a
theoretical risk of encountering a 1.127.1 to 1.135.0 update bug when
1.127.1 to 134.0 and 1.134.0 to 1.135.0 are ok.
# Synapse 1.134.0 (2025-07-15)
- Support for [MSC4235](matrix-org/matrix-spec-proposals#4235): `via` query param for hierarchy endpoint. Contributed by Krishan (@kfiven).
([\#18070](element-hq/synapse#18070))
- Add `forget_forced_upon_leave` capability as per [MSC4267](matrix-org/matrix-spec-proposals#4267). ([\#18196](element-hq/synapse#18196))
- Add `federated_user_may_invite` spam checker callback which receives the entire invite event. Contributed by @tulir @ Beeper. ([\#18241](element-hq/synapse#18241))
# Synapse 1.133.0 (2025-07-01)
- Add support for the [MSC4260 user report API](matrix-org/matrix-spec-proposals#4260). ([\#18120](element-hq/synapse#18120))
# Synapse 1.132.0 (2025-06-17)
- Add support for [MSC4155](matrix-org/matrix-spec-proposals#4155) Invite Filtering. ([\#18288](element-hq/synapse#18288))
- Add experimental `user_may_send_state_event` module API callback. ([\#18455](element-hq/synapse#18455))
- Add experimental `get_media_config_for_user` and `is_user_allowed_to_upload_media_of_size` module API callbacks that allow overriding of media repository maximum upload size.
([\#18457](element-hq/synapse#18457))
- Add experimental `get_ratelimit_override_for_user` module API callback that allows overriding of per-user ratelimits. ([\#18458](element-hq/synapse#18458))
- Pass `room_config` argument to `user_may_create_room` spam checker module callback. ([\#18486](element-hq/synapse#18486))
- Support configuration of default and extra user types. ([\#18456](element-hq/synapse#18456))
- Successful requests to `/_matrix/app/v1/ping` will now force Synapse to reattempt delivering transactions to appservices. ([\#18521](element-hq/synapse#18521))
- Support the import of the `RatelimitOverride` type from `synapse.module_api` in modules and rename `messages_per_second` to `per_second`.
([\#18513](element-hq/synapse#18513))
# Synapse 1.131.0 (2025-06-03)
- Add `msc4263_limit_key_queries_to_users_who_share_rooms` config option as per [MSC4263](matrix-org/matrix-spec-proposals#4263).
([\#18180](element-hq/synapse#18180))
- Add option to allow registrations that begin with `_`. Contributed by `_` (@hex5f). ([\#18262](element-hq/synapse#18262))
- Include room ID in response to the [Room Deletion Status Admin API](https://element-hq.github.io/synapse/latest/admin_api/rooms.html#status-of-deleting-rooms).
([\#18318](element-hq/synapse#18318))
- Add support for calling Policy Servers ([MSC4284](matrix-org/matrix-spec-proposals#4284)) to mark events as spam.
([\#18387](element-hq/synapse#18387))
# Synapse 1.130.0 (2025-05-20)
- Add an Admin API endpoint `GET /_synapse/admin/v1/scheduled_tasks` to fetch scheduled tasks. ([\#18214](element-hq/synapse#18214))
- Add config option `user_directory.exclude_remote_users` which, when enabled, excludes remote users from user directory search results. ([\#18300](element-hq/synapse#18300))
- Add support for handling `GET /devices/` on workers. ([\#18355](element-hq/synapse#18355))
# Synapse 1.129.0 (2025-05-06)
- Add `passthrough_authorization_parameters` in OIDC configuration to allow passing parameters to the authorization grant URL. ([\#18232](element-hq/synapse#18232))
- Add `total_event_count`, `total_message_count`, and `total_e2ee_event_count` fields to the homeserver usage statistics. ([\#18260](element-hq/synapse#18260))
# Synapse 1.128.0 (2025-04-08)
- Add an access token introspection cache to make Matrix Authentication Service integration ([MSC3861](matrix-org/matrix-spec-proposals#3861)) more efficient.
([\#18231](element-hq/synapse#18231))
- Add background job to clear unreferenced state groups. ([\#18254](element-hq/synapse#18254))
- Hashes of media files are now tracked by Synapse. Media quarantines will now apply to all files with the same hash. ([\#18277](element-hq/synapse#18277),
[\#18302](element-hq/synapse#18302), [\#18296](element-hq/synapse#18296))
This PR makes a few radical changes to media. This now stores the SHA256 hash of each file stored in the database (excluding thumbnails, more on that later). If a set of media is quarantined, any additional uploads of the same file contents or any other files with the same hash will be quarantined at the same time.
Currently this does NOT:
This needs further refinement, but just about works. I'll take this out of draft when it's ready.
Pull Request Checklist
EventStoretoEventWorkerStore.".code blocks.(run the linters)