Add an API to list changes to quarantine state of media #19558
Conversation
Sticky events were used as a reference when creating this.
Just something I noticed while working on #19558: we start the function by setting `total_media_quarantined` to zero, then do work on the `media_ids` and add the number of affected rows, zero it out (**bug**), do work on `hashes` and add the number of affected rows, then return `total_media_quarantined`.

### Pull Request Checklist

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog). The entry should:
  - Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
  - Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
* [x] [Code style](https://element-hq.github.io/synapse/latest/code_style.html) is correct (run the [linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))
```python
async def on_GET(self, request: SynapseRequest) -> tuple[int, JsonDict]:
    await assert_requester_is_admin(self.auth, request)

    from_id = parse_integer(request, "from", default=0)
```
It looks like you can paginate from the beginning of time, which may be desirable.
Someone might only care about catching up on the stream from now, looking forwards. With the current setup, they would have to paginate through years of history or guess stream IDs, which are meant to be opaque (not guessed).
Since this is an admin thing, they could manually look up the current value in the database table 🤷 Perhaps this only requires a little SQL snippet in our documentation showing how to do that lookup.
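As a sketch of what such a documentation snippet could look like (the table and column names here are assumptions based on this PR's schema, not confirmed):

```sql
-- Hypothetical lookup: find the current head of the quarantine-change stream
-- so an admin can start paginating "from now" instead of from zero.
-- Table/column names are assumed from this PR and may differ.
SELECT COALESCE(MAX(stream_id), 0) AS current_stream_id
FROM quarantined_media_changes;
```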
I would normally agree, though unfortunately this PR is tied to a relatively limited availability window, which restricts expanding the scope. A future PR can expand the capabilities of this API as needed.
```python
    ORDER BY media_origin, media_id
    LIMIT ? OFFSET ?
    """,
    (batch_size, last_row_num),
```
I fear that using OFFSET will be extremely slow once this table grows to a large number of rows.
I don't have experience using OFFSET in Postgres, but I have run into problems with MongoDB, for example.
My first instinct would be to paginate by media_id, if order doesn't matter.
We use OFFSET in other background updates, so I'm not worried.
I also think some of those other background updates may be suspect (bad performance). I'll need more opinions or knowledge here.
fwiw, we (the consumers of this API) aren't concerned with it being slow to populate. It can take months if it needs to. If it's so bad performance-wise that it would likely cause a production outage, though, that's a different story.
I think my suspicions about OFFSET are applicable: https://use-the-index-luke.com/no-offset
These queries will slow to a crawl once you start paginating thousands/millions of rows out, and they'll cause undue load on the database.
I'll need some direction on what changes are required here. I'm fine with it being slow as a one-time operation, but I also don't want to wake anyone up at 2am for it.
A potential solution was mentioned in the first reply (paginating by media_id) and is explained in https://use-the-index-luke.com/no-offset as well. Example: `WHERE media_id > ? ORDER BY media_id`, recording the last media_id in the progress.
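To make the keyset approach concrete, here is a minimal sketch using an in-memory SQLite table as a stand-in for `local_media_repository`. Instead of OFFSET, each batch resumes from the last `media_id` seen (which the real background update would record as its progress). The helper name and data are illustrative, not the PR's actual code.

```python
import sqlite3

# Stand-in table: 10 media rows, odd-numbered ones quarantined.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE local_media_repository (media_id TEXT PRIMARY KEY, quarantined_by TEXT)"
)
conn.executemany(
    "INSERT INTO local_media_repository VALUES (?, ?)",
    [(f"media_{i:03d}", "@admin:example.com" if i % 2 else None) for i in range(10)],
)

def fetch_batch(last_media_id: str, batch_size: int) -> list[str]:
    # Keyset pagination: seek past the last key seen rather than skipping rows.
    rows = conn.execute(
        """
        SELECT media_id FROM local_media_repository
        WHERE quarantined_by IS NOT NULL AND media_id > ?
        ORDER BY media_id
        LIMIT ?
        """,
        (last_media_id, batch_size),
    ).fetchall()
    return [r[0] for r in rows]

seen = []
progress = ""  # empty string sorts before every media_id
while batch := fetch_batch(progress, 2):
    seen.extend(batch)
    progress = batch[-1]  # record the keyset cursor, not a row offset

print(seen)  # ['media_001', 'media_003', 'media_005', 'media_007', 'media_009']
```

Each query touches only `batch_size` rows past the cursor, so the cost per batch stays flat as the table grows, unlike OFFSET which re-scans all skipped rows.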
To clarify: is removing OFFSET a requirement for getting this PR merged?
```python
async def _flag_existing_quarantined_media(
    self, progress: JsonDict, batch_size: int
) -> int:
```
Why do we care about filling in historical data for `quarantined_media_changes`?
Resolved without reply
I addressed this in the diff below this thread. We want historical data because we need it for the use case.
This is the closest thing I can find:
`synapse/synapse/storage/databases/main/room.py`, lines 193 to 196 at b5eafbc
Which doesn't explain why we care. Please expand on the use case.
The "why" is that the callers expect to receive this data. I'm not sure how much clearer I can be - please suggest the precise wording you'd like to see.
```sql
FROM remote_media_cache
WHERE quarantined_by IS NOT NULL
ORDER BY media_origin, media_id
```
There isn't a good index to support `ORDER BY media_origin, media_id`:
```
$ psql --username=postgres synapse
synapse=# \d+ local_media_repository
                                Table "public.local_media_repository"
        Column        |  Type   | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
----------------------+---------+-----------+----------+---------+----------+-------------+--------------+-------------
 media_id             | text    |           |          |         | extended |             |              |
 media_type           | text    |           |          |         | extended |             |              |
 media_length         | integer |           |          |         | plain    |             |              |
 created_ts           | bigint  |           |          |         | plain    |             |              |
 upload_name          | text    |           |          |         | extended |             |              |
 user_id              | text    |           |          |         | extended |             |              |
 quarantined_by       | text    |           |          |         | extended |             |              |
 url_cache            | text    |           |          |         | extended |             |              |
 last_access_ts       | bigint  |           |          |         | plain    |             |              |
 safe_from_quarantine | boolean |           | not null | false   | plain    |             |              |
 authenticated        | boolean |           | not null | false   | plain    |             |              |
 sha256               | text    |           |          |         | extended |             |              |
 quarantined_ts       | bigint  |           |          |         | plain    |             |              |
Indexes:
    "local_media_repository_media_id_key" UNIQUE CONSTRAINT, btree (media_id)
    "local_media_repository_sha256" btree (sha256) WHERE sha256 IS NOT NULL
    "local_media_repository_url_idx" btree (created_ts) WHERE url_cache IS NOT NULL
    "users_have_local_media" btree (user_id, created_ts)
Not-null constraints:
    "local_media_repository_safe_from_quarantine_not_null" NOT NULL "safe_from_quarantine"
    "local_media_repository_authenticated_not_null" NOT NULL "authenticated"
```
Indeed. Is this something I'm expected to address?
Yes, although I think we can switch to something better -> #19558 (comment)
Ordering by media_id does not change the query plan. It will still sequentially scan both tables because of `WHERE quarantined_by IS NOT NULL`. Adding an index to that column has historically caused different problems (making media unavailable on matrix.org for long enough that Backend killed the query and backed out the migration).
(query plan not disclosed publicly because it shows private information about quarantined media on matrix.org)
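For anyone wanting to check the behaviour against their own database, a standard EXPLAIN against the backfill's shape of query shows whether a sequential scan is chosen (this is an illustrative sketch; the PR's actual query may differ):

```sql
-- Illustrative only: inspect the plan for the backfill-style query locally.
EXPLAIN (ANALYZE, BUFFERS)
SELECT media_id
FROM local_media_repository
WHERE quarantined_by IS NOT NULL
ORDER BY media_id;
```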
Do we even need to do this? -> #19558 (comment)
> (query plan not disclosed publicly because it shows private information about quarantined media on matrix.org)

Just need to sanitize a few media IDs? What's the real problem?
The query plan doesn't show media IDs, it shows the count of quarantined media.
Cross-link for discussing sensitivity of quarantined media count: #19558 (comment)
For clarity: I'm treating this the same as #19558 (comment) - we, the consumers, aren't concerned with speed here, but if a production incident is likely then we'll fix it here.
```sql
SELECT NULL AS media_origin, media_id
FROM local_media_repository
WHERE quarantined_by IS NOT NULL

UNION

SELECT media_origin, media_id
FROM remote_media_cache
WHERE quarantined_by IS NOT NULL
```
We might as well handle these separately? Does order matter? I assume not.
If there's benefit to doing it separately, sure. I don't believe there is. Ordering can't be done on the queries in a union.
Easier to reason about. Smaller transactions.
This might happen anyway if we start paginating by media_id in each table.
How much does this thread block the PR from merging?
@MadLittleMods this should be ready for review. Apologies if it wasn't clarified earlier, but this PR is tied to a strict timebox on my end - if major changes are needed, we'll have to coordinate between our teams to make that happen. I've resolved the threads I believe are resolved; the unresolved threads appear to need your reply. Due to the number of comments on this PR, I won't see comments in resolved threads - please open new threads rather than un-resolving old ones if something needs my attention.
```python
# return the `from` value.
next_batch = changes[-1].stream_id if len(changes) > 0 else from_id

return HTTPStatus.OK, {"next_batch": next_batch, "rows": rows}
```
Potential better name:
```diff
-return HTTPStatus.OK, {"next_batch": next_batch, "rows": rows}
+return HTTPStatus.OK, {"next_batch": next_batch, "changes": serialized_changes}
```
It's a non-trivial amount of work to rename the JSON field that's returned - how important is it to call it something other than `rows`?
```python
# `from` is exclusive, so don't +1 this. We also know the last record will have
# the highest stream ID, so use that one. If there aren't any records, just
# return the `from` value.
next_batch = changes[-1].stream_id if len(changes) > 0 else from_id
```
```diff
-next_batch = changes[-1].stream_id if len(changes) > 0 else from_id
+next_batch = changes[-1].stream_id if len(changes) > 0 else to_id
```
(comment also needs updating)
Previous discussion, #19558 (comment)
This suggestion has the potential to introduce a bug where records could end up being lost. When there are no changes, `from_id` is a more reliable stream position than whatever `to_id` happens to be.
`to_id` will be the current position of the worker.
If we didn't find any rows between `from_id` and `to_id`, then it's safe to return `to_id`.
If `from_id` is in the future though, then returning `to_id` would move the cursor backwards. That is undesirable.
On a scale of 1 to 10, where 10 is "this cannot merge without this change", where does this thread sit?
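To make the two options in this thread concrete, here is a toy sketch of the token logic (the function name and shapes are illustrative, not the PR's actual code). `changes` stands in for the page of stream IDs found in the window between `from_id` and `to_id`:

```python
def next_batch_token(changes: list[int], from_id: int, to_id: int) -> int:
    """Pick the pagination token to hand back to the caller."""
    if changes:
        # The last record has the highest stream ID, so resume after it.
        return changes[-1]
    # Reviewer's suggestion: an empty page means the whole (from_id, to_id]
    # window was scanned, so it's safe to skip ahead to to_id. The author's
    # version returns from_id instead, which never moves the cursor past data
    # but re-scans the same empty window on the next call.
    return to_id

print(next_batch_token([3, 7, 9], from_id=0, to_id=10))  # 9
print(next_batch_token([], from_id=0, to_id=10))         # 10
```

The trade-off is exactly the one debated above: skipping to `to_id` avoids re-scanning empty windows, while `from_id` is the conservative choice if the two values could ever be out of order.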
```python
# We expect to continue from `from` because we have no rows
self.assertEqual(0, channel.json_body["next_batch"])
```
Language needs adjusting once the code is updated to use `to_id` (see the other discussion).
Fixes #19352
(See issue for history of this feature and previous PRs)
This PR re-introduces the API, building on the previous feedback:
We track both quarantine and unquarantine actions in the stream so that downstream consumers can process the records appropriately - namely, so that our Synapse exchange in HMA can remove hashes for unquarantined media.