fix: provide remote servers a way to find out about an event created during the remote join handshake#19390
FrenchGithubUser wants to merge 2 commits into element-hq:develop from
Conversation
e29abe1 to
688aca5
Compare
|
I am submitting this PR as an employee of Famedly, who has signed the corporate CLA, and used my company email in the commit. |
688aca5 to
9483291
Compare
anoadragon453
left a comment
Hi @FrenchGithubUser. As far as the Element Backend team is aware, Famedly has not yet signed the CCLA. However this is apparently currently in progress.
Holding off on review until that's resolved. Regardless, thank you for submitting your work upstream!
|
famedly has now signed the ccla :) |
|
@FrenchGithubUser I think to allow the CLA bot to let you through, your membership to the famedly organisation must be public. If you don't want that to be the case, I can add you specifically to the list of allowed users, but making the org membership public is easier for us :) |
|
@sandhose I just updated the membership, should be public now! I didn't know this visibility could be changed :) |
|
@FrenchGithubUser could you update the branch (just pull from |
during the remote join handshake
9483291 to
952ebb5
Compare
|
@anoadragon453 done |
anoadragon453
left a comment
Interesting solution - thanks for sending it upstream.
I'd really like to see a Complement test for this if possible, so we can verify that this fixes the problem. I think you'd just need to send an event into the room between the /make_join and /send_join requests.
Existing federated room join tests: https://github.com/matrix-org/complement/blob/main/tests/federation_room_join_test.go
| event, context = await self._on_send_membership_event( | ||
| origin, content, Membership.JOIN, room_id | ||
| ) | ||
| # Collect this now, the internal metadata of event(which should have it) doesn't |
This comment is quite difficult to read.
And why not query this close to where it's used, below?
| if not dummy_event_sent: | ||
| # Did not find a valid user in the room, so remove from future attempts | ||
| # Exclusion is time limited, so the room will be rechecked in the future | ||
| # dependent on _DUMMY_EVENT_ROOM_EXCLUSION_EXPIRY | ||
| logger.info( | ||
| "Failed to send dummy event into room %s. Will exclude it from " | ||
| "future attempts until cache expires", | ||
| room_id, | ||
| ) | ||
| # This mapping is room_id -> time of last attempt(in ms) | ||
| self._rooms_to_exclude_from_dummy_event_insertion[room_id] = ( | ||
| self.clock.time_msec() | ||
| ) |
Is it worth waiting and trying again if the room will move on in the meantime, and new events will likely be sent?
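For context on the question above, here is a minimal sketch of a time-limited exclusion map like the one the quoted hunk describes (room_id -> time of last attempt in ms). The class name, method names, and expiry value are hypothetical; the actual value of `_DUMMY_EVENT_ROOM_EXCLUSION_EXPIRY` is not shown in this hunk, and Synapse's own lookup logic may differ.

```python
from typing import Dict

# Hypothetical expiry for illustration only; not the real constant's value.
EXCLUSION_EXPIRY_MS = 7 * 24 * 60 * 60 * 1000


class DummyEventExclusions:
    """Tracks rooms where sending a dummy event recently failed."""

    def __init__(self, expiry_ms: int = EXCLUSION_EXPIRY_MS) -> None:
        self._expiry_ms = expiry_ms
        # room_id -> time of last failed attempt (in ms)
        self._last_attempt_ms: Dict[str, int] = {}

    def exclude(self, room_id: str, now_ms: int) -> None:
        """Record a failed attempt so the room is skipped for a while."""
        self._last_attempt_ms[room_id] = now_ms

    def is_excluded(self, room_id: str, now_ms: int) -> bool:
        """Return True while the exclusion for this room has not expired."""
        last = self._last_attempt_ms.get(room_id)
        if last is None:
            return False
        if now_ms - last >= self._expiry_ms:
            # Exclusion has lapsed: forget it so the room is rechecked.
            del self._last_attempt_ms[room_id]
            return False
        return True
```

With a map like this, a room is only retried after the expiry has passed, which is the behaviour the question is probing: by then the room may already have moved on.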
|
@anoadragon453 where would you like the complement test to be, in the complement repo or the synapse repo? |
|
@FrenchGithubUser as I believe this would be applicable to any homeserver implementation, the Complement repo itself would be best. Thanks! |
Pull request overview
This PR aims to prevent events created during the /make_join → /send_join handshake from being “missed” by the joining remote server by injecting a dummy event which references the current forward extremities and is proactively federated.
Changes:
- Add a new post-remote-join path to create/send a dummy event with internal_metadata.proactively_send=True.
- On /send_join, detect multiple forward extremities and trigger dummy event injection.
- Add a bugfix changelog entry describing the behavior change.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| synapse/handlers/message.py | Adds _send_dummy_event_after_room_join and makes proactive-send configurable on dummy events. |
| synapse/federation/federation_server.py | After handling /send_join, checks forward extremities and triggers dummy event injection. |
| changelog.d/19390.bugfix | Documents the bugfix for handshake-created events being missed by newly joined servers. |
| This should only be triggered when handling a remote join while there was | ||
| events sent during the make_join/send_join handshake. The joining | ||
| homeserver would otherwise not immediately know to backfill this event, | ||
| and would "miss it". |
| # Check the forward extremities for the room here. If there is more than one, it | ||
| # is likely that another event was created in the room during the | ||
| # make_join/send_join handshake. The joining server is likely to thus miss this event | ||
| # until a second event is created when references it - which could be some time. | ||
| # In that case, we proactively send a dummy extensible event that ties these | ||
| # forward extremities together. The remote server will then attempt to backfill | ||
| # the missing event on its own. | ||
| # | ||
| # By not sending the 'missing event' directly, but instead having the joining | ||
| # homeserver backfill it, the stream ordering for the missing event will be | ||
| # "before" the join (which is what we expect). | ||
|
|
||
| forward_extremities = await self.store._get_forward_extremeties_for_room( | ||
| room_id, stream_ordering_of_join.get_max_stream_pos() | ||
| ) | ||
|
|
||
| if len(forward_extremities) > 1: | ||
| # The likelihood of this being used is extremely low, thus only build the handler | ||
| # when necessary. | ||
| _creation_handler = self.hs.get_event_creation_handler() | ||
| await _creation_handler._send_dummy_event_after_room_join(room_id) | ||
|
|
| @@ -0,0 +1 @@ | |||
| Provide remote servers a way to find out about an event created during the remote join handshake. Contributed by @FrenchGithubUser and @jason-famedly @ Famedly. | |||
| # Collect this now, the internal metadata of event(which should have it) doesn't | ||
| stream_ordering_of_join = ( | ||
| await self.store.get_current_room_stream_token_for_room_id(room_id) |
TLDR
Use a "dummy" event to tie together forward extremities, and proactively send it to all servers in the room. This allows recently joined servers to become aware of recent events that would otherwise have "slipped through the cracks" and thus not be retrievable.
NOTE: While this does send the "dummy" event to all servers in the room, regardless of whether they should care or not, at some point a new event will reference this dummy event and require its retrieval. Since it was proactively sent, this will now not be necessary. This assists in preventing forks in the DAG.
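As an illustration of the idea, here is a toy model of a room DAG as a mapping from event ID to its prev_events. It is a sketch only: the function names and event IDs are made up, and it does not use Synapse's storage or event-creation APIs.

```python
from typing import Dict, List, Set


def forward_extremities(prev_events: Dict[str, List[str]]) -> Set[str]:
    """Events that no other event references as a prev_event."""
    referenced = {p for prevs in prev_events.values() for p in prevs}
    return set(prev_events) - referenced


def tie_together(
    prev_events: Dict[str, List[str]], dummy_id: str = "$dummy"
) -> Dict[str, List[str]]:
    """Add a dummy event referencing every forward extremity, if there are several."""
    tips = forward_extremities(prev_events)
    if len(tips) <= 1:
        # Nothing to tie together; the DAG already has a single tip.
        return prev_events
    merged = dict(prev_events)
    merged[dummy_id] = sorted(tips)
    return merged


# Message A and the join were both created on top of the same event,
# so the room briefly has two forward extremities.
room = {"$create": [], "$message_a": ["$create"], "$join": ["$create"]}
tied = tie_together(room)
```

After `tie_together`, the only forward extremity is the dummy event, and its prev_events name both the join and message A.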
Alternatives
Unlike famedly/synapse#51, which 'pushes' the missing event directly, this causes the event to be 'pulled' by referencing it as a prev_event of a dummy event. Since the 'dummy event' does not get passed to the client, it is effectively invisible.

Drawbacks of famedly/synapse#51 meant it was not always certain whether the 'pushed event' would show up in /sync or in /messages, though it usually was in /sync. This method always has the 'missing event' show up in /messages, which I feel is more technically correct, as that event was (albeit just barely) created before the 'join event' was persisted.

The Process
The order of events:
- make_join from remote server, response sent
- send_join from remote server, response from local server. Message A is not in this (as it is not state and is not referenced in any events that are included). Join event is persisted on local server.
  A. Creates a org.matrix.dummy_event that has prev_events containing both the join and message A.
  B. Sends this dummy event to all servers in the room.
- /send endpoint, saves it in a queue until the partial state join begins syncing additional room state
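
The receiving side of this process can be modelled roughly as follows: when the dummy event arrives, the joining server compares its prev_events against the events it already knows, and whatever is missing is what it would need to pull. This is a simplified sketch with hypothetical names and event IDs, not Synapse's actual backfill logic.

```python
from typing import Iterable, List, Set


def missing_prev_events(
    known_event_ids: Set[str], prev_events: Iterable[str]
) -> List[str]:
    """Return the prev_events the receiving server has not seen, in order."""
    return [event_id for event_id in prev_events if event_id not in known_event_ids]


# State of the joining server right after send_join: it has the join event
# but never heard about message A.
known = {"$create", "$join"}

# The dummy event references both forward extremities.
dummy_prev_events = ["$join", "$message_a"]

to_fetch = missing_prev_events(known, dummy_prev_events)
```

Here `to_fetch` contains only message A, which is the event the joining server would otherwise have missed.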