Conversation
This implements state DAGs, without support for federation. A general overview: - It adds a new room version and new event type. - It adds a new field `calculated_auth_event_ids` to internal metadata. - It stores the state DAG via new state DAG edges / forward extremities tables. - It adds new auth rules as per the MSC. - It uses the new `prev_state_events` field instead of `prev_event_ids()` when doing state resolution.
Builds off #19424 Adds federation compatibility for state DAG rooms. Overview: - Adds extra HTTP API fields as per the MSC. - Adds methods for walking and extracting the state DAG for a room (for `/get_missing_events` and `/send_join` respectively). - Adds impl for processing the federation requests, as well as `/send`.
Co-authored-by: Eric Eastwood <erice@element.io>
Co-authored-by: Eric Eastwood <erice@element.io>
Co-authored-by: Eric Eastwood <erice@element.io>
Co-authored-by: Eric Eastwood <erice@element.io>
Co-authored-by: Eric Eastwood <erice@element.io>
Co-authored-by: Eric Eastwood <erice@element.io>
Co-authored-by: Eric Eastwood <erice@element.io>
Co-authored-by: Eric Eastwood <erice@element.io>
Co-authored-by: Eric Eastwood <erice@element.io>
Spawning from #19424 (comment)
| second_id = self.helper.send(self.room_id, body="test2")["event_id"] | ||
| second_prev_state_event = self._get_prev_state_events(second_id) | ||
| assert len(second_prev_state_event) == 1 | ||
| self.assertEquals(first_prev_state_event, second_prev_state_event) |
There was a problem hiding this comment.
Do we care about the order?
If not, might as well use self.assertIncludes(..., exact=True) and for better assertion diff messages.
There was a problem hiding this comment.
We don't care about the order, but there's always only 1 element in this test so I don't think it will make the message hard to follow.
There was a problem hiding this comment.
Using something like self.assertIncludes(..., exact=True) (or self.assertCountEqual(...)) aligns our assertion with what we care about (we don't care about order). People don't have to ask the question about whether we have to preserve the that part of the assertion through refactors.
| self.assertEqual( | ||
| new_extrems, | ||
| want_new_extrems, | ||
| f"want_new_extrems={want_new_extrems} got={new_extrems}", | ||
| ) |
There was a problem hiding this comment.
self.assertIncludes(..., exact=True) for better diff messages
There was a problem hiding this comment.
I tried doing this but mypy doesn't like it.
tests/storage/test_msc4242_state_dag.py:67: error: Argument 1 to "assertIncludes" of "TestCase" has incompatible type "list[str]"; expected "AbstractSet[Never]" [arg-type]
tests/storage/test_msc4242_state_dag.py:67: error: Argument 2 to "assertIncludes" of "TestCase" has incompatible type "list[str]"; expected "AbstractSet[Never]" [arg-type]
tests/storage/test_msc4242_state_dag.py:83: error: Argument 1 to "assertIncludes" of "TestCase" has incompatible type "list[str]"; expected "AbstractSet[Never]" [arg-type]
tests/storage/test_msc4242_state_dag.py:83: error: Argument 2 to "assertIncludes" of "TestCase" has incompatible type "list[str]"; expected "AbstractSet[Never]" [arg-type]
tests/storage/test_msc4242_state_dag.py:90: error: Argument 1 to "assertIncludes" of "TestCase" has incompatible type "list[str]"; expected "AbstractSet[Never]" [arg-type]
tests/storage/test_msc4242_state_dag.py:90: error: Argument 2 to "assertIncludes" of "TestCase" has incompatible type "list[Any]"; expected "AbstractSet[Never]" [arg-type]
tests/storage/test_msc4242_state_dag.py:103: error: Argument 1 to "assertIncludes" of "TestCase" has incompatible type "list[str]"; expected "AbstractSet[Never]" [arg-type]
tests/storage/test_msc4242_state_dag.py:103: error: Argument 2 to "assertIncludes" of "TestCase" has incompatible type "list[Any]"; expected "AbstractSet[Never]" [arg-type]
tests/storage/test_msc4242_state_dag.py:107: error: Argument 1 to "assertIncludes" of "TestCase" has incompatible type "list[str]"; expected "AbstractSet[Never]" [arg-type]
tests/storage/test_msc4242_state_dag.py:107: error: Argument 2 to "assertIncludes" of "TestCase" has incompatible type "list[Any]"; expected "AbstractSet[Never]" [arg-type]
..because it wants a set not a list. So going back to assertEqual.
There was a problem hiding this comment.
Creating a set from a list is as easy as set(...).
want_new_extrems is already a set and new_extrems is already converted to a set. Should work
synapse/tests/storage/test_msc4242_state_dag.py
Lines 162 to 185 in d59f31a
There was a problem hiding this comment.
Resolved without change
| known_prev_state_event_ids = set(prev_state_events) | ||
| raise AssertionError( | ||
| f"Event {event.event_id} has unknown prev_state_events " | ||
| + f"{len(prev_state_events)} != {len(prev_state_events_ids)} " |
There was a problem hiding this comment.
To avoid some of the jumble of output we can group this together better with a description.
| + f"{len(prev_state_events)} != {len(prev_state_events_ids)} " | |
| + f"({len(prev_state_events)}/{len(prev_state_events_ids)} known)" |
| # Events are always processed in causal order without any gaps in the DAG | ||
| # (prev_state_events are always known), guaranteeing that processed events have a path to the | ||
| # create event. This is an emergent property of state DAGs as asserting that there is a path | ||
| # to the create event every time we insert an event would be prohibitively expensive. | ||
| # This is similar to how doubly-linked lists can potentially not refer to previous items correctly | ||
| # without verifying the list's integrity, but doing it on every insert is too expensive. |
There was a problem hiding this comment.
I think I would also expect this to live in _calculate_new_state_dag_extremities where we deal with missing prev events.
Previous discussion #19424 (comment)
| second_id = self.helper.send(self.room_id, body="test2")["event_id"] | ||
| second_prev_state_event = self._get_prev_state_events(second_id) | ||
| assert len(second_prev_state_event) == 1 | ||
| self.assertEquals(first_prev_state_event, second_prev_state_event) |
There was a problem hiding this comment.
Using something like self.assertIncludes(..., exact=True) (or self.assertCountEqual(...)) aligns our assertion with what we care about (we don't care about order). People don't have to ask the question about whether we have to preserve the that part of the assertion through refactors.
| # TODO(kegan): better way to typecast and get at this field? | ||
| prev_state_events = event.get_dict().get( | ||
| "prev_state_events", None | ||
| ) |
There was a problem hiding this comment.
TODO
Can we use assert isinstance(event, FrozenEventVMSC4242) here?
| # the most recently created state event | ||
| prev_state_event: list[str] | None = ( |
There was a problem hiding this comment.
| # the most recently created state event | |
| prev_state_event: list[str] | None = ( | |
| # This should be the most recently created state event as we create each event | |
| prev_state_events: list[str] | None = ( |
| ) | ||
| return await ret.get_state(self._state_storage_controller, state_filter) | ||
|
|
||
| # TODO: Remove as this is unused |
| ) = await self._calculate_new_forward_extremities_and_state_delta( | ||
| room_id, chunk | ||
| ) | ||
| if room_version and room_version.msc4242_state_dags: |
There was a problem hiding this comment.
Redundant
| if room_version and room_version.msc4242_state_dags: | |
| if room_version.msc4242_state_dags: |
| # The number of forward extremities for each new event. | ||
| msc4242_state_dag_forward_extremities_counter = Histogram( | ||
| "synapse_storage_msc4242_state_dag_forward_extremities_persisted", | ||
| "Number of forward extremities for each new event in the state DAG", | ||
| labelnames=[SERVER_NAME_LABEL], | ||
| buckets=(1, 2, 3, 5, 7, 10, 15, 20, 50, 100, 200, 500, "+Inf"), | ||
| ) |
There was a problem hiding this comment.
Perhaps we should update contrib/grafana/synapse.json with a matching graph (can copy from the existing graphs that use synapse_storage_events_forward_extremities_persisted)
| CREATE INDEX msc4242_state_dag_edges_by_room ON msc4242_state_dag_edges(room_id); | ||
| CREATE UNIQUE INDEX msc4242_state_dag_edges_key ON msc4242_state_dag_edges(room_id, event_id, prev_state_event_id); |
There was a problem hiding this comment.
I think because the second index has room_id as the first column, it will cover the first index already
| ) | ||
| prev_state_events.update(fetched_prev_state_events) | ||
| if len(prev_state_events) != len(prev_state_events_ids): | ||
| # we should have all the prev state events by now, so if we do not, that suggests |
There was a problem hiding this comment.
| # we should have all the prev state events by now, so if we do not, that suggests | |
| # we should have all the `prev_state_events` by now, so if we do not, that suggests |
| prev_state_events.update(fetched_prev_state_events) | ||
| if len(prev_state_events) != len(prev_state_events_ids): | ||
| # we should have all the prev state events by now, so if we do not, that suggests | ||
| # a synapse programming error |
There was a problem hiding this comment.
| # a synapse programming error | |
| # a Synapse programming error |
| self.assertEqual(channel.json_body["errcode"], "M_FORBIDDEN") | ||
|
|
||
| @parameterized.expand([(k,) for k in KNOWN_ROOM_VERSIONS.keys()]) | ||
| # Exclude MSC4242 room versions whilst it lacks federation support |
There was a problem hiding this comment.
| # Exclude MSC4242 room versions whilst it lacks federation support | |
| # FIXME: Exclude MSC4242 room versions whilst it lacks federation support |
| self._test_get_extremities_common(room_version) | ||
|
|
||
| @parameterized.expand([(k,) for k in KNOWN_ROOM_VERSIONS.keys()]) | ||
| # Exclude MSC4242 room versions whilst it lacks federation support |
There was a problem hiding this comment.
| # Exclude MSC4242 room versions whilst it lacks federation support | |
| # FIXME: Exclude MSC4242 room versions whilst it lacks federation support |
This implements MSC4242: State DAGs, without support for federation.
A general overview:
calculated_auth_event_idsto internal metadata.prev_state_eventsfield instead ofprev_event_ids()when doing state resolution.Complement tests: matrix-org/complement#841
Pull Request Checklist
EventStoretoEventWorkerStore.".code blocks.