Skip to content

MSC4242: State DAGs (CSAPI)#19424

Open
kegsay wants to merge 28 commits intodevelopfrom
kegan/4242-csapi
Open

MSC4242: State DAGs (CSAPI)#19424
kegsay wants to merge 28 commits intodevelopfrom
kegan/4242-csapi

Conversation

@kegsay
Copy link
Copy Markdown
Contributor

@kegsay kegsay commented Feb 3, 2026

This implements MSC4242: State DAGs, without support for federation.

A general overview:

  • It adds a new room version and new event type.
  • It adds a new field calculated_auth_event_ids to internal metadata.
  • It stores the state DAG via new state DAG edges / forward extremities tables.
  • It adds new auth rules as per the MSC.
  • It uses the new prev_state_events field instead of prev_event_ids() when doing state resolution.

Complement tests: matrix-org/complement#841

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Code style is correct (run the linters)

This implements state DAGs, without support for federation.

A general overview:
 - It adds a new room version and new event type.
 - It adds a new field `calculated_auth_event_ids` to internal metadata.
 - It stores the state DAG via new state DAG edges / forward extremities tables.
 - It adds new auth rules as per the MSC.
 - It uses the new `prev_state_events` field instead of `prev_event_ids()` when doing state resolution.
@kegsay kegsay requested a review from a team as a code owner February 3, 2026 09:33
kegsay added a commit that referenced this pull request Feb 3, 2026
Builds off #19424

Adds federation compatibility for state DAG rooms. Overview:
 - Adds extra HTTP API fields as per the MSC.
 - Adds methods for walking and extracting the state DAG for a room (for `/get_missing_events` and `/send_join` respectively).
 - Adds impl for processing the federation requests, as well as `/send`.
@kegsay kegsay mentioned this pull request Feb 3, 2026
3 tasks
kegsay and others added 3 commits March 10, 2026 14:49
Co-authored-by: Eric Eastwood <erice@element.io>
Co-authored-by: Eric Eastwood <erice@element.io>
kegsay and others added 6 commits March 10, 2026 15:03
second_id = self.helper.send(self.room_id, body="test2")["event_id"]
second_prev_state_event = self._get_prev_state_events(second_id)
assert len(second_prev_state_event) == 1
self.assertEquals(first_prev_state_event, second_prev_state_event)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we care about the order?

If not, might as well use self.assertIncludes(..., exact=True) and for better assertion diff messages.

Copy link
Copy Markdown
Contributor Author

@kegsay kegsay Mar 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't care about the order, but there's always only 1 element in this test so I don't think it will make the message hard to follow.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using something like self.assertIncludes(..., exact=True) (or self.assertCountEqual(...)) aligns our assertion with what we care about (we don't care about order). People don't have to ask the question about whether we have to preserve the that part of the assertion through refactors.

Comment on lines +189 to +193
self.assertEqual(
new_extrems,
want_new_extrems,
f"want_new_extrems={want_new_extrems} got={new_extrems}",
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.assertIncludes(..., exact=True) for better diff messages

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried doing this but mypy doesn't like it.

tests/storage/test_msc4242_state_dag.py:67: error: Argument 1 to "assertIncludes" of "TestCase" has incompatible type "list[str]"; expected "AbstractSet[Never]"  [arg-type]
tests/storage/test_msc4242_state_dag.py:67: error: Argument 2 to "assertIncludes" of "TestCase" has incompatible type "list[str]"; expected "AbstractSet[Never]"  [arg-type]
tests/storage/test_msc4242_state_dag.py:83: error: Argument 1 to "assertIncludes" of "TestCase" has incompatible type "list[str]"; expected "AbstractSet[Never]"  [arg-type]
tests/storage/test_msc4242_state_dag.py:83: error: Argument 2 to "assertIncludes" of "TestCase" has incompatible type "list[str]"; expected "AbstractSet[Never]"  [arg-type]
tests/storage/test_msc4242_state_dag.py:90: error: Argument 1 to "assertIncludes" of "TestCase" has incompatible type "list[str]"; expected "AbstractSet[Never]"  [arg-type]
tests/storage/test_msc4242_state_dag.py:90: error: Argument 2 to "assertIncludes" of "TestCase" has incompatible type "list[Any]"; expected "AbstractSet[Never]"  [arg-type]
tests/storage/test_msc4242_state_dag.py:103: error: Argument 1 to "assertIncludes" of "TestCase" has incompatible type "list[str]"; expected "AbstractSet[Never]"  [arg-type]
tests/storage/test_msc4242_state_dag.py:103: error: Argument 2 to "assertIncludes" of "TestCase" has incompatible type "list[Any]"; expected "AbstractSet[Never]"  [arg-type]
tests/storage/test_msc4242_state_dag.py:107: error: Argument 1 to "assertIncludes" of "TestCase" has incompatible type "list[str]"; expected "AbstractSet[Never]"  [arg-type]
tests/storage/test_msc4242_state_dag.py:107: error: Argument 2 to "assertIncludes" of "TestCase" has incompatible type "list[Any]"; expected "AbstractSet[Never]"  [arg-type]

..because it wants a set not a list. So going back to assertEqual.

Copy link
Copy Markdown
Contributor

@MadLittleMods MadLittleMods Mar 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating a set from a list is as easy as set(...).

want_new_extrems is already a set and new_extrems is already converted to a set. Should work

def _test(
self,
current_fwds: list[str],
new_events: list[tuple[FrozenEventVMSC4242, EventContext]],
want_new_extrems: set[str],
want_raises: bool = False,
) -> None:
"""
Tests the logic of _calculate_new_state_dag_extremities.
Tests that the new extremities calculated as a result of processing current_fwds and new_events
matches want_new_extrems or raises if want_raises is True.
"""
coroutine = self.persistence._calculate_new_state_dag_extremities(
self.room_id,
frozenset(current_fwds),
new_events,
)
if want_raises:
f = self.get_failure(coroutine, SynapseError)
assert f is not None
return
new_extrems = set(self.get_success(coroutine))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved without change

@kegsay kegsay requested a review from MadLittleMods March 18, 2026 07:58
MadLittleMods added a commit that referenced this pull request Mar 18, 2026
known_prev_state_event_ids = set(prev_state_events)
raise AssertionError(
f"Event {event.event_id} has unknown prev_state_events "
+ f"{len(prev_state_events)} != {len(prev_state_events_ids)} "
Copy link
Copy Markdown
Contributor

@MadLittleMods MadLittleMods Mar 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid some of the jumble of output we can group this together better with a description.

Suggested change
+ f"{len(prev_state_events)} != {len(prev_state_events_ids)} "
+ f"({len(prev_state_events)}/{len(prev_state_events_ids)} known)"

Comment on lines +117 to +122
# Events are always processed in causal order without any gaps in the DAG
# (prev_state_events are always known), guaranteeing that processed events have a path to the
# create event. This is an emergent property of state DAGs as asserting that there is a path
# to the create event every time we insert an event would be prohibitively expensive.
# This is similar to how doubly-linked lists can potentially not refer to previous items correctly
# without verifying the list's integrity, but doing it on every insert is too expensive.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would also expect this to live in _calculate_new_state_dag_extremities where we deal with missing prev events.

Previous discussion #19424 (comment)

second_id = self.helper.send(self.room_id, body="test2")["event_id"]
second_prev_state_event = self._get_prev_state_events(second_id)
assert len(second_prev_state_event) == 1
self.assertEquals(first_prev_state_event, second_prev_state_event)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using something like self.assertIncludes(..., exact=True) (or self.assertCountEqual(...)) aligns our assertion with what we care about (we don't care about order). People don't have to ask the question about whether we have to preserve the that part of the assertion through refactors.

Comment on lines +490 to +493
# TODO(kegan): better way to typecast and get at this field?
prev_state_events = event.get_dict().get(
"prev_state_events", None
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO

Can we use assert isinstance(event, FrozenEventVMSC4242) here?

Comment on lines +1489 to +1490
# the most recently created state event
prev_state_event: list[str] | None = (
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# the most recently created state event
prev_state_event: list[str] | None = (
# This should be the most recently created state event as we create each event
prev_state_events: list[str] | None = (

)
return await ret.get_state(self._state_storage_controller, state_filter)

# TODO: Remove as this is unused
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO

) = await self._calculate_new_forward_extremities_and_state_delta(
room_id, chunk
)
if room_version and room_version.msc4242_state_dags:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant

Suggested change
if room_version and room_version.msc4242_state_dags:
if room_version.msc4242_state_dags:

Comment on lines +118 to +124
# The number of forward extremities for each new event.
msc4242_state_dag_forward_extremities_counter = Histogram(
"synapse_storage_msc4242_state_dag_forward_extremities_persisted",
"Number of forward extremities for each new event in the state DAG",
labelnames=[SERVER_NAME_LABEL],
buckets=(1, 2, 3, 5, 7, 10, 15, 20, 50, 100, 200, 500, "+Inf"),
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should update contrib/grafana/synapse.json with a matching graph (can copy from the existing graphs that use synapse_storage_events_forward_extremities_persisted)

Comment on lines +41 to +42
CREATE INDEX msc4242_state_dag_edges_by_room ON msc4242_state_dag_edges(room_id);
CREATE UNIQUE INDEX msc4242_state_dag_edges_key ON msc4242_state_dag_edges(room_id, event_id, prev_state_event_id);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think because the second index has room_id as the first column, it will cover the first index already

)
prev_state_events.update(fetched_prev_state_events)
if len(prev_state_events) != len(prev_state_events_ids):
# we should have all the prev state events by now, so if we do not, that suggests
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# we should have all the prev state events by now, so if we do not, that suggests
# we should have all the `prev_state_events` by now, so if we do not, that suggests

prev_state_events.update(fetched_prev_state_events)
if len(prev_state_events) != len(prev_state_events_ids):
# we should have all the prev state events by now, so if we do not, that suggests
# a synapse programming error
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# a synapse programming error
# a Synapse programming error

self.assertEqual(channel.json_body["errcode"], "M_FORBIDDEN")

@parameterized.expand([(k,) for k in KNOWN_ROOM_VERSIONS.keys()])
# Exclude MSC4242 room versions whilst it lacks federation support
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Exclude MSC4242 room versions whilst it lacks federation support
# FIXME: Exclude MSC4242 room versions whilst it lacks federation support

self._test_get_extremities_common(room_version)

@parameterized.expand([(k,) for k in KNOWN_ROOM_VERSIONS.keys()])
# Exclude MSC4242 room versions whilst it lacks federation support
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Exclude MSC4242 room versions whilst it lacks federation support
# FIXME: Exclude MSC4242 room versions whilst it lacks federation support

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants