
MSC4345: Server key identity and room membership #4345

Open
Gnuxie wants to merge 42 commits into matrix-org:main from Gnuxie:gnuxie/server-key-identity-and-room-membership

@Gnuxie Gnuxie commented Sep 8, 2025

Rendered

Signed-off-by: Gnuxie <Gnuxie@protonmail.com>

@Gnuxie Gnuxie changed the title MSC0000: Server key identity and room membership MSC4345: Server key identity and room membership Sep 8, 2025
@tulir tulir added the following labels Sep 8, 2025:

  • requires-room-version: An idea which will require a bump in room version
  • proposal: A matrix spec change proposal
  • room-spec: Something to do with the room version specifications
  • unassigned-room-version: Remove this label when things get versioned.
  • kind:core: MSC which is critical to the protocol's success
  • needs-implementation: This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP.

Member

@tulir tulir Sep 8, 2025

Implementation requirements:

  • Server (preferably multiple)
  • Client (preferably multiple)
  • Complement tests

reproducibility and preemptive access control for servers without the
use of a policy server.

### The `m.server.participation` state event, `state_key: ${origin_server_key}`

Contributor Author

We still need to steal more prose from MSC4243 to describe the exact format for the server key and then use that consistently.

Member

@kegsay kegsay left a comment

  • I found this proposal quite hard to parse due to the addition of unfamiliar terminology which is inadequately described. Some terminology seems not to be described at all, e.g. "target server's ambient power level", even though said terminology appears in auth rules.
  • I found the auth rule changes hard to parse due to what I suspect is incorrect indentation which reads to me as dangling if statements.
  • I think the proposal is actually two proposals, one to handle soft-failure in a more consistent manner, and one to make it the room admins' job to verify domains. I'm unsure why the author is combining them; it just increases the chance of the proposal being slowed down / rejected due to trying to do too much. One part of the proposal could have approval but because the other part doesn't, the whole thing gets blocked.
  • It's unclear what purpose advertised_domain serves, and how servers and clients are supposed to use it. I believe the intention is that participating servers verify the server key by talking to that advertised_domain, but this isn't clear from the proposal.

EDIT: this comment previously mentioned that domain-to-key mappings were controlled either by any server or only by privileged servers, and thus the mechanism was either useless (any server could make the mappings) or centralised (only admins could), neither of which is desirable. It turns out that this proposal makes no domain-to-key mappings at all, and allows any server to admit server public keys if they are already joined to the room. As a result of lacking domain-to-key mappings, I'm unsure how this provides any real traceability guarantees, given this is mentioned as a key differentiator with MSC4243.

Related concerns:

matrix-org#4345 (comment)

Now i need to sort out the request_participation flow.  From there i
will need to modify the authorization rules.  And then just check
everything is consistent and doesn't reference the old participation
states.
This proposal encodes a special auth rule for `revoked` participation
to avoid soft failure and the problems discussed in MSC4104.

## Security considerations

Contributor Author

If the state_key is derived from the public key, it is possible for a server to request participation from different servers at the same time (and so keys can be accepted and revoked concurrently, in different auth chains). So consideration of state res needs to be made.

Contributor Author

So the key issue with this is whether a participating server can collude with another server to erase their participation by creating a conflicting requested participation state. The answer is yes, but doing this does require the servers involved to equivocate about their own causal past, which is something we intend to mitigate with the per origin linear chain.

Contributor Author

@Gnuxie Gnuxie Jan 5, 2026

  1. Concurrent /send_participation before first join. A homeserver deployment that is not resident to the room could /send_participation concurrently to servers already resident to the room.

  2. Equivocation of /send_participation while accepted. A homeserver deployment that is resident to the room could /send_participation concurrently to their existing accepted participation with an identical key.

Question: How do we make sure new joiners are not screwed by this behaviour? We likely do not have much option but to depend on reverse topological power ordering in state res to bail us out. Which is a bit ugly because it's an implicit effect but the precedent lies there for all auth events.


### The `m.server.participation` state event, `state_key: ${origin_server_key}`

#### The `unverified_domain` property

Contributor Author

Does this need changing to unverified_server_name?

This is probably easier for people to understand if they're unfamiliar
with capability-security and access control terminology.
Comment on lines +479 to +483
## Unstable prefix

- `m.server.participation` -> `org.matrix.msc4345.participation`
- `/_matrix/key/v3/query` => `_matrix/key/v3/org.matrix.msc4345/query`

Contributor Author

Missing prefixes for make_participation and send_participation.
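
As a rough illustration of the point above, the two mappings quoted from the "Unstable prefix" section can be held in a lookup table that fails loudly for anything the proposal has not yet prefixed. This is only a sketch: `UNSTABLE_PREFIXES` and `unstable_name` are hypothetical names, the leading slash on the unstable endpoint is an assumption (the quoted diff omits it), and no unstable forms are invented for make_participation or send_participation.

```python
# Stable -> unstable identifiers, copied from the two entries quoted above.
# Assumption: the unstable endpoint keeps a leading slash like the stable one.
UNSTABLE_PREFIXES = {
    "m.server.participation": "org.matrix.msc4345.participation",
    "/_matrix/key/v3/query": "/_matrix/key/v3/org.matrix.msc4345/query",
}


def unstable_name(stable: str) -> str:
    """Return the unstable identifier, or raise if the proposal defines none."""
    try:
        return UNSTABLE_PREFIXES[stable]
    except KeyError:
        # make_participation and send_participation currently end up here.
        raise LookupError(f"no unstable prefix defined for {stable!r}") from None
```

Failing on an unknown identifier, rather than guessing a prefix, keeps the gap visible until the proposal fills it in.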

Servers can only accept invitations and emit a join event when their current
participation state in the room is set to `accepted`.

### Key revocation

Contributor Author

We need to explain that m.room.server_acl still has a function in this MSC as it provides an equivalent to a homeserver suspension as opposed to a full ban. Obviously, m.room.server_acl's main capability in banning servers is now redundant but the event can now be used to provide temporary bans while homeservers are investigated without revoking their participation (which requires the homeserver to generate a new identity for the room)

Servers can only accept invitations and emit a join event when their current
participation state in the room is set to `accepted`.

### Key revocation

Contributor Author

It is possible for homeservers to have concurrent participation via multiple keys (not that room admins would accept them; what I mean is that it is possible at the DAG level). We need to explain that, should this MSC be used in an intermediate room version that exists prior to MSC4348 being merged, users could theoretically also have concurrent membership. But they would still be identifiable via the server name, so it doesn't have quite the same implications for ban evasion; the concern is more that it is just strange.

Contributor Author

@Gnuxie Gnuxie Dec 11, 2025

This trait is derived from a feature, which is that key revocation is final. And key revocation has to be final so that it is not possible for anyone to make use of stolen keys (including room admins [1]).

Footnotes

  1. which is why the decision to introduce privileged creators wasn't great and in time will be seen as regrettable. Because even room admins and creators do need to be accountable within their own rooms. Don't let anyone appeal to the benefits of hindsight when this happens.
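
The two constraints discussed in this thread (joins require `accepted` participation, and revocation is final) can be sketched as a small transition table. This is an illustration under assumptions, not the proposal's auth rules: the `requested` value is inferred from the "requested participation state" wording elsewhere in this conversation, and the transition table and the names `Participation`, `ALLOWED`, `can_transition` and `may_emit_join` are hypothetical.

```python
from enum import Enum


class Participation(Enum):
    REQUESTED = "requested"  # assumed value, inferred from the discussion
    ACCEPTED = "accepted"
    REVOKED = "revoked"


# Assumed transition table; the proposal's auth rules are authoritative.
ALLOWED = {
    None: {Participation.REQUESTED},  # key not yet present in the room
    Participation.REQUESTED: {Participation.ACCEPTED, Participation.REVOKED},
    Participation.ACCEPTED: {Participation.REVOKED},
    Participation.REVOKED: set(),  # final: a revoked key never comes back
}


def can_transition(current, new):
    """True if a key in state `current` may move to state `new`."""
    return new in ALLOWED[current]


def may_emit_join(current):
    """A server may accept invites or join only while its key is accepted."""
    return current is Participation.ACCEPTED
```

Because `REVOKED` has no outgoing transitions, a stolen key cannot be reinstated by anyone, which is the property the comment above argues for.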

Comment on lines +411 to +423
### MSC4243: User ID localparts as Account Keys

This proposal is a parallel exploration to
[MSC4243: User ID localparts as Account Keys](https://github.com/matrix-org/matrix-spec-proposals/pull/4243)
and borrows several ideas from the same proposal. It is not required reading.
The key difference between these proposals is that this proposal describes long
lived identity for servers as a key pair in Matrix rooms. Whereas MSC4243 only
does so for individual user accounts.

The critical difference between the proposals is whether or not to include the
server as a first-class participant in the DAG, which is used as an
attestation of trust and responsibility for the membership for each user. With
the trade-off of reducing metadata.

Contributor Author

@Gnuxie Gnuxie Dec 11, 2025

It is possible to create a compromise MSC between the two, by modifying m.room.member to give it a state that is "default power level safe". I.e. new identities aren't given implicit access to the room without first going through a room admin with the allow power level.

The key to this is distinguishing the allow level from the invite level in private rooms in order to cleanly separate the accept step from invite.

Having someone with the accept level acknowledge the newly invited users is something that could happen semi-automatically in private rooms. For example one simple way of managing this is clients doing some basic rate limiting to reduce the risk of someone adding a tonne of new identities.

Contributor Author

Such an MSC would have to introduce a handshake whereby the room admins with the allow level can obtain information about each new user's identity through a DID service (or a matrix homeserver).

Contributor Author

In any case, in order for MSC4243 and MSC4348 to obtain the same guarantees as this MSC with respect to revocation of keys, room membership would need to be changed to model the same final revocation state

Contributor Author

If there is interest I will create this MSC

Contributor Author

@Gnuxie Gnuxie Dec 12, 2025

Such an MSC should first attempt to remove the server name from the user MXID from the beginning and provide another way to identify the server, such as through EDUs and zero knowledge proofs. Might be a tall order.

Contributor Author

It may be possible to repurpose MSC4348 for this and remove its strict dependency on this MSC

Contributor Author

So I don't think such a version of MSC4348 without MSC4345, or MSC4243, would be safe even if the room version includes something like #4106 to make new identities "default power level safe". This is because the cost we have now to minting server identity would be removed in those MSCs and we would still allow people to write to room state by introducing new users into the mute state. At the bare minimum this would also need traceability with a hard requirement of co-signing joins with join_authorised_via_user in all join rules. But there would need to be other mitigations such as invitation quotas (which again would probably require membership to become directional and need a final revoke instead of ban).

Contributor Author

@Gnuxie Gnuxie Dec 24, 2025

Depending on the trade-offs that you want to make wrt traceability and metadata, it may be possible for only the join to be co-signed by a server (provided the user's membership is scoped and subject to quotas, and we have much stronger equivocation mitigation and not a full DAG like today), but I'm not sure people are going to like that either?

Removing the signature from Matrix and instead providing it to a room moderator as part of a handshake is also a possibility (as stated many times over, and it probably should get a mention in the MSC), but we don't really have any infrastructure for these kinds of handshakes yet. So I'd be inclined to keep the metadata for now and then remove it when we develop the handshake that would allow server information, DID lookup, reputation lookup, whatever to be conducted by moderators before they provide access.

Contributor Author

@Gnuxie Gnuxie Dec 24, 2025

Well, what metadata are we even leaking by co-signing all events with a server key? It shouldn't be too hard to make the unverified_server_name property a secret, and the server keys are already room scoped. That probably makes it stronger than the MSC4243 account key, which is common to all rooms the user participates within. The metadata argument does not seem relevant. But we should certainly elaborate on how to reduce it and summarise the conversation here.

This is the power level required to accept requests to participate. Defaults to
`100`.

### The `m.server.participation` state event, `state_key: ${origin_server_key}`

Contributor Author

@Gnuxie Gnuxie Jan 5, 2026

Making the state key some composite of the unverified_server_name and the public key would help ensure that unverified_server_name is invariant for any conflicting event.

Contributor Author

@Gnuxie Gnuxie Jan 5, 2026

You could also go further and make the state key a composite of the unverified_server_name, participation and the public key. So that those properties are also invariant for any conflicting participation event related to the public key.

Contributor Author

@Gnuxie Gnuxie Jan 5, 2026

You would then also want to include a composite of the requester, accepter and revoker though, depending on the value of the participation property.

Contributor Author

Do we really need to go this far to prevent trickery with conflicting events? A different accepter and requester doesn't have any downstream effect. Changing unverified_server_name would. It's an interesting technique that could be adopted for other events in future though.
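
To make the composite state key idea in this thread concrete, here is a minimal sketch. The encoding (`<public key>|<server name>`), the example key and server names, and the helpers `composite_state_key` and `conflicts` are all hypothetical; the proposal does not define a format. The point is only that two events can compete for the same state slot when their `(type, state_key)` pairs match, so folding `unverified_server_name` into the key forces any conflicting pair to agree on it.

```python
def composite_state_key(public_key: str, unverified_server_name: str) -> str:
    """Hypothetical encoding: join the key and the server name with '|'."""
    return f"{public_key}|{unverified_server_name}"


def conflicts(event_a: dict, event_b: dict) -> bool:
    """Two state events compete for one slot only if type and state_key match."""
    return (event_a["type"], event_a["state_key"]) == (
        event_b["type"],
        event_b["state_key"],
    )


def participation_event(public_key: str, server_name: str) -> dict:
    """Build a skeletal participation event using the composite state key."""
    return {
        "type": "m.server.participation",
        "state_key": composite_state_key(public_key, server_name),
        "content": {"unverified_server_name": server_name},
    }


# Same key, different advertised names: the events land in different slots,
# so state resolution can never be asked to pick between the two names.
honest = participation_event("ed25519:abc", "example.org")
spoofed = participation_event("ed25519:abc", "evil.example")
```

Here `conflicts(honest, spoofed)` is false, whereas under a plain public-key state key the two events would share a slot and the name could change between conflicting events.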

Member

@kegsay kegsay left a comment

  • I like the idea of server keys as user ID domains.
  • I don't like the server participation concept as I feel it can be sufficiently emulated using knocks and invites.
  • I really like the idea of having an authorised server check the domain then give the 👍🏼 to let the server participate.

I think there is probably yet another iteration of these proposals which combines aspects of this MSC together. Something broadly along the lines of:

  • User ID domains are server keys as you describe.
  • join_rule: public has different semantics in this new room version:
    • To join you must first knock. We reuse existing spam protection mechanisms that are in place for knock spam.
    • Any server authorised to invite can accept the knock (as per the spec). We modify the default invite level for public rooms to be mods/admins.
    • We allow those servers to automatically send the invite upon verifying the domain (questionable who should issue this though as masquerading isn't very friendly).
    • The joining server can auto-accept the invite (as per the spec).

That in essence is what this MSC is doing I think, but doesn't involve new auth rules or event types.
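
The knock-based flow outlined in the list above can be sketched as a single function that returns the membership events one join attempt would produce. This is a toy model under assumptions: `public_join_steps` and its parameters are hypothetical, and `verify_domain` stands in for whatever domain check the authorised server performs, which this review does not specify.

```python
from typing import Callable, List


def public_join_steps(
    knocker_key: str,
    inviter_may_invite: bool,
    verify_domain: Callable[[str], bool],
) -> List[str]:
    """Return the membership events produced by one join attempt."""
    steps = ["knock"]  # under the new semantics, joining starts with a knock
    # Only a server at the invite level that has verified the domain responds.
    if inviter_may_invite and verify_domain(knocker_key):
        steps.append("invite")  # sent automatically by the authorised server
        steps.append("join")  # auto-accepted by the joining server
    return steps
```

If either the invite-level check or the domain check fails, the attempt stops at the knock, which is where existing knock spam protections would apply.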

## Proposal

We propose to make the server's identity within a room solely a long lived
ed25519 public key. This key is explicitly appended to the DAG via an auth

Member

I like this idea as an alternative to per-user ed25519 keys.

It does bake in the idea of there being a server, but it should be possible to contort it to work for P2P and portable accounts. For example, Pinecone used ed25519 keys as routing information (so they were the domains) and that worked fine, if a bit odd with @user:ed25519-key where the localpart is just useless junk.

The bigger problem is portable accounts. By associating users so strongly with a signing key, it makes it harder/impossible to enable scenarios like "As a matrix.org user, I want to port my account to example.com" as your identity is baked into the signing key of matrix.org. More likely we would need to first enable P2P such that the server signing key == your P2P identity, before allowing portable accounts.


In addition to this, we strengthen the conditions of server participation in the
DAG. So that new server identities must obtain explicit access from room
administrators:

Member

I would prefer it if this was a separate MSC, as it massively complicates the first point of converting user ID domains to be signing keys.


- Servers are unable to participate within a room until their key has been added
by an existing participant. This principally ensures the introduction of
server keys is traceable to existing participants. Without this traceability,

Member

How is this sufficiently different from knocks/invites? Can we reuse existing infrastructure rather than invent a new concept?

Contributor Author

@Gnuxie Gnuxie Feb 4, 2026

You're right that we could just make a social shift around the definition of the public join rule, and change the rule to be more like knock by default. And I'd be very happy with that, provided we can gate the ability to knock behind room admin infrastructure. This is specifically so that access to knocking on a room isn't ambient and the knocks have to be signed by e.g. a policy server or the join-gate idea from the feedback of MSC4243. The reason for that is so that knocking cannot be used as a means of filling the room up with membership events, the same way as joining can currently.

server keys is traceable to existing participants. Without this traceability,
the ability to add an infinite number of new server keys is available
implicitly to anyone who is able to federate with a by-standing participant or
malicious leaky server. This change provides participants that have a newly

Member

The intro has:

> Therefore the cost to creating new server identities with server names is significant. And this cost is a significant factor in deterring attacks that the room model is currently vulnerable to.

which I agree with. This doesn't quite add up here though. We want to add participation auth rules because of the risk of servers minting a ton of server keys, but this requires the attacker to run a homeserver to pull off, which is at odds with:

> Throughout Matrix's history, attackers have minted new user identities by exploiting
> homeservers with weak registration requirements instead of minting new homeservers

which implies attackers are using unmodified, otherwise trusted homeservers.

I get there's an algorithmic step change (you only need 1 domain name without participation auth rules, instead of N domain names with), but this should be clarified.

Contributor Author

The point I'm trying to make here is if we naively added server keys in a similar way to MSC4243 without new auth rules or anything then we would make it much easier to add users from new servers to the room (or invalid ones). You might need to spin up a server to do that, but that server would be able to generate keys for new servers without the infra to back them up. Essentially it's the discussion from this thread #4243 (comment).

Contributor Author

I will make it clearer
