115 changes: 115 additions & 0 deletions proposals/4259-bulk-profile-sync.md
Implementation requirements:

  • Server(s)

@anoadragon453 anoadragon453 Feb 26, 2026

A discussion on performance will inevitably come up on this MSC, so I'd like to start that up here (and see the MSC itself address it).

### Presence Performance

Many reviewing this MSC will immediately think of the performance problems that
plague the Presence
feature.

The Synapse homeserver's implementation of presence works via the following:

  1. Presence information can be set by the client, but the server will use
    requests such as /sync and read receipts to determine if the user is
    "currently active" or when they were last active.
  2. Presence information is shared between all users who share a room.
  3. An online user's presence state is changed to "idle" if they are not currently
    active and their last active time was >5min ago. They are set back to "online"
    when they make a relevant request.
  4. When a user joins a room, their presence information is sent to other users,
    and they receive presence information about all other users in the room.
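The idle transition in step 3 can be sketched as follows. This is a simplified illustration, not Synapse's actual implementation; the `PresenceState` class and method names are hypothetical:

```python
import time

IDLE_TIMEOUT = 5 * 60  # seconds; users go "idle" after 5 minutes of inactivity


class PresenceState:
    """Tracks a single user's presence, mirroring steps 1 and 3 above."""

    def __init__(self):
        self.state = "offline"
        self.last_active_ts = 0.0

    def on_request(self, now=None):
        # Any relevant request (/sync, read receipt) marks the user active (step 1).
        self.last_active_ts = now if now is not None else time.time()
        self.state = "online"

    def refresh(self, now=None):
        # Step 3: flip online -> idle once the user has been inactive for >5 min.
        now = now if now is not None else time.time()
        if self.state == "online" and now - self.last_active_ts > IDLE_TIMEOUT:
            self.state = "idle"
        return self.state
```
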

From 2020 data, Element observed in a publicly-federating customer's deployment
that a homeserver with 66 daily active users saw:

  1. ~16Hz total (local and remote) presence updates
  2. ~3Hz coming from remote servers
  3. ~1.5Hz of updates proactively being sent to clients and other servers.

The majority of CPU usage was going towards sending federation transactions
(which dropped from ~20Hz to ~1Hz after disabling presence). Spikes occurred
when a local user joined a room, which involved sending presence to every server
in the room, and receiving presence for every server in the room.

Spikes also occurred, of course, whenever a user who was in rooms with many
remote servers updated their presence.

Much of this can be optimised through implementation and protocol changes.

Perhaps we could weave presence information into /{make,send}_join, or have a
separate, single call to bulk push and receive presence information for all
users in the room that belong to a given homeserver.

Implementations can spread updates to different servers out over time,
preventing many transactions from being sent in parallel and overloading the
homeserver. They can also rate-limit and "debounce" updates (ignoring some that
appear in quick succession). They can optimise their HTTP clients so that TLS
setup costs are light on CPU resources.
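Debouncing as described above could look roughly like this. It is only a sketch; the 2-second window and class name are assumptions, not taken from any real homeserver:

```python
import time


class PresenceDebouncer:
    """Drops presence updates that arrive in quick succession for the same user."""

    def __init__(self, min_interval=2.0):
        self.min_interval = min_interval  # seconds between forwarded updates
        self.last_sent = {}               # user_id -> timestamp of last forwarded update

    def should_send(self, user_id, now=None):
        now = now if now is not None else time.time()
        last = self.last_sent.get(user_id)
        if last is not None and now - last < self.min_interval:
            return False  # too soon: debounce this update
        self.last_sent[user_id] = now
        return True
```
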

I still think profile fields hold value in their functionality over presence.
But similar optimisations can likely be applied to both features.

### Profile Field-specific quirks

That being said, profile updates will generally be much less frequent than
presence updates. Profile fields also make it easy to set different rate
limits/debouncing logic for each field ID; e.g. you could limit `m.call`
updates much more strictly than `m.music`. Each field has its own traffic
patterns to consider.
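Per-field limits could be expressed as a simple table of minimum update intervals, keyed by field ID. The specific intervals below are illustrative assumptions only:

```python
import time

# Hypothetical per-field minimum intervals (seconds); stricter for m.call.
FIELD_MIN_INTERVAL = {
    "m.call": 60.0,
    "m.music": 10.0,
}
DEFAULT_MIN_INTERVAL = 30.0


class FieldRateLimiter:
    """Rate-limits profile updates independently per (user, field) pair."""

    def __init__(self):
        self.last_update = {}  # (user_id, field) -> timestamp of last allowed update

    def allow(self, user_id, field, now=None):
        now = now if now is not None else time.time()
        min_interval = FIELD_MIN_INTERVAL.get(field, DEFAULT_MIN_INTERVAL)
        key = (user_id, field)
        last = self.last_update.get(key)
        if last is not None and now - last < min_interval:
            return False  # this field was updated too recently
        self.last_update[key] = now
        return True
```
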

I also think limiting who can see which profile fields will significantly cut
down on the traffic. A mechanism for allowing anyone to see your
avatar/displayname, but only friends to see your current song, would go a long
way. Presence didn't have such a granular level of user-controlled recipients.
This is best handled in a separate MSC.

# MSC4259: Profile Update EDUs for Federation

Currently, homeservers must individually request profiles via
[`/_matrix/federation/v1/query/profile`](https://spec.matrix.org/v1.13/server-server-api/#get_matrixfederationv1queryprofile)
when they need to display user information.

This can lead to significant inefficiencies in Matrix federation, as servers must make separate
requests for each profile they need, cannot efficiently detect when profiles change, and thus risk
serving stale data to their users. The problem becomes more acute as servers cache profile data to
reduce federation traffic, requiring a careful balance between cache duration and data freshness.

This issue has always existed to some degree, but has gained urgency to accommodate profile changes
introduced in [MSC4133](https://github.com/matrix-org/matrix-spec-proposals/pull/4133).

This proposal specifically focuses on efficient delivery of profile updates between servers. While
client delivery of profile updates is also important, that solution is to be addressed separately,
such as through a sliding sync extension like
[MSC4262](https://github.com/matrix-org/matrix-spec-proposals/pull/4262).

## Proposal

The current approach to federation profile lookups has several issues:

- Inefficient use of network resources when requesting multiple profiles
- Increased latency when displaying user information
- Difficulty in maintaining accurate (non-stale) cached copies of remote profiles
- No standardised way to efficiently detect when profiles have been updated

This MSC adds a new `m.profile` EDU type that servers generate when users update their profiles,
allowing real-time notifications of profile changes between servers.

### Behaviour

1. When a user updates their profile via client-server APIs, their homeserver:
- Processes and stores the profile update as per normal
- MAY generate EDUs to notify other servers
- If sending EDUs, SHOULD only send them to servers that share a room with the user,
filtering/ratelimiting/de-duplicating the broadcasts as needed to meet their own policies

2. Remote servers receiving the EDU:
- MAY use these EDUs to update cached profile data, and thus hold onto cached profiles for longer
- COULD ignore the EDU if they are not interested in updates for this user

3. EDU handling:
- Servers MAY cache profile data from EDUs to reduce future federation traffic
- Profile information is considered public, so servers MAY broadcast to any known server
- Servers MUST NOT forward EDUs to other servers
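The behaviour above can be sketched as follows. The helper names, the callable lookups, and the local server name are all assumptions for illustration, not part of the proposal:

```python
def build_profile_edu(user_id, changed_fields):
    """Build an m.profile EDU carrying only the fields that changed.

    `changed_fields` maps field name -> new value, with None (JSON null)
    signalling removal, matching the EDU format defined below.
    """
    return {
        "type": "m.profile",
        "content": {
            "user_id": user_id,
            "fields": changed_fields,
        },
    }


def destinations_for(user_id, local_server, rooms_of, servers_in_room):
    """Servers that share at least one room with the user (behaviour step 1).

    `rooms_of` and `servers_in_room` are hypothetical lookup callables the
    homeserver would provide from its own state.
    """
    servers = set()
    for room_id in rooms_of(user_id):
        servers.update(servers_in_room(room_id))
    servers.discard(local_server)  # don't send the EDU to ourselves
    return servers
```
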

### EDU Format

```json5
{
  "type": "m.profile",
  "content": {
    "user_id": "@alice:example.com",
    "fields": {
      "displayname": "Alice",
      "avatar_url": null, // Signals removal of avatar_url
      "org.example.language": "en-GB"
    }
  }
}
```

The EDU contains only fields that have changed. Fields set to `null` should be considered removed.
Omitted fields should be considered unchanged. Recipients can fetch the full profile using the
existing federation API if they need to verify the complete state.
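A receiving server could apply these delta semantics to a cached profile along these lines (a sketch; the flat dict cache structure is an assumption):

```python
def apply_profile_edu(cached_fields, edu_fields):
    """Merge an m.profile EDU's delta into a cached profile.

    Fields set to None (JSON null) are removed; omitted fields are left
    unchanged. Returns a new dict rather than mutating the cache in place.
    """
    merged = dict(cached_fields)
    for field, value in edu_fields.items():
        if value is None:
            merged.pop(field, None)  # null signals removal of the field
        else:
            merged[field] = value
    return merged
```
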

### Implementation Notes

- All profile field values follow the same validation rules as the existing profile endpoints

- Servers should implement appropriate rate limiting for EDU generation/sending, and MAY delay
notifications to de-duplicate/combine multiple field updates into a single EDU

- While this EDU system reduces the need for manual profile requests, implementations should note:
- Remote servers may not support or send these EDUs
- EDUs can occasionally fail to be delivered
- Both servers and clients COULD implement periodic re-fetching of profiles (e.g. weekly or
monthly) if they require stronger consistency guarantees
- The frequency of such re-fetching should be balanced against available resources, network
conditions, and desired data freshness

- Profile updates are typically much less frequent than other EDU types like presence updates,
so broadcasting these small delta updates to servers sharing rooms with the user is considered
efficient and scalable. Receiving servers can quickly filter unwanted updates with minimal
processing overhead.
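Periodic re-fetching as suggested above benefits from jitter, so that servers do not all re-fetch remote profiles in lockstep. A minimal sketch, where the weekly base interval and 25% jitter are arbitrary illustrative choices:

```python
import random


def next_refetch_delay(base_days=7, jitter_frac=0.25, rng=random):
    """Pick the delay (in seconds) until the next full-profile re-fetch.

    A random jitter of +/- jitter_frac spreads re-fetches out over time,
    avoiding synchronised bursts of federation traffic.
    """
    base = base_days * 24 * 3600
    jitter = base * jitter_frac
    return base + rng.uniform(-jitter, jitter)
```
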

## Security Considerations

1. EDUs follow standard federation authentication rules

2. Profile information is considered public data in Matrix

3. Rate limiting helps prevent abuse

## Alternatives

1. Pull-based sync API
- Could be more efficient for small servers that want to sync on their own schedule
- Allows servers to batch multiple profile requests
- However, doesn't scale well to servers with thousands of users
- Complex to maintain filter lists of which users each server wants updates for
- Higher latency for profile updates compared to EDU-based approach

2. Webhook-style push notifications for profile changes
- No precedent for this in the current Matrix specification
- Similar scaling issues to pull-based approach regarding filter lists
- Would require new federation endpoints and authentication mechanisms

## Unstable Prefix

Until this proposal is stable, use:

- EDU type: `uk.tcpip.msc4259.profile`