From 446a7253d2e733e6e824f445da732593c5b12682 Mon Sep 17 00:00:00 2001 From: Tom Foster Date: Thu, 30 Jan 2025 14:21:16 +0000 Subject: [PATCH 1/5] MSC4259 initial commit --- proposals/4259-bulk-profile-sync.md | 186 ++++++++++++++++++++++++++++ 1 file changed, 186 insertions(+) create mode 100644 proposals/4259-bulk-profile-sync.md diff --git a/proposals/4259-bulk-profile-sync.md b/proposals/4259-bulk-profile-sync.md new file mode 100644 index 00000000000..c883f65fa8a --- /dev/null +++ b/proposals/4259-bulk-profile-sync.md @@ -0,0 +1,186 @@ +# MSC4259: Profile Sync API for Federation + +Currently, homeservers must individually request profiles via +[`/_matrix/federation/v1/query/profile`](https://spec.matrix.org/v1.13/server-server-api/#get_matrixfederationv1queryprofile) +when they need to display user information. This leads to significant inefficiencies in the Matrix +federation: servers must make separate requests for each profile they need, cannot efficiently +detect when profiles change, and risk serving stale data to their users. The problem becomes more +acute as servers cache profile data to reduce federation traffic, requiring careful balance between +cache duration and data freshness. + +This proposal introduces a new federation endpoint that allows homeservers to efficiently sync +profile data from multiple users in a single request. It builds upon +[MSC4133](https://github.com/matrix-org/matrix-spec-proposals/pull/4133) which introduced extended +profile fields, and adds a token-based sync mechanism that lets servers track profile changes +without repeatedly requesting unchanged data. 
+ +## Proposal + +The current approach to federation profile lookups has several issues: + +- Inefficient use of network resources when requesting multiple profiles +- Increased latency when displaying user information +- Difficulty in maintaining accurate (non-stale) cached copies of remote profiles +- No standardised way to efficiently detect when profiles have been updated + +This MSC adds a new `/_matrix/federation/v1/profiles` endpoint that accepts batched profile +requests and returns only changed data on subsequent requests using sync tokens. + +### Behaviour + +1. The requesting server provides an array of MXIDs it wishes to sync profiles for + +2. If `last_batch` is provided, only profiles that have changed since that sync token are returned + +3. The responding server: + - MAY filter which profiles it returns based on its policies + - MAY return partial results and indicate this via `next_batch` + - MUST include `next_batch` in successful responses + - MUST NOT return profiles unless they were requested + - SHOULD indicate users that are not present on the server + - SHOULD suggest an appropriate `next_time` for subsequent requests + +4. Missing profiles in the response indicate no changes since `last_batch` + +5. 
Error responses for specific profiles suggest excluding them from future requests + +### Federation API Changes + +#### New Endpoint + +- **Method**: `POST` +- **Endpoint**: `/_matrix/federation/v1/profiles` +- **Auth**: Standard federation authentication +- **Rate Limiting**: Implementation-defined, with server guidance via `next_time` response field + +#### Request Format + +```json +{ + "last_batch": "opaque_server_token", // Optional sync token from previous response + "accounts": [ // Required array of MXIDs + "@alice:example.com", + "@bob:example.com" + ] +} +``` + +#### Response Format + +```json +{ + "content": { + "@alice:example.com": { + "displayname": "Alice", + "avatar_url": "mxc://example.com/alice", + "org.example.language": "en-GB" + }, + "@bob:example.com": { + "errcode": "M_NOT_FOUND", + "error": "User not found" + } + }, + "next_batch": "opaque_server_token_2", + "next_time": "2024-03-14T12:30:00Z" +} +``` + +#### Error Codes + +Inside profile payloads, standard Matrix error codes are used for errors: + +- `M_FORBIDDEN`: "User profile is not accessible to the requesting server" +- `M_NOT_FOUND`: "User does not exist" + +For entire-endpoint errors, the following standard Matrix error codes are used: + +- **403 response**: + - `M_FORBIDDEN`: "Profile lookup over federation is disabled on this homeserver" + +- **404 response**: + - `M_UNKNOWN_TOKEN`: "Profile sync token unknown or expired" + +- **429 response**: + - `M_LIMIT_EXCEEDED`: "Too many requests" (should use `retry_after_ms` to indicate time to retry) + +### Implementation Notes + +- When no `last_batch` parameter is provided, return current profile data for all requested + accounts. This is typically used for initial sync or to start over when all tokens have expired. + +- Servers MUST maintain a history of sync tokens to allow clients to resume syncs after connection + issues. Servers SHOULD store at least the last 3-5 tokens to provide resilience against temporary + network failures. 
+ +- Servers MAY expire old tokens based on their own policies (e.g. time-based expiry or storage + limits). When a client provides an expired token, servers MUST return `M_UNKNOWN_TOKEN` to prompt + the client to restart their sync chain. + +- The implementation of tokens is left to the server - they could be timestamps, change IDs, or any + other mechanism that allows tracking profile changes. The only requirement is that using a token + in a subsequent request returns all profile changes since that token was issued. + +- All timestamps (e.g. in `next_time`) MUST be UTC and include the 'Z' suffix. + +- The `next_time` field is advisory; servers are recommended to also enforce their own rate limits. + +- Servers MAY redact certain profiles, e.g. refusing to return profiles for users not sharing rooms + with the requesting server. + +- Requesting servers COULD optimise their sync frequency based on user activity, for example: + 1. Reducing frequency or temporarily suspending syncs when users are offline + 2. Resuming more frequent syncs when users become active again + +## Security Considerations + +1. This endpoint follows standard federation authentication rules + +2. Servers maintain control over which profiles they expose + +3. Rate limiting helps prevent abuse + +## Alternatives + +1. Timestamp-based syncing + - Less reliable for tracking exact changes + - Harder to handle clock skew between servers + - No built-in mechanism for detecting missed updates + - More complex to implement correctly across timezones + +2. 
EDU-based profile updates + - Higher bandwidth usage due to individual EDU requests (especially for each changed field, + if a user updates multiple fields which a pull-based endpoint would help consolidate) + - Requires servers to send EDUs the recipient may not be interested in + - No way to quickly recognise when a recipient is permanently offline to stop sending EDUs + - Difficult for servers to know how long to cache if sender may have stopped sending EDUs + - Profile updates do not necessarily need to be instantaneous like typing notifications + +3. Webhook-style push notifications for profile changes + - No precedent for this in the current Matrix specification + - Similar issues to EDUs regarding bandwidth and wasted outbound traffic + +## Unstable Prefix + +Until this proposal is stable: + +- Endpoint: `/_matrix/federation/unstable/uk.tcpip.msc4259/profiles` +- Client feature flag: `uk.tcpip.msc4259` + +### Feature Flag Advertisement + +Servers implementing this endpoint MUST advertise support via the `/_matrix/federation/v1/version` endpoint: + +```json +{ + "server": { + "name": "Synapse", + "version": "1.99.0" + }, + "unstable_features": { + "uk.tcpip.msc4259": true + } +} +``` + +Once this MSC is merged, servers SHOULD advertise `uk.tcpip.msc4259.stable` until the next spec +version where these endpoints are officially written into the spec. 
From 1a76ffaa94b99fe2f6bb25ecddbfb713bd638a1d Mon Sep 17 00:00:00 2001 From: Tom Foster Date: Thu, 30 Jan 2025 14:25:25 +0000 Subject: [PATCH 2/5] Correct title --- proposals/4259-bulk-profile-sync.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/4259-bulk-profile-sync.md b/proposals/4259-bulk-profile-sync.md index c883f65fa8a..2be72639a98 100644 --- a/proposals/4259-bulk-profile-sync.md +++ b/proposals/4259-bulk-profile-sync.md @@ -1,4 +1,4 @@ -# MSC4259: Profile Sync API for Federation +# MSC4259: Bulk Profile Sync API for Federation Currently, homeservers must individually request profiles via [`/_matrix/federation/v1/query/profile`](https://spec.matrix.org/v1.13/server-server-api/#get_matrixfederationv1queryprofile) From e4d610701b1310e37c35aa0d6a104f9985e52f54 Mon Sep 17 00:00:00 2001 From: Tom Foster Date: Thu, 30 Jan 2025 14:46:52 +0000 Subject: [PATCH 3/5] Stop comment showing red --- proposals/4259-bulk-profile-sync.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/4259-bulk-profile-sync.md b/proposals/4259-bulk-profile-sync.md index 2be72639a98..6681d2c5296 100644 --- a/proposals/4259-bulk-profile-sync.md +++ b/proposals/4259-bulk-profile-sync.md @@ -55,7 +55,7 @@ requests and returns only changed data on subsequent requests using sync tokens. 
#### Request Format -```json +```json5 { "last_batch": "opaque_server_token", // Optional sync token from previous response "accounts": [ // Required array of MXIDs From 74f4b78993aed6a272391644dd1a56f1a9dc08aa Mon Sep 17 00:00:00 2001 From: Tom Foster Date: Mon, 3 Feb 2025 12:29:05 +0000 Subject: [PATCH 4/5] Replaced pull-method with EDU after feedback --- proposals/4259-bulk-profile-sync.md | 202 +++++++++------------------- 1 file changed, 65 insertions(+), 137 deletions(-) diff --git a/proposals/4259-bulk-profile-sync.md b/proposals/4259-bulk-profile-sync.md index 6681d2c5296..ec82bb3b25d 100644 --- a/proposals/4259-bulk-profile-sync.md +++ b/proposals/4259-bulk-profile-sync.md @@ -1,18 +1,20 @@ -# MSC4259: Bulk Profile Sync API for Federation +# MSC4259: Profile Update EDUs for Federation Currently, homeservers must individually request profiles via [`/_matrix/federation/v1/query/profile`](https://spec.matrix.org/v1.13/server-server-api/#get_matrixfederationv1queryprofile) -when they need to display user information. This leads to significant inefficiencies in the Matrix -federation: servers must make separate requests for each profile they need, cannot efficiently -detect when profiles change, and risk serving stale data to their users. The problem becomes more -acute as servers cache profile data to reduce federation traffic, requiring careful balance between -cache duration and data freshness. - -This proposal introduces a new federation endpoint that allows homeservers to efficiently sync -profile data from multiple users in a single request. It builds upon -[MSC4133](https://github.com/matrix-org/matrix-spec-proposals/pull/4133) which introduced extended -profile fields, and adds a token-based sync mechanism that lets servers track profile changes -without repeatedly requesting unchanged data. +when they need to display user information. 
+ +This can lead to significant inefficiencies in Matrix federation, as servers must make separate +requests for each profile they need, cannot efficiently detect when profiles change, and thus risk +serving stale data to their users. The problem becomes more acute as servers cache profile data to +reduce federation traffic, requiring a careful balance between cache duration and data freshness. + +This issue has always existed to some degree, but has gained urgency to accommodate profile changes +introduced in [MSC4133](https://github.com/matrix-org/matrix-spec-proposals/pull/4133). + +This proposal specifically focuses on efficient delivery of profile updates between servers. While +client delivery of profile updates is also important, that solution is to be addressed separately, +such as through an extension to [MSC4186](https://github.com/matrix-org/matrix-spec-proposals/pull/4186). ## Proposal @@ -23,164 +25,90 @@ The current approach to federation profile lookups has several issues: - Difficulty in maintaining accurate (non-stale) cached copies of remote profiles - No standardised way to efficiently detect when profiles have been updated -This MSC adds a new `/_matrix/federation/v1/profiles` endpoint that accepts batched profile -requests and returns only changed data on subsequent requests using sync tokens. +This MSC adds a new `m.profile` EDU type that servers generate when users update their profiles, +allowing real-time notifications of profile changes between servers. ### Behaviour -1. The requesting server provides an array of MXIDs it wishes to sync profiles for - -2. If `last_batch` is provided, only profiles that have changed since that sync token are returned - -3. 
The responding server: - - MAY filter which profiles it returns based on its policies - - MAY return partial results and indicate this via `next_batch` - - MUST include `next_batch` in successful responses - - MUST NOT return profiles unless they were requested - - SHOULD indicate users that are not present on the server - - SHOULD suggest an appropriate `next_time` for subsequent requests +1. When a user updates their profile via client-server APIs, their homeserver: + - Processes and stores the profile update as per normal + - MAY generate EDUs to notify other servers + - If sending EDUs, SHOULD only send them to servers that share a room with the user, + filtering/ratelimiting/de-duplicating the broadcasts as needed to meet their own policies -4. Missing profiles in the response indicate no changes since `last_batch` +2. Remote servers receiving the EDU: + - MAY use these EDUs to update cached profile data, and thus hold onto cached profiles for longer + - COULD ignore the EDU if they are not interested in updates for this user -5. Error responses for specific profiles suggest excluding them from future requests +3. 
EDU handling: + - Servers MAY cache profile data from EDUs to reduce future federation traffic + - Profile information is considered public, so servers MAY broadcast to any known server + - Servers MUST NOT forward EDUs to other servers -### Federation API Changes - -#### New Endpoint - -- **Method**: `POST` -- **Endpoint**: `/_matrix/federation/v1/profiles` -- **Auth**: Standard federation authentication -- **Rate Limiting**: Implementation-defined, with server guidance via `next_time` response field - -#### Request Format +### EDU Format ```json5 { - "last_batch": "opaque_server_token", // Optional sync token from previous response - "accounts": [ // Required array of MXIDs - "@alice:example.com", - "@bob:example.com" - ] -} -``` - -#### Response Format - -```json -{ + "type": "m.profile", "content": { - "@alice:example.com": { + "user_id": "@alice:example.com", + "fields": { "displayname": "Alice", - "avatar_url": "mxc://example.com/alice", + "avatar_url": null, // Signals removal of avatar_url "org.example.language": "en-GB" - }, - "@bob:example.com": { - "errcode": "M_NOT_FOUND", - "error": "User not found" } - }, - "next_batch": "opaque_server_token_2", - "next_time": "2024-03-14T12:30:00Z" + } } ``` -#### Error Codes - -Inside profile payloads, standard Matrix error codes are used for errors: - -- `M_FORBIDDEN`: "User profile is not accessible to the requesting server" -- `M_NOT_FOUND`: "User does not exist" - -For entire-endpoint errors, the following standard Matrix error codes are used: - -- **403 response**: - - `M_FORBIDDEN`: "Profile lookup over federation is disabled on this homeserver" - -- **404 response**: - - `M_UNKNOWN_TOKEN`: "Profile sync token unknown or expired" - -- **429 response**: - - `M_LIMIT_EXCEEDED`: "Too many requests" (should use `retry_after_ms` to indicate time to retry) +The EDU contains only fields that have changed. Fields set to `null` should be considered removed. +Omitted fields should be considered unchanged. 
Recipients can fetch the full profile using the +existing federation API if they need to verify the complete state. ### Implementation Notes -- When no `last_batch` parameter is provided, return current profile data for all requested - accounts. This is typically used for initial sync or to start over when all tokens have expired. - -- Servers MUST maintain a history of sync tokens to allow clients to resume syncs after connection - issues. Servers SHOULD store at least the last 3-5 tokens to provide resilience against temporary - network failures. +- All profile field values follow the same validation rules as the existing profile endpoints -- Servers MAY expire old tokens based on their own policies (e.g. time-based expiry or storage - limits). When a client provides an expired token, servers MUST return `M_UNKNOWN_TOKEN` to prompt - the client to restart their sync chain. +- Servers should implement appropriate rate limiting for EDU generation/sending, and MAY delay + notifications to de-duplicate/combine multiple field updates into a single EDU -- The implementation of tokens is left to the server - they could be timestamps, change IDs, or any - other mechanism that allows tracking profile changes. The only requirement is that using a token - in a subsequent request returns all profile changes since that token was issued. +- While this EDU system reduces the need for manual profile requests, implementations should note: + - Remote servers may not support or send these EDUs + - EDUs can occasionally fail to be delivered + - Both servers and clients COULD implement periodic re-fetching of profiles (e.g. weekly or + monthly) if they require stronger consistency guarantees + - The frequency of such re-fetching should be balanced against available resources, network + conditions, and desired data freshness -- All timestamps (e.g. in `next_time`) MUST be UTC and include the 'Z' suffix. 
- -- The `next_time` field is advisory; servers are recommended to also enforce their own rate limits. - -- Servers MAY redact certain profiles, e.g. refusing to return profiles for users not sharing rooms - with the requesting server. - -- Requesting servers COULD optimise their sync frequency based on user activity, for example: - 1. Reducing frequency or temporarily suspending syncs when users are offline - 2. Resuming more frequent syncs when users become active again +- Profile updates are typically much less frequent than other EDU types like presence updates, + so broadcasting these small delta updates to servers sharing rooms with the user is considered + efficient and scalable. Receiving servers can quickly filter unwanted updates with minimal + processing overhead. ## Security Considerations -1. This endpoint follows standard federation authentication rules +1. EDUs follow standard federation authentication rules -2. Servers maintain control over which profiles they expose +2. Profile information is considered public data in Matrix 3. Rate limiting helps prevent abuse ## Alternatives -1. Timestamp-based syncing - - Less reliable for tracking exact changes - - Harder to handle clock skew between servers - - No built-in mechanism for detecting missed updates - - More complex to implement correctly across timezones - -2. EDU-based profile updates - - Higher bandwidth usage due to individual EDU requests (especially for each changed field, - if a user updates multiple fields which a pull-based endpoint would help consolidate) - - Requires servers to send EDUs the recipient may not be interested in - - No way to quickly recognise when a recipient is permanently offline to stop sending EDUs - - Difficult for servers to know how long to cache if sender may have stopped sending EDUs - - Profile updates do not necessarily need to be instantaneous like typing notifications - -3. Webhook-style push notifications for profile changes +1. 
Pull-based sync API + - Could be more efficient for small servers that want to sync on their own schedule + - Allows servers to batch multiple profile requests + - However, doesn't scale well to servers with thousands of users + - Complex to maintain filter lists of which users each server wants updates for + - Higher latency for profile updates compared to EDU-based approach + +2. Webhook-style push notifications for profile changes - No precedent for this in the current Matrix specification - - Similar issues to EDUs regarding bandwidth and wasted outbound traffic + - Similar scaling issues to pull-based approach regarding filter lists + - Would require new federation endpoints and authentication mechanisms ## Unstable Prefix -Until this proposal is stable: - -- Endpoint: `/_matrix/federation/unstable/uk.tcpip.msc4259/profiles` -- Client feature flag: `uk.tcpip.msc4259` - -### Feature Flag Advertisement - -Servers implementing this endpoint MUST advertise support via the `/_matrix/federation/v1/version` endpoint: - -```json -{ - "server": { - "name": "Synapse", - "version": "1.99.0" - }, - "unstable_features": { - "uk.tcpip.msc4259": true - } -} -``` +Until this proposal is stable, use: -Once this MSC is merged, servers SHOULD advertise `uk.tcpip.msc4259.stable` until the next spec -version where these endpoints are officially written into the spec. 
+- EDU type: `uk.tcpip.msc4259.profile` From aadfe57aee04c2ffb73745e43ea03c8ba498220e Mon Sep 17 00:00:00 2001 From: Tom Foster Date: Mon, 3 Feb 2025 13:21:03 +0000 Subject: [PATCH 5/5] Reference MSC4262 --- proposals/4259-bulk-profile-sync.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/proposals/4259-bulk-profile-sync.md b/proposals/4259-bulk-profile-sync.md index ec82bb3b25d..1068c7b628c 100644 --- a/proposals/4259-bulk-profile-sync.md +++ b/proposals/4259-bulk-profile-sync.md @@ -14,7 +14,8 @@ introduced in [MSC4133](https://github.com/matrix-org/matrix-spec-proposals/pull This proposal specifically focuses on efficient delivery of profile updates between servers. While client delivery of profile updates is also important, that solution is to be addressed separately, -such as through an extension to [MSC4186](https://github.com/matrix-org/matrix-spec-proposals/pull/4186). +such as through a sliding sync extension like +[MSC4262](https://github.com/matrix-org/matrix-spec-proposals/pull/4262). ## Proposal
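
To illustrate the delta semantics stated in patch 4 (fields set to `null` are removed, omitted fields are unchanged), here is a minimal sketch of how a receiving server might apply an `m.profile` EDU to a local profile cache. Everything outside the EDU shape is an assumption for illustration: the function name `apply_profile_edu`, the in-memory `cache` dict, the check that the user belongs to the sending server, and the choice to ignore users with no cached profile are not defined by the proposal.

```python
from typing import Any, Dict

# Hypothetical in-memory cache: user ID -> cached profile fields.
ProfileCache = Dict[str, Dict[str, Any]]

# Stable and unstable EDU type names, taken from the proposal text.
PROFILE_EDU_TYPES = ("m.profile", "uk.tcpip.msc4259.profile")


def apply_profile_edu(cache: ProfileCache, edu: Dict[str, Any], origin: str) -> bool:
    """Apply a profile delta EDU to the cache; return True if it was applied."""
    if edu.get("type") not in PROFILE_EDU_TYPES:
        return False

    content = edu.get("content")
    if not isinstance(content, dict):
        return False
    user_id = content.get("user_id")
    fields = content.get("fields")
    if not isinstance(user_id, str) or not isinstance(fields, dict):
        return False

    # Assumption (not stated in the proposal): only accept updates for users
    # on the sending server. The domain is everything after the first colon,
    # so a server name with a port such as "example.com:8448" stays intact.
    if ":" not in user_id or user_id.split(":", 1)[1] != origin:
        return False

    # Assumption: a delta cannot rebuild a full profile, so users without a
    # cached profile are ignored and can be fetched via the existing API.
    if user_id not in cache:
        return False

    profile = cache[user_id]
    for key, value in fields.items():
        if value is None:
            profile.pop(key, None)  # null signals removal of the field
        else:
            profile[key] = value    # any other value replaces the field
    return True
```

Ignoring uncached users keeps the cache from holding partial profiles built from deltas alone. A server wanting the complete state would still fall back to `/_matrix/federation/v1/query/profile`, as the proposal notes.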