
Conversation

@macmv commented Oct 5, 2025

Closes #7450.

This PR adds semantic tokens from language servers to the editor.

With semantic tokens on the left, without on the right:
[screenshot]

I find this a very useful feature for the following reasons:

  • Highlighting functions from the standard library differently from user-defined functions makes code a lot easier to scan at a glance. My eyes can just skip over .map() and not need to worry that it's a user-defined function.
  • Making parameters/local variables a different color is quite handy when I'm trying to parse some dense code.
  • Differentiating function calls/properties (specifically in Rust) makes code a lot easier to read at a glance.

Overall, while it is a quality-of-life feature, I find code a lot easier to understand at a glance with semantic tokens.

There are still some things left to finish:

  • Remove the semantic token range request. I intended to use it for multibuffers, but I think it's simpler (and performs well enough) to just grab all the semantic tokens for each buffer. Semantic token ranges also cannot be diffed, so we'd need to fetch all the semantic tokens if the user modified the multibuffer at all.
  • Improve perf. For large files (10,000+ lines), I'm starting to notice a bit of delay when moving the cursor around. This was much worse in my first draft, where I made an anchor for every token, but I still think it can be improved. Right now, the editor feels a lot clunkier, just from that slight extra delay from these tokens.
  • Remote editors. I added the protobufs, but I never actually handled the requests on the server side. I'm not sure if the server should respond with its locally cached tokens or send a fresh request to the LSP server.

Edit: Perf got fixed by clamping the buffer ranges in create_highlight_endpoints correctly. I don't notice any change now.

Aside from that, it works nicely and highlights all the tokens I could think of.

In terms of conflicting with tree-sitter, these highlights are applied on top of tree-sitter highlights. Additionally, the LSP spec documents how this should be handled: the client's semantic tokens capabilities include the following option:

	/**
	 * Whether the client uses semantic tokens to augment existing
	 * syntax tokens. If set to `true` client side created syntax
	 * tokens and semantic tokens are both used for colorization. If
	 * set to `false` the client only uses the returned semantic tokens
	 * for colorization.
	 *
	 * If the value is `undefined` then the client behavior is not
	 * specified.
	 *
	 * @since 3.17.0
	 */
	augmentsSyntaxTokens?: boolean;

I set this to true, so language servers won't send us semantic tokens for obvious things like strings or numbers. In particular, here's how rust-analyzer handles that flag: https://github.com/rust-lang/rust-analyzer/blob/4ae99f0150c94f4bdf7192b4447f512ece3546fd/crates/rust-analyzer/src/lsp/to_proto.rs#L734-L752
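
For reference, here is a minimal sketch of what declaring that capability might look like with the lsp-types crate (the struct shapes here are an assumption about lsp-types, not code from this PR):

use lsp_types::{
    SemanticTokensClientCapabilities, SemanticTokensClientCapabilitiesRequests,
    SemanticTokensFullOptions,
};

// Advertise delta-capable full-document requests, and opt in to augmenting
// the client's own (tree-sitter) highlights rather than replacing them.
let semantic_tokens = SemanticTokensClientCapabilities {
    dynamic_registration: Some(true),
    requests: SemanticTokensClientCapabilitiesRequests {
        range: None,
        full: Some(SemanticTokensFullOptions::Delta { delta: Some(true) }),
    },
    // With this set, servers like rust-analyzer skip "obvious" tokens
    // (strings, numbers) and leave those to client-side highlighting.
    augments_syntax_tokens: Some(true),
    ..Default::default()
};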

This is best reviewed commit-by-commit. Each commit builds on the last, and they're somewhat organized so that the project should compile between each commit.

Release Notes:

  • Added Semantic Tokens.

@cla-bot commented Oct 5, 2025

We require contributors to sign our Contributor License Agreement, and we don't have @macmv on file. You can sign our CLA at https://zed.dev/cla. Once you've signed, post a comment here that says '@cla-bot check'.

@macmv (Author) commented Oct 5, 2025

@cla-bot check

@cla-bot cla-bot bot added the cla-signed The user has signed the Contributor License Agreement label Oct 5, 2025
@cla-bot commented Oct 5, 2025

The cla-bot has been summoned, and re-checked this pull request!

@macmv macmv marked this pull request as ready for review October 5, 2025 05:42
@finico commented Oct 8, 2025

@macmv First of all, thank you for the work done! I would really like to see this feature in Zed.
Is this supposed to use lazy initialization?
After opening a file and navigating around, the highlighting is only the basic tree-sitter one, and the semantic highlighting is only rendered after a change is made (I tried it in several languages, so it does not depend on the LSP server).

@macmv (Author) commented Oct 8, 2025

Right now I only wired up the semantic tokens to the Edited event. I've been tinkering with fetching semantic tokens when an editor is created as well, but that isn't always consistent while a language server is still starting, so it requires a bit more tinkering to get right.

@SomeoneToIgnore (Contributor) left a comment

Sorry for the delayed review — I was working on inlay hints to provide an example of how such range-related LSP things are planned to be cached and used in the system.

I was not too meticulous about the protocol part, as I hope @Veykril will check the PR too at some point.

Overall, this seems like a step forward compared to #27556, as we have delta support and the editor-rendering part is somewhat more elaborate here.

I think it's perfectly fine to ignore the proto part for now (albeit there are collab/src/editor_tests.rs and local SSH to test with locally and make progress on that).

To summarize my comments, I think these steps need to be done:

  • consolidate all LSP data caching and related code near LspStore instead of the editor, so it can be shared — and consider range requests
  • iron out the editor-side store: when things get updated, and the range-request story (invalidation is somewhat of a pain for inlay hints)
  • add refresh support and a multibuffer test

I think we can look at the inlay hints, chunks, and related logic added recently: it's not fully ironed out as generic storage for semantic tokens, but the LspStore part seems good enough — the editor part seems OK to start from.

Besides that, one huge gap here is the user-facing part of the feature:

  • no settings (in the default.json file we have a bunch of scattered entries; we need to pick something semantically near the lsp and highlight entries, add a way to turn things on/off, switch between tree-sitter and semantic highlights, etc.)
  • no documentation (we document settings mainly)
  • it seems worthwhile to add another entry to the editor controls menu [screenshot] for feature visibility.

  • last but not least, it would be superb to be able to do things like
"editor.semanticTokenColorCustomizations": {
    "enabled": true,
    "rules": {
      "unresolvedReference": {
        "foreground": "#c93f3f",
        "fontStyle": "bold"
      }
    }
  },

as VSCode allows.
This snippet turns rust-analyzer's highlights into a quite good and useful diagnostic.


I plan to work further on inlay hints (there's an invalidation bug somewhere) and rainbow brackets, so no active work or thinking on this PR from my side yet.
Happy to discuss things further.

}
}
}
highlight_endpoints.sort();
@SomeoneToIgnore (Contributor):

With

impl Ord for HighlightEndpoint {
    fn cmp(&self, other: &Self) -> cmp::Ordering {
        self.offset
            .cmp(&other.offset)
            .then_with(|| self.style.is_some().cmp(&other.style.is_some()))
    }
}

this part of the code becomes more dangerous with each extra semantic layer added.

The main question to me is: how do we unite tree-sitter and semantic highlights so that people can customize them?

It is not unrealistic to assume there are people who want zero tree-sitter involvement in such highlights: augmentsSyntaxTokens was mentioned, but no code is added at the configuration level yet.

Would it make sense to use a similar approach

// Use LSP tasks over Zed language extension ones.
// If no LSP tasks are returned due to error/timeout or regular execution,
// Zed language extension tasks will be used instead.
//
// Other Zed tasks will still be shown:
// * Zed task from either of the task config file
// * Zed task from history (e.g. one-off task was spawned before)
//
// Default: true
"prefer_lsp": true

to highlights too?

@macmv (Author):

This is a good point. I figured doing what "the lsp spec suggests" would be a good start, but configurability would be best.

How does this sound: add two flags, one to enable tree-sitter highlights and one to enable semantic token highlights. Then we enable augmentsSyntaxTokens only when both are enabled, and layer semantic token highlights on top of tree-sitter highlights. This behavior can all be documented in default.json.

Additionally, enabling semantic token highlights would be a per-language-server setting, and toggling tree-sitter highlights would be a per-language setting (both with global defaults).

I realize users could theoretically want tree-sitter highlights on top of semantic tokens, but at that point, I feel like they should disable semantic tokens for a specific language server instead. I hesitate to add more configuration than necessary, just because all the combinations of settings become difficult to test.

@SomeoneToIgnore (Contributor):

add two flags, one to enable tree-sitter highlights and one to enable semantic token highlights

This sounds good.

To agree more with your last paragraph, I'd try a single setting with some enum values — the main idea here is to have a "fallback" by default, so we start with semantic highlights always on, but if those are not supported by the language's server, we fall back to tree-sitter.

I do not insist on this much if it starts to get hard to implement, but in the end it seems reasonable to have something like this (the same as Zed does with prettier and the tasks mentioned above).

This behavior can all be documented in default.json.

Great that it will be done; just to mention that this file is only half of the docs: there's also configuring-zed.md and other docs/ entries that get published to the website and might be more visible for certain kinds of people, hence we should update both.

enabling semantic token highlights would be a per-language-server setting

With that I'd rather disagree: we could just as well make it language-dependent and spare users from remembering server names.
That also plays towards the unified setting, if it works out.

}
let token_range = excerpt.map_range_from_buffer(token_range);

if token_range.end <= range.start || token_range.start >= range.end {
@SomeoneToIgnore (Contributor):

Range is exclusive at its end but does include its start — seeing both ends considered the same way here is a bit odd; is it supposed to be so?

@macmv (Author):

Yeah, good point — I think < and > are correct here.
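
For reference, the conventional overlap test for half-open ranges treats merely touching ranges as non-overlapping (a standalone sketch, not the PR's code):

use std::ops::Range;

/// Half-open ranges overlap iff each one starts before the other ends;
/// the negation is the skip condition `a.end <= b.start || a.start >= b.end`.
fn overlaps(a: &Range<usize>, b: &Range<usize>) -> bool {
    a.start < b.end && b.start < a.end
}

fn main() {
    assert!(overlaps(&(0..5), &(4..8))); // both contain offset 4
    assert!(!overlaps(&(0..5), &(5..8))); // merely touch at 5: no overlap
}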

}

#[derive(Debug, Default)]
pub struct SemanticTokenView {
@SomeoneToIgnore (Contributor):

In this repository, View is a rather common name for entities that are displayed on-screen: there's ToolbarItemView, ModalView, StatusItemView, etc.

Here, we deal with one such "view", the editor element, where small parts of it get colored differently, so the name seems a bit off.

Maybe, BufferSemanticTokens or SemanticTokens?

@macmv (Author):

I ended up needing BufferSemanticTokens in lsp store, so how does PositionedSemanticTokens sound?

let buffer_range = excerpt
    .buffer()
    .range_to_version(excerpt.map_range_to_buffer(range.clone()), &tokens.version);
for token in tokens.tokens_in_range(buffer_range) {
@SomeoneToIgnore (Contributor):

tokens_in_range

I know that certain highlights (e.g. word highlight on hover) are applied after a debounce.
Based on my experience fixing various obscure bugs like #40738, it's not unreasonable to assume that editing and highlighting might interleave in such a manner that there is a Global buffer-version mismatch between what's highlighted and what's stored in the data.

Here it seems relatively harmless, as it would not panic thanks to the anchor/snapshot manipulations, but it is still potentially buggy?
If so, some extra version check could help — ideally, an accessor like

/// Gets the most recent LSP data for the given buffer: if the data is absent or out of date,
/// new [`BufferLspData`] will be created to replace the previous state.
pub fn latest_lsp_data(&mut self, buffer: &Entity<Buffer>, cx: &mut App) -> &mut BufferLspData {
    let (buffer_id, buffer_version) =
        buffer.read_with(cx, |buffer, _| (buffer.remote_id(), buffer.version()));
    let lsp_data = self
        .lsp_data
        .entry(buffer_id)
        .or_insert_with(|| BufferLspData::new(buffer, cx));
    if buffer_version.changed_since(&lsp_data.buffer_version) {
        *lsp_data = BufferLspData::new(buffer, cx);
    }
    lsp_data
}

could help?

    dynamic_registration: Some(true),
}),
semantic_tokens: Some(SemanticTokensClientCapabilities {
    dynamic_registration: Some(true),
@SomeoneToIgnore (Contributor):

Just to double check: I do not see register_server_capabilities adjusted, but it should be, right?

@macmv (Author):

Oh good point, I missed that.

@macmv (Author):

I don't entirely understand register_server_capabilities for semantic tokens. Will the server send a registration for textDocument/semanticTokens/? or textDocument/semanticTokens/full?

I can't really find how this works in the spec, and I'm not sure how to go about testing this.

@SomeoneToIgnore (Contributor):

Dynamic registrations are somewhat convoluted indeed; I can only propose checking out #41929

I expected that to be very similar to

"textDocument/documentSymbol" => {
let options = parse_register_capabilities(reg)?;
server.update_capabilities(|capabilities| {
capabilities.document_symbol_provider = Some(options);
});
notify_server_capabilities_updated(&server, cx);
}

Worst case — we can use Some(false) in the dynamic registration capability part.

// We apply semantic tokens over tree-sitter highlighting, so
// let the server know that we are augmenting existing tokens.
augments_syntax_tokens: Some(true),
}),
@SomeoneToIgnore (Contributor):

We should declare support for workspace/semanticTokens/refresh too and implement it, as we already do for inlays and other cases.

#[derive(Default, Debug, Clone)]
pub struct SemanticTokens {
    /// Each value is:
    /// data[5*i] - deltaLine: token line number, relative to the start of the previous token
@SomeoneToIgnore (Contributor):

Feels worth a few unit tests around this logic (maybe on a level above, but somewhere we do have to test deltas and other cases, e.g. the one described in the spec?)
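
For reference, such a unit test could exercise the decoding directly against the worked example in the LSP 3.17 spec. A self-contained sketch (with a hypothetical Token type, not the PR's):

/// A decoded token position (line/start in UTF-16 code units by default).
#[derive(Debug, PartialEq)]
struct Token {
    line: u32,
    start: u32,
    length: u32,
    token_type: u32,
    modifiers: u32,
}

/// Decode the flat `data` array of a semanticTokens/full response: each token
/// is five integers, with line and start delta-encoded against the previous
/// token (start offsets restart from 0 whenever the line advances).
fn decode(data: &[u32]) -> Vec<Token> {
    let mut tokens = Vec::with_capacity(data.len() / 5);
    let (mut line, mut start) = (0u32, 0u32);
    for t in data.chunks_exact(5) {
        line += t[0];
        if t[0] > 0 {
            start = 0;
        }
        start += t[1];
        tokens.push(Token {
            line,
            start,
            length: t[2],
            token_type: t[3],
            modifiers: t[4],
        });
    }
    tokens
}

#[test]
fn decodes_the_spec_example() {
    // The LSP 3.17 spec encodes tokens at (2,5), (2,10), (5,2) as this array.
    let data = [2, 5, 3, 0, 3, 0, 5, 4, 1, 0, 3, 2, 7, 2, 0];
    let tokens = decode(&data);
    assert_eq!((tokens[0].line, tokens[0].start), (2, 5));
    assert_eq!((tokens[1].line, tokens[1].start), (2, 10));
    assert_eq!((tokens[2].line, tokens[2].start), (5, 2));
}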

@SomeoneToIgnore (Contributor):

Thank you for adding tests, both on the LSP part and the "unit" tests on the logic from this module.

Yet I have failed to find any "unit" tests on applying delta responses on top of a previous result, and the spec seems to describe such an example?

That looked somewhat complex to me, and if I'm not confusing things, it would be great to add another test.
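
The delta application itself is small once indices are handled carefully: edits index into the flat u32 array, not into token counts. A standalone sketch of such a test, with a hypothetical TokensEdit type mirroring the spec's SemanticTokensEdit:

/// One splice against the previously cached flat data array.
struct TokensEdit {
    start: usize,
    delete_count: usize,
    data: Vec<u32>,
}

/// Apply a semanticTokens/full/delta response to the cached data; edits are
/// applied back-to-front so that earlier indices stay valid.
fn apply_delta(cached: &mut Vec<u32>, mut edits: Vec<TokensEdit>) {
    edits.sort_by(|a, b| b.start.cmp(&a.start));
    for edit in edits {
        cached.splice(edit.start..edit.start + edit.delete_count, edit.data);
    }
}

#[test]
fn applies_the_spec_delta_example() {
    // Spec example: inserting a line above shifts only the first deltaLine (2 -> 3).
    let mut cached = vec![2, 5, 3, 0, 3, 0, 5, 4, 1, 0, 3, 2, 7, 2, 0];
    let edits = vec![TokensEdit { start: 0, delete_count: 1, data: vec![3] }];
    apply_delta(&mut cached, edits);
    assert_eq!(cached, vec![3, 5, 3, 0, 3, 0, 5, 4, 1, 0, 3, 2, 7, 2, 0]);
}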


#[derive(Debug, Default)]
struct SemanticTokensData {
    result_id: Option<String>,
@SomeoneToIgnore (Contributor):

It's a bit odd to see this optional: we do know which server we sent the request to and got the response from.

@macmv (Author):

Yup, agreed. This was just so I could #[derive(Default)]. Now that LspData exists, it should be easy to get rid of this.

@macmv (Author):

Oh, sorry, I got this mixed up with server_id. The result_id does need to be optional: it's an optional field in the spec. If a server doesn't support deltas, it won't send back result ids. But I have a similar server_id field in the SemanticTokens struct, which I will remove the optional from.

})
} else {
    let lsp_request_task =
        self.request_lsp(buffer, LanguageServerToQuery::FirstCapable, request, cx);
@SomeoneToIgnore (Contributor):

So good to see full and delta support, thank you.

This is somewhat debatable though: in the world of frontend, a single *.ts file may have various language servers associated with it (tailwind-language-server, typescript-language-server, eslint, etc.), and it's reasonable to expect that any of them may serve additional tokens too.

I would repeat what current main does for inlay hints.

@macmv (Author) commented Oct 26, 2025

Thank you so much for the detailed review! It'll take me a few days to rebase/update everything, and this looks like a bunch of interesting stuff to read up on.

I'm aware of the missing user-facing configs, and I think I'll try to tackle that while I'm updating these bits. This was mostly meant as a POC, as I was still just interested to see if this was possible. Being able to customize language-server-specific tokens looks super useful.

I'd also definitely like to omit the proto part for now. This PR is getting quite long. Should I keep what I have already? Or just cut out the new protobufs entirely?

And finally, do you prefer rebases or merges?

@SomeoneToIgnore (Contributor) left a comment

It took me eons to get here, no rush (albeit now I'll try to keep up with the pace and review further changes in a timely manner).

Or just cut out the new protobufs entirely?

I'm not sure how possible this is, given that we'll have to use the Lsp* infra?
We have to use LspQuery for properly made local requests, and that drags in some proto types one way or another.
I think we could literally copy-paste document colors/inlays here: the latter is a bit more complex as it involves some range handling, and the former is more or less the same, modulo the LSP data being different.

I do not mind removing the layer/ignoring it until the very end if you find a way — in the end, we'll have to get a collab/src/tests/editor_tests.rs entry for remote functionality testing, which I can help with.

do you prefer rebases or merges

I think merges are more realistic given that this branch will be here for a while, but no preferences really — I prefer to concentrate on the feature itself and its semantics more.


// [2]: https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#semanticTokenTypes
let choices: &[&str] = match self.token_type(token_type)? {
    // Types
    "namespace" => &["namespace", "module", "type"],
@SomeoneToIgnore (Contributor):

I guess I lack the context here — there's definitely some configuration that has to happen (as I've posted that VSCode token settings snippet for rust-analyzer), so I'm not opposed to the idea overall.


fn semantic_tokens(
    &self,
    buffer_handle: Entity<Buffer>,
@SomeoneToIgnore (Contributor):

Deltas only work on the entire buffer. So, if the server supports deltas, range support is basically never useful (as ranges cannot be diffed).

Did not know that, thank you: that complicates things even more, technically, as multi buffers come into play.

It's quite sad if we do some project search that hits many large files — then, when we scroll their excerpts into view, we will have to load the entire file to show a small (<20 lines, usually) excerpt per hit.

Also, the "what if some slow server starts to choke on large files" concern is still here.


Given such a strong delta separation though, I think it's fine to move on now and concentrate on full files + deltas; this is a big change alone.

This also simplifies all caching/metadata quite a lot, given that we have no ranges to deal with.

But then, let's remove the range: Some(true), from our LSP capabilities

cc @Veykril as this is quite a pivoting point then, and you know the spec shenanigans here better

@SomeoneToIgnore (Contributor) commented

Ok, given #39539 (comment) I think we're getting quite close:

  • let's ignore all range-related shenanigans that inlays do, same as all range-related semantic tokens queries
  • and, instead, concentrate on what document highlights do

The LspStore part and caching are also done for those, and the update strategies seem similar to what document colors do.

@macmv (Author) commented Oct 31, 2025

Alrighty, that sounds good. I'll polish this up and update this soon.

@macmv (Author) commented Nov 1, 2025

I've posted a new version. This debounces updates in LspStore and aggregates semantic tokens from multiple servers correctly. This largely copies the behavior of document colors. Notably, when requesting from multiple language servers, semantic tokens are only shown once all servers reply. For each server, it decides whether to send a full or a delta request accordingly.

I added some more tests:

  • One test has a multibuffer that shows the same buffer twice. This test shows that the deduplication in lsp store is definitely needed, and it works as expected.
  • One test has a buffer with two language servers attached, and this tests aggregating tokens from both servers. The data structures are in place to make this deterministic (there's an IndexMap<LanguageServerId, BufferSemanticTokens>; see the sketch after this list), but right now the server that takes priority is the one that responds slowest (I think?).
  • One test has a multibuffer with two different languages and language servers attached to each. This is mostly just combining everything and seeing what happens.
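
A rough sketch of that aggregation shape, with hypothetical stand-ins for the PR's types; sorting by server id after each insertion is one way to make the priority deterministic instead of last-responder-wins:

use indexmap::IndexMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
struct LanguageServerId(u64);

#[derive(Debug, Default)]
struct BufferSemanticTokens; // decoded tokens, result_id, version, ...

#[derive(Default)]
struct AggregatedTokens {
    by_server: IndexMap<LanguageServerId, BufferSemanticTokens>,
}

impl AggregatedTokens {
    /// Record one server's response, then sort by server id so iteration
    /// order (and thus which server "wins" on overlaps) does not depend on
    /// response timing.
    fn insert(&mut self, server: LanguageServerId, tokens: BufferSemanticTokens) {
        self.by_server.insert(server, tokens);
        self.by_server.sort_keys();
    }
}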

I still need the update_semantic_tokens task in the editor so that I can aggregate all the buffer changes for multibuffers (as LspStore is only concerned with individual buffers). This task is not debounced at all, as it relies on the lsp store debouncing.

Additionally, latest_lsp_data tripped me up a bit, as it clears the previous lsp data every time the buffer changes. I changed this to keep semantic tokens around when updating the BufferLspData, but I'm not sure if that makes sense.

And next, I'm going to work on configs. I haven't started that yet.

@reflectronic (Member) commented

Just leaving my two cents: I hope that we are not locking ourselves out of implementing textDocument/semanticTokens/range support in the future. In particular, I know that the C# language server does not support the delta method at this time (dotnet/roslyn#70536). My understanding is that delta performs poorly with large files, because it still requires computing semantic analysis for the entire file on each edit. The range request allows the computations to be limited to the visible section of the document (e.g. if semantic analysis is implemented in a lazy manner, like it is in the C# compiler).

@SomeoneToIgnore (Contributor) left a comment

Superb, the new tests and most of the LspStore-related changes look solid; we're definitely on the right track.

One small concern is the text.rs-related changes; it feels like we can inline all of this into the highlights-related code?

Additionally, latest_lsp_data tripped me up a bit, as it clears the previous lsp data every time the buffer changes. I changed this to keep semantic tokens around when updating the BufferLspData, but I'm not sure if that makes sense.

This seems somewhat odd; I've commented on the potentially hacky place to discuss it more.
Just to note, there's current_lsp_data to look up what's stored inside without invalidating, so that should help with whatever issues remain.

And next, I'm going to work on configs. I haven't started that yet.

One small note here (given the C# language server comment above): maybe we should disable semantic highlights by default until we support range requests.
If we add the docs and that editor menu item I've mentioned before, that should be discoverable enough for the time being.

};

let buffer_id = buffer.read(cx).remote_id();
// `semantic_tokens` gets debounced, so we can call this frequently.
@SomeoneToIgnore (Contributor):

I have descended into lsp_store.rs and did not notice any new .timer( call; can you point me to the debounce?

@macmv (Author):

Oh, yeah, I didn't add the .timer() call. I'll put it in the LSP store.
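
For illustration, the debounce shape in question replaces any pending task on each call, so only the last edit in a burst actually issues a request. A standalone sketch using tokio (Zed's LspStore would use a GPUI Task and executor timers instead; names here are assumptions):

use std::time::Duration;

struct Debouncer {
    pending: Option<tokio::task::JoinHandle<()>>,
    delay: Duration,
}

impl Debouncer {
    /// Schedule `work` after `delay`, cancelling any previously pending run.
    fn call(&mut self, work: impl FnOnce() + Send + 'static) {
        if let Some(task) = self.pending.take() {
            task.abort(); // the earlier edit's request never fires
        }
        let delay = self.delay;
        self.pending = Some(tokio::spawn(async move {
            tokio::time::sleep(delay).await;
            work();
        }));
    }
}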


false
}

pub fn offset_to_version(&self, offset: usize, version: &clock::Global) -> usize {
@SomeoneToIgnore (Contributor):

This one is not really used anywhere but in tests?

range_from_version is used only once in the highlights-related places; can we have a function there instead of adding any public API here?

Those .unwrap() calls look somewhat concerning too — let's use Option and the like instead.

@macmv (Author):

Yeah sure thing, I'll move it. I think it started out needing private APIs from BufferSnapshot, but it doesn't now. I'll move it up to where highlights are built.

    .entry(buffer_id)
    .or_insert_with(|| BufferLspData::new(buffer, cx));
if buffer_version.changed_since(&lsp_data.buffer_version) {
    // Semantic tokens survive buffer version changes.
@SomeoneToIgnore (Contributor):

That seems odd and somewhat hacky: let's discuss things more.

Presumably, all this is needed to avoid flickering?
If so, can we do things similarly to inlays/document colors, where we invalidate data & tasks in LspStore as soon as a version mismatch is found, yet keep the old data in the Editor for rendering and elsewhere, and only update it after the next fetch task gets back the results?

Or is it so that we're able to get proper deltas for changed documents?
Then it's not hacky at all, just not obvious — let's mention that specifically instead.

@macmv (Author):

Yes, this is for deltas to work at all. It uses the result_id of the semantic tokens to send the delta request. Even if I only use current_lsp_data, other calls to latest_lsp_data (like in document colors) remove the previous result_id, and we don't end up sending delta requests.

I'll specifically call that out in a comment.
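
For illustration, the bookkeeping being discussed looks roughly like this (hypothetical names, not the PR's types): the cached result_id has to survive buffer edits, otherwise every edit degrades the next request from a delta to a full fetch.

struct SemanticTokensCache {
    /// `result_id` from the last full or delta response; the server diffs
    /// against this when we send semanticTokens/full/delta.
    result_id: Option<String>,
    /// The flat token data that the next delta will be applied to.
    data: Vec<u32>,
}

enum TokensRequest {
    Full,
    Delta { previous_result_id: String },
}

impl SemanticTokensCache {
    fn next_request(&self) -> TokensRequest {
        match &self.result_id {
            Some(id) => TokensRequest::Delta { previous_result_id: id.clone() },
            // First request, or the server does not support deltas.
            None => TokensRequest::Full,
        }
    }
}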

@macmv (Author) commented Nov 5, 2025

not locking ourselves out of implementing textDocument/semanticTokens/range support in the future

I don't think this locks us out of supporting range requests, but for servers that don't support deltas, I feel like this requires an entirely different approach. I'd imagine we would want an editor to query for specific ranges, as the scroll position of the editor changes the tokens it sees. That also means that as the editor scrolls, it would need to request additional ranges and then merge the tokens between multiple queries for rendering. And then, when the editor is modified, all the tokens would just get discarded. I think there is very little overlap between the range logic and the delta logic that I've implemented.

Oh, and I think minimap rendering would need to be changed. If the minimap is open, requesting "only the visible range" ends up being a large portion of the file anyways.

I'm happy to disable semantic tokens until it's more complete. I don't want to bring language servers grinding to a halt with a new update. Alternatively, we could disable semantic tokens by default on servers where this would be a problem, and leave it enabled for other servers by default.

@macmv (Author) commented Nov 5, 2025

I added some configuration. I couldn't find a toggle for tree-sitter highlights, so I just added a semantic_tokens: bool to the language configuration. Then, within an editor, semantic token requests will only be made for languages that have the flag enabled. It defaults to false.

Additionally, in the editor controls, I added a toggle for semantic tokens within that editor:
[screenshot]

If that toggle is disabled, no semantic tokens will be rendered. If it is enabled, then semantic tokens will be rendered for any languages that have the semantic_tokens setting enabled.

I added a similar toggle to the debug syntax tree:
[screenshot]

This one looks a little bare, being the only setting, but I figured a toggle button would look odd.

Finally, I wired up refreshes to all the appropriate events, so now semantic tokens show up right when you load the editor.

@SomeoneToIgnore (Contributor) commented

I did not dive into the code too much since the last review, but in general this looks very close to a good first iteration of the feature, nice!

To double check: is token configuration like #42030 (and the one I've mentioned earlier, related to ra) something that will be a part of the configuration-related changes?

If not, what's the hard part there, given that we seem to have all the hardcoded values listed?

@macmv (Author) commented Nov 5, 2025

I wasn't planning on adding that, but then it turned out to be easier than I thought. I just got it working with hardcoded colors (i.e., you can specify an RGB color in the config). It works quite nicely with the example you posted:

[screenshot]

The above is with the following config:

  "lsp_semantic_token_customizations": {
    "enabled": true,
    "rules": [
      {
        "token_type": "unresolvedReference",
        "foreground_color": "#c93f3f"
      }
    ]
  },

Each rule has the following properties:

  • token_type: The LSP semantic token type to customize.
  • foreground_color: The foreground color to use for the token type.
  • background_color: The background color to use for the token type.
  • underline: A color to underline the type with.
  • font_style: One of "normal", "bold", "italic", or "bold_italic".

I think I'll add a token_modifiers item to each rule, so you can match on those as well. But this doesn't hook into the current highlighting theme. I suppose I could add a theme key into those rules, which would be used when colors aren't specified? I'm not quite sure what this should look like.
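
A minimal sketch of how such a rule might resolve, with hypothetical types standing in for the PR's settings structs (Zed's real HighlightStyle lives in gpui and differs in detail):

// Mirrors the rule shape described above (all names are assumptions).
struct SemanticTokenRule {
    token_type: String,
    foreground_color: Option<String>, // e.g. "#c93f3f"
    background_color: Option<String>,
    underline: Option<String>,
    font_style: Option<FontStyle>,
}

#[derive(Clone, Copy)]
enum FontStyle {
    Normal,
    Bold,
    Italic,
    BoldItalic,
}

/// The PR matches token names as complete strings (unlike VSCode's pattern
/// matching), so rule lookup is a plain equality scan in rule order.
fn rule_for<'a>(
    rules: &'a [SemanticTokenRule],
    token_type: &str,
) -> Option<&'a SemanticTokenRule> {
    rules.iter().find(|rule| rule.token_type == token_type)
}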

@SomeoneToIgnore (Contributor) commented

Wow, that's quite the first step for the tokens feature, I'd say.

I wonder if lsp_semantic_token_customizations could be semantic_tokens under a "global_lsp_settings" key?
Otherwise, the proposal looks good, same as the custom theme key; as an alternative (not insisting), we could use null as a special "nothing, fall back to theme" value?

I would also double-check what VSCode does: we have the vscode_import.rs module, where it should hopefully not be too hard to do the same for the tokens' settings.
We do not have to support the whole set of those (I would be curious to know the discrepancies, though), just what we have already.

The best part would be to document this in both the JSON and Markdown docs — I can help with that if needed, same as with the RPC part.
If any tests suddenly start to fail, I can help there too: I think it should be straightforward, except for maybe needing to write

cx.executor().advance_clock(Duration::from_secs(LSP_QUERY_TIMEOUT));
cx.run_until_parked();

in places to ensure the tokens are queried for.

I'll wait a week or so, and also for @Veykril's review, as we're getting close and it would be great to see if we've missed anything and to confirm the direction.

@macmv (Author) commented Nov 5, 2025

Thanks! I'm pretty happy with it.

Oh, yes, that's a better name.

I don't quite understand the null fallback for theme. I was suggesting adding a theme key to each rule, which would be something like ["symbol", "type.parameter"]. Each of those keys would get passed to SyntaxTheme::get_opt, and the first matching HighlightStyle is what gets used. Maybe that should be called style_key or something?

Locally, I've got VSCode imports wired up for editor.semanticTokenColorCustomizations. That doesn't cover all the builtin tokens though, as I think we'd like some reasonable defaults. I think the rule matching in VSCode is done with a regex, whereas I just opted to match token names as complete strings. Aside from that, the rules I've defined support more features than VSCode's. For example, these rules allow changing the underline and strikethrough colors.

To make this easy to use, can we just make setting the semantic_tokens field extend the rules from default.json? I worry that we need some sort of default mapping from semantic tokens to highlight themes, with some ways for users to change it. But I don't want users to need to copy that entire mapping over to modify it.

Oh, that's good to know. I think the only thing that broke is that the editor now fetches semantic tokens when it is first created, so I had to change the tests slightly.

Yup, I'll add it to the markdown docs. I just forgot to add those.

@SomeoneToIgnore (Contributor) commented

Sorry, I have no good knowledge of themes.
Overall, whatever allows minimal changes for both users and developers seems great, so the approach seems rightly aligned.
