@Mr0grog Mr0grog commented Jun 17, 2025

I realized yesterday that we are probably not surfacing some important changes to Spanish pages (we now track a handful of them) because none of our key terms have Spanish versions. This adds a few (we should probably just translate them all).

Doing so revealed that we were not handling accents reliably, so I’ve also added code to:

  1. Always compare a Unicode normalized form (NFKC in this case).
  2. Remove accents from the text for comparison purposes. I think this is probably the right thing (much looser matching), but we could drop this part. It’s not as critical as the Unicode normalization.
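
For anyone reading along, here is a rough sketch of what those two steps can look like in Python. It is not the code from this PR: the helper name `comparable_text` is made up for illustration, and going through NFKD to strip combining marks before recomposing to NFKC is just one common way to do it, assumed here.

```python
import unicodedata


def comparable_text(text: str) -> str:
    """Return a loosened form of `text` intended only for comparisons.

    Hypothetical helper for illustration; not the function added in this PR.
    """
    # NFKD splits accented letters into a base letter plus combining marks
    # and also expands compatibility characters (ligatures, full-width forms).
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks (characters with a non-zero combining class).
    without_marks = "".join(
        ch for ch in decomposed if unicodedata.combining(ch) == 0
    )
    # Recompose so the result ends up in NFKC, the form chosen for comparisons.
    return unicodedata.normalize("NFKC", without_marks)


# The same word written with a precomposed "ó" and with "o" + combining accent.
precomposed = "contaminaci\u00f3n"
decomposed = "contaminacio\u0301n"

assert precomposed != decomposed
assert comparable_text(precomposed) == comparable_text(decomposed)
```

One tradeoff of the looser matching: stripping marks this way also turns "ñ" into "n", so the output is only suitable for comparison, never for display.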

Mr0grog added 2 commits June 17, 2025 10:23
We should always have been doing this, but the problem is a lot more obvious when it comes to Spanish text. We at least need to choose a specific Unicode normalization for comparisons (I've gone with NFKC, but NFKD could also work; I think it needs to be NFK-something, though). Additionally, I've gone ahead and simply removed accents altogether here, which I think is probably good for comparisons (obviously bad if we were ever going to display the resulting text).
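
One possible reason to prefer an "NFK" form, sketched with Python's standard `unicodedata` module: the canonical forms (NFC/NFD) leave compatibility characters such as ligatures and full-width letters untouched, while the compatibility forms fold them into their plain equivalents. The `normal_forms` helper and the sample strings are illustrative assumptions, not taken from this PR.

```python
import unicodedata


def normal_forms(text: str) -> dict[str, str]:
    """Return `text` in each of the four Unicode normalization forms."""
    return {
        form: unicodedata.normalize(form, text)
        for form in ("NFC", "NFD", "NFKC", "NFKD")
    }


# "fin" written with U+FB01 LATIN SMALL LIGATURE FI instead of "f" + "i".
ligature = normal_forms("\ufb01n")
assert ligature["NFC"] == "\ufb01n"   # canonical form keeps the ligature
assert ligature["NFKC"] == "fin"      # compatibility form expands it
assert ligature["NFKD"] == "fin"

# U+FF21 FULLWIDTH LATIN CAPITAL LETTER A.
full_width = normal_forms("\uff21")
assert full_width["NFC"] == "\uff21"
assert full_width["NFKC"] == "A"
```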
@Mr0grog Mr0grog merged commit 8e1ff79 into main Jun 17, 2025
3 checks passed
@Mr0grog Mr0grog deleted the spanish-key-terms-are-not-english branch June 17, 2025 17:34
@github-project-automation github-project-automation bot moved this from Inbox to Done in Web Monitoring Jun 17, 2025
