🦋 Changeset detected
Latest commit: 126b1ff
The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Related Documentation
2 document(s) may need updating based on files changed in this PR:
Read Frog - Open Source Immersive Translate
Auto Translation Features
@@ -202,7 +202,6 @@
This approach ensures that translation works for all iframe content, including complex sites like edX, embedded widgets, and dynamically inserted frames.
### How Shadow DOM translation and style injection works
-
Read Frog now fully supports translating and styling content inside Shadow DOMs, such as those used by custom web components. Previously, translation styles were only injected at the document level, which meant translated content inside Shadow Roots was not styled correctly due to style encapsulation.
**Key features of Shadow DOM support:**
@@ -229,7 +228,7 @@
- Exclusion of code listings and custom elements from translation traversal.
- Exclusion of Reddit-specific UI/accessibility elements.
- **Exclusion of decorative and non-content elements (SVG, SCRIPT, STYLE, etc.):** Elements that are not meant to be translated, such as SVG graphics, script tags, and style tags, are now filtered out during translation traversal. This prevents visual artifacts (such as stray dots) and ensures only meaningful content is translated.
-- **Exclusion of elements with `aria-hidden="true"`:** Elements marked as aria-hidden are now skipped during translation traversal. This prevents hidden or accessibility-only elements from affecting block/inline determination and ensures only visible content is translated. This fix resolves layout issues on sites like Twitter where aria-hidden elements previously caused incorrect translation grouping.
+- **Exclusion of elements with `aria-hidden="true"`, `sr-only`, or `visually-hidden` classes:** Elements marked as aria-hidden or with classes commonly used to visually hide content (such as `sr-only` and `visually-hidden`) are now skipped during translation traversal. This prevents hidden or accessibility-only elements from affecting block/inline determination and ensures only visible content is translated. This fix resolves layout issues on sites like Twitter where aria-hidden elements previously caused incorrect translation grouping, and also prevents translation of screen-reader-only or visually hidden text.
- **Exclusion of YouTube-specific UI elements and metadata:** Navigation bars, masthead, guide, metadata panels, channel name, comments header, reply/more buttons, badges, and other non-content selectors on www.youtube.com are now excluded from translation. This ensures only real content is translated and prevents broken labels. The following selectors are used for exclusion:
- `#masthead-container *`, `#guide-inner-content *`, `#metadata *`, `#channel-name`, `.translate-button`, `.yt-lockup-metadata-view-model__metadata`, `.yt-spec-avatar-shape__badge-text`, `.shortsLockupViewModelHostOutsideMetadataSubhead`, `ytd-comments-header-renderer`, `#top-row`, `#header-author`, `#reply-button-end`, `#more-replies`, `#info`, `#badges *`
- Only relevant content is translated, not entire ancestor nodes.
Translation Toggle Logic and Content Detection
@@ -12,6 +12,9 @@
To find the wrapper for a translated node, `findPreviousTranslatedWrapper(node: Element | Text, walkId: string)` checks if the node itself is a wrapper (with a different walkId) or looks for a wrapper as a child that doesn't match the current walkId. The wrapper element always includes a `data-read-frog-translation-mode` attribute indicating the mode (`bilingual` or `translationOnly`) and a `data-read-frog-walked` attribute for walk tracking.
The system uses additional DOM attributes and classes to label nodes during traversal, such as `WALKED_ATTRIBUTE`, `BLOCK_ATTRIBUTE`, `INLINE_ATTRIBUTE`, and `PARAGRAPH_ATTRIBUTE`, which help distinguish between block and inline nodes and manage translation state.
+
+**Exclusion of Hidden Elements:**
+Elements that are visually hidden or marked as not intended for user visibility are now explicitly excluded from translation and text extraction. This includes elements with the `sr-only` or `visually-hidden` classes, elements with `aria-hidden="true"`, elements with `display: none` or `visibility: hidden` styles, and certain tags like `<script>`. These elements are detected using the `isDontWalkIntoAndDontTranslateAsChildElement` function and are skipped during both translation and text extraction. This ensures that only visible, user-facing content is translated.
### Differentiation Between Original and Translated Text
- **Bilingual mode**: Original text remains in the DOM. Translated text is wrapped in a `<span>` element with the `NOTRANSLATE_CLASS` and `CONTENT_WRAPPER_CLASS` classes, and the `data-read-frog-translation-mode="bilingual"` attribute. Inside this wrapper, the translated content itself is further classified as inline or block using `INLINE_CONTENT_CLASS` or `BLOCK_CONTENT_CLASS`.
@@ -43,6 +46,9 @@
### Test Coverage
Unit and integration tests cover the detection logic for translated content nodes and wrappers, as well as the new translation-only mode and complex DOM structures. The `isTranslatedContentNode` function is tested to ensure it returns true for elements with the correct classes and false otherwise, including for text nodes. The `findPreviousTranslatedWrapper` function is tested to verify it finds the correct wrapper for translated nodes, returns null for non-translated content, and correctly traverses multiple parent levels. The translation-only mode is tested to ensure that original content is restored correctly when toggled off, and that the wrapper and attributes are set as expected. Integration tests now cover mixed inline/block/nested structures and switching between translation modes. Numeric content exclusion and `<pre>` tag exclusion are also covered to ensure numbers and preformatted/code blocks are not translated or wrapped.
+**Hidden Element Exclusion Tests:**
+Additional unit tests verify that elements with the `sr-only` or `visually-hidden` classes, elements with `aria-hidden="true"`, and similar hidden elements are excluded from translation and text extraction. This ensures that only visible content is processed, preventing hidden or accessibility-only content from being translated or included in extracted text.
+
### User Experience Improvements
A translation mode selector is now available in the popup UI, allowing users to easily switch between bilingual and translation-only modes. The selector uses localized labels and provides tooltips for additional guidance. If a user selects a mode that is not supported by the current provider, the system automatically switches to a compatible provider, ensuring seamless operation and reducing the likelihood of errors. This smart fallback mechanism improves reliability and user confidence in the translation feature.
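The hidden-element exclusion described in the suggested documentation changes above can be illustrated with a small, browser-free sketch. The `SketchNode` shape, the contents of `SKIPPED_TAGS`, and the names `isHidden` and `extractVisibleText` are assumptions made for illustration only. The extension itself works on real DOM nodes and uses `isDontWalkIntoAndDontTranslateAsChildElement`, whose implementation is not reproduced here.

```typescript
// Hypothetical node shape for illustration; the real code walks DOM Elements and Text nodes.
interface SketchNode {
  tag?: string            // undefined for text nodes
  text?: string           // only set on text nodes
  classes?: string[]
  ariaHidden?: boolean
  children?: SketchNode[]
}

// Class names taken from the PR description; the tag list is an assumed subset.
const HIDDEN_CLASSES = new Set(['sr-only', 'visually-hidden'])
const SKIPPED_TAGS = new Set(['SCRIPT', 'STYLE'])

// True when the node, and therefore its whole subtree, should be ignored.
function isHidden(node: SketchNode): boolean {
  if (node.tag !== undefined && SKIPPED_TAGS.has(node.tag)) return true
  if (node.ariaHidden === true) return true
  return (node.classes ?? []).some(cls => HIDDEN_CLASSES.has(cls))
}

// Collects trimmed, non-empty text in document order, skipping hidden subtrees.
function extractVisibleText(node: SketchNode): string[] {
  if (isHidden(node)) return []
  const out: string[] = []
  if (node.text !== undefined && node.text.trim() !== '') out.push(node.text.trim())
  for (const child of node.children ?? []) out.push(...extractVisibleText(child))
  return out
}

// Usage: only the visible paragraph text survives.
const page: SketchNode = {
  tag: 'DIV',
  children: [
    { tag: 'P', children: [{ text: 'Hello' }] },
    { tag: 'SPAN', classes: ['sr-only'], children: [{ text: 'Skip to content' }] },
  ],
}
console.log(extractVisibleText(page)) // → [ 'Hello' ]
```

Returning early for a hidden node drops its entire subtree, which matches the stated goal that hidden content is excluded from both translation and text extraction.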
PR Review: fix(extension): fix hidden elements being translated
Summary
This PR correctly moves
The logic change is sound:
Code Quality
Good:
Suggestions:
Potential Bugs / Issues
Security
No security concerns. The changes are purely about DOM filtering logic.
Test Coverage
Verdict
Clean, well-scoped fix. The core logic change (moving sr-only from "translate as child" to "don't translate") is correct. The performance suggestions (caching
LGTM with minor suggestions.
PR Review: fix(extension): fix hidden elements being translated
Overview
Moves
Fixes #927 (OpenAI blog hidden text being translated).
Code Quality
Strengths:
Issues & Suggestions
1. Redundant
const dontWalkCSS
  = window.getComputedStyle(element).display === 'none'
  || window.getComputedStyle(element).visibility === 'hidden'
const style = window.getComputedStyle(element)
const dontWalkCSS = style.display === 'none' || style.visibility === 'hidden'
This matters because
2. Eager evaluation defeats short-circuit optimization (
All six checks are evaluated upfront before the
if (DONT_WALK_AND_TRANSLATE_TAGS.has(element.tagName)) return true
if (element.getAttribute('aria-hidden') === 'true') return true
if (['sr-only', 'visually-hidden'].some(cls => element.classList.contains(cls))) return true
// ... then the expensive checks
Not a blocker, but a meaningful perf improvement for a hot path.
3. Class-name-based detection is inherently fragile
This is a known limitation, not a blocker. A brief code comment noting the tradeoff would be helpful for future contributors.
4. Missing test edge cases
Security
No concerns. Changes are purely DOM filtering logic with no external input handling.
Verdict
The core logic change is correct and well-scoped. The suggestions above (caching
LGTM with minor suggestions.
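Suggestions 1 and 2 of the review above can be combined into one function: run the cheap tag, attribute, and class checks first with early returns, and only then perform a single computed-style lookup whose result is reused. The following is a minimal sketch under stated assumptions, not the PR's actual code. `ElementLike`, `StyleLike`, `shouldSkipElement`, the injected `getStyle` parameter, and the contents of `DONT_WALK_AND_TRANSLATE_TAGS` are hypothetical; in a browser, `getStyle` would wrap `window.getComputedStyle`.

```typescript
// Hypothetical structural types so the sketch runs without a browser.
interface StyleLike { display: string, visibility: string }
interface ElementLike {
  tagName: string
  getAttribute(name: string): string | null
  classList: { contains(cls: string): boolean }
}

// Assumed contents; the review only shows the constant's name.
const DONT_WALK_AND_TRANSLATE_TAGS = new Set(['SCRIPT', 'STYLE'])
const HIDDEN_CLASS_NAMES = ['sr-only', 'visually-hidden']

function shouldSkipElement(
  element: ElementLike,
  getStyle: (el: ElementLike) => StyleLike, // stand-in for window.getComputedStyle
): boolean {
  // Cheap checks first: each early return avoids the style lookup entirely.
  if (DONT_WALK_AND_TRANSLATE_TAGS.has(element.tagName)) return true
  if (element.getAttribute('aria-hidden') === 'true') return true
  if (HIDDEN_CLASS_NAMES.some(cls => element.classList.contains(cls))) return true

  // Expensive check last: one lookup, reused for both properties.
  const style = getStyle(element)
  return style.display === 'none' || style.visibility === 'hidden'
}
```

Passing the style lookup in as a parameter is a choice made for this sketch: it keeps the function testable outside a browser and makes it easy to count how often the expensive call actually happens.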
Type of Changes
Description
Related Issue
Closes #927
How Has This Been Tested?
Screenshots
Checklist
Additional Information
Summary by cubic
Prevents the extension from translating hidden content. Screen-reader-only and visually hidden text is now excluded from traversal, translation, and text extraction.
Written for commit 126b1ff. Summary will update on new commits.