-
Notifications
You must be signed in to change notification settings - Fork 283
Cache key_value? lookups with pre-built flat Set per locale #709
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Replace per-call tree traversal in key_value? with a lazily-built Set of all node keys per locale. Invalidates on data.reload. Reduces `i18n-tasks missing` runtime by ~19% on large projects (145s → 118s with 31 locales, 91 locale files).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR optimizes missing-keys computation by replacing repeated per-call tree lookups in key_value? with a lazily-built, per-locale Set of keys, and introduces cache invalidation tied to data.reload.
Changes:
- Wrap the data adapter’s
reloadto invalidate a newkey_value?cache. - Implement
key_value?as an O(1) membership check against a cached per-localeSet. - Add cache builder (
build_key_value_set) and invalidation helper (invalidate_key_value_cache!).
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def build_key_value_set(locale) | ||
| keys = Set.new | ||
| data[locale].nodes { |node| keys << node.full_key(root: false) unless node.key.nil? } | ||
| keys |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
key_value? now checks membership in a Set of node keys, which changes semantics for leaf nodes with value == nil. Previously key_value? returned false in that case because t(key, locale) returns nil when the stored translation value is nil (see Node#value_or_children_hash). With the current implementation, those keys will be treated as present, which can hide missing translations. Consider only adding keys for non-leaf nodes, or for leaf nodes where node.value is not nil (i.e., mimic t(...).nil? behavior).
| def key_value?(key, locale = base_locale) | ||
| !t(key, locale).nil? | ||
| @key_value_cache ||= {} | ||
| locale_cache = (@key_value_cache[locale] ||= build_key_value_set(locale)) | ||
| locale_cache.include?(key) |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
key_value? previously accepted Symbol keys (they were coerced via Siblings#get(full_key.to_s)), but locale_cache.include?(key) will not match because the Set is populated with Strings. To preserve existing behavior, coerce key to a String before lookup (and ideally do the same normalization when building the set).
| # whether the value for key exists in locale (defaults: base_locale) | ||
| def key_value?(key, locale = base_locale) | ||
| !t(key, locale).nil? | ||
| @key_value_cache ||= {} |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cache is keyed by locale as passed in. Elsewhere the data layer normalizes locales to strings (FileSystemBase#get(locale) calls locale.to_s), so calling key_value? with both :en and "en" will build duplicate caches and waste memory. Consider normalizing locale = locale.to_s before indexing @key_value_cache and before calling build_key_value_set.
| @key_value_cache ||= {} | |
| @key_value_cache ||= {} | |
| locale = locale.to_s |
| adapter.define_singleton_method(:reload) do | ||
| task.invalidate_key_value_cache! | ||
| super() |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrapping adapter.reload via define_singleton_method and calling super() assumes every configured data adapter implements reload with a compatible signature. Since data.adapter is configurable, this can raise NoMethodError (no reload) or break if an adapter defines reload(*args) later. Consider guarding with respond_to?(:reload) and forwarding args/kwargs/block (super(*args, **kwargs, &block)), or wrapping via Module#prepend to avoid signature mismatches.
| adapter.define_singleton_method(:reload) do | |
| task.invalidate_key_value_cache! | |
| super() | |
| if adapter.respond_to?(:reload) | |
| adapter.define_singleton_method(:reload) do |*args, **kwargs, &block| | |
| task.invalidate_key_value_cache! | |
| super(*args, **kwargs, &block) | |
| end |
Summary
key_value?with a lazily-builtSetof all node keys per localedata.reloadi18n-tasks missingruntime by ~19% on large projects (145s → 118s with 31 locales, 91 locale files)Problem
key_value?is called thousands of times per locale pair duringmissing_keyscomputation. Each call does a full tree lookup viat(key, locale)→data.t→Siblings#get→ recursivesplit_keynavigation. For projects with many locales, this dominates Phase 4 (compute missing keys) which accounts for ~73% of total runtime.Solution
Pre-build a flat
Set<String>of all node keys per locale on firstkey_value?call. Subsequent lookups are O(1) hash membership checks instead of O(tree_depth) traversals. The cache is invalidated whendata.reloadis called.Test plan
remove_unusedspecs pass — cache correctly invalidates ondata.reloadi18n-tasks missingproduces identical results (0 false positives/negatives)