numbers in spanish and catalan over 100 + small fixes #31

c-armentano · 2025-08-21T07:39:37Z

Summary by CodeRabbit

New Features
- Catalan: Support for both long/short large-number scales, optional scientific-style wording, infinity phrases, and expanded readable large-number and decimal pronunciations.
- Spanish: Option to choose long or short large-number scales, expanded magnitude names, improved large-number rendering and consistent decimal reading.
Bug Fixes
- Catalan: Improved pluralization and cleaned inconsistent mixed-fraction phrasing.
- Spanish: Fixed spelling for seventeen and standardized plural forms for large-number terms.

coderabbitai · 2025-08-21T07:39:45Z

Caution

Review failed

The pull request is closed.

Walkthrough

Adds long- and short-scale dictionaries and rewrites Catalan and Spanish pronunciation logic to support both scales, updated pluralization/morphology, decimal and infinity handling, and changes function signatures to accept scale flags (and scientific for Catalan).

Changes

Cohort / File(s)	Summary
Catalan scale & pronunciation `ovos_number_parser/numbers_ca.py`	Added `_LONG_SCALE_CA` and `_SHORT_SCALE_CA` OrderedDicts; updated `pronounce_number_ca` signature to `(number, places=2, short_scale=False, scientific=False)`; implemented infinity handling, helpers (`_sub_thousand`, `_split_by`, `_short_scale`, `_long_scale`), pluralization/ending normalization, decimal ("coma") digit pronunciation, and scale-based composition.
Spanish scale & pronunciation `ovos_number_parser/numbers_es.py`	Added/rewrote `_LONG_SCALE_ES` and `_SHORT_SCALE_ES`; updated `pronounce_number_es` signature to `(number, places=2, short_scale=False)`; implemented decomposition-based pronunciation with `_sub_thousand`, `_short_scale`, `_long_scale`, morphological adjustments, corrected lexical items, and preserved decimal handling with scale-aware formatting.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Caller
  participant Pronounce as pronounce_number_ca/es
  participant ScaleBuilder as scale selection
  participant SubThousand as _sub_thousand
  participant ShortScale as _short_scale
  participant LongScale as _long_scale

  Caller->>Pronounce: number, places[, short_scale, scientific]
  Pronounce->>ScaleBuilder: build number_names using chosen scale
  alt exact match found
    ScaleBuilder-->>Pronounce: direct word
    Pronounce-->>Caller: word
  else compose needed
    alt short_scale == true
      Pronounce->>ShortScale: split by 1000s
      ShortScale->>SubThousand: render chunk(s)
      ShortScale-->>Pronounce: joined phrase
    else
      Pronounce->>LongScale: split and recurse by large magnitudes
      LongScale->>Pronounce: recursive chunk renders
      LongScale-->>Pronounce: joined phrase
    end
    opt decimal or infinity
      Pronounce->>Pronounce: append "coma" digits or "infinit"/"menys infinit"
    end
    Pronounce-->>Caller: final phrase
  end
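
The two branches of the diagram differ mainly in how the number is chunked before each chunk is rendered. The following is a minimal sketch of that chunking step only, not the PR's code: `split_by` is a hypothetical stand-in for the `_split_by` helper named above, and the chunk sizes (1000 for the short-scale path, 10**6 for the long-scale path) are assumptions taken from the diagram and the review notes.

```python
from typing import List


def split_by(n: int, base: int = 1000) -> List[int]:
    """Split a non-negative integer into chunks of size `base`.

    Hypothetical helper: chunks are returned least significant first.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    chunks = []
    while True:
        n, remainder = divmod(n, base)
        chunks.append(remainder)
        if n == 0:
            return chunks


# Short-scale path: every chunk is below 1000.
short_chunks = split_by(1_234_567, 1000)    # [567, 234, 1]

# Long-scale path (assumed): every chunk is below one million,
# and each chunk is then rendered on its own.
long_chunks = split_by(1_234_567, 10**6)    # [234567, 1]
```

Each chunk is then paired with the magnitude word for its position, which is where the long- and short-scale dictionaries come in.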

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

I hop through scales with a twitch and a cheer,
From mil to trillón, I count far and near.
Coma and decimals, I nibble with grace,
Short or long ladders—I bound every place.
Infinity winks; more clover to chase. 🐇✨

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 382d155 and 05f14ba.

📒 Files selected for processing (2)

ovos_number_parser/numbers_ca.py (3 hunks)
ovos_number_parser/numbers_es.py (4 hunks)


coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (6)

ovos_number_parser/numbers_ca.py (3)
392-395: Simplify boolean comparison and dictionary key checks

The code can be made more Pythonic and cleaner.

Apply this diff to improve code style:
-    if short_scale==True:
-        hundreds = [_SHORT_SCALE_CA[n] for n in _SHORT_SCALE_CA.keys()]
+    if short_scale:
+        hundreds = [_SHORT_SCALE_CA[n] for n in _SHORT_SCALE_CA]
     else:
-        hundreds = [_LONG_SCALE_CA[n] for n in _LONG_SCALE_CA.keys()]
+        hundreds = [_LONG_SCALE_CA[n] for n in _LONG_SCALE_CA]
419-422: Consider using ternary operator for better readability

The if-else block for setting _partial can be simplified.

Apply this diff to use a ternary operator:
-                if q == 1:
-                    _partial = "cent"
-                else:
-                    _partial = digits[q] + "-cents"
+                _partial = "cent" if q == 1 else digits[q] + "-cents"
478-493: Complex morphological adjustments for large numbers

The code handles singular/plural forms and morphological changes for large Catalan numbers. The logic appears correct but could benefit from documentation explaining these linguistic rules.

Consider adding a comment block explaining the morphological rules being applied here, as they are language-specific and might not be obvious to future maintainers:
+    # Catalan morphological adjustments for large numbers:
+    # - Numbers ending in "rds" get "un" prefix and drop final "s" for singular
+    # - Numbers ending in "ons" get "un" prefix and change to "ó" ending for singular
     big_nums = [_LONG_SCALE_CA[a] for a in _LONG_SCALE_CA]
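
The two rules in the proposed comment can be illustrated in isolation. This is a sketch under assumptions, not the PR's implementation: `singular_scale_word_ca` is a hypothetical name, and the sample words are ordinary Catalan plurals chosen for illustration rather than entries copied from `_LONG_SCALE_CA`.

```python
def singular_scale_word_ca(word: str) -> str:
    """Turn a plural Catalan scale word into its "un ..." singular form.

    Hypothetical helper following the two rules described in the comment:
    - "...rds" -> prefix "un" and drop the final "s"
    - "...ons" -> prefix "un" and replace "ons" with "ó"
    Any other word is returned unchanged.
    """
    if word.endswith("rds"):
        return "un " + word[:-1]
    if word.endswith("ons"):
        return "un " + word[:-3] + "ó"
    return word


print(singular_scale_word_ca("milions"))   # un milió
print(singular_scale_word_ca("miliards"))  # un miliard
```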
ovos_number_parser/numbers_es.py (3)
11-11: Trailing whitespace after 'uno'

Minor formatting issue with trailing space.

Apply this diff to remove the trailing whitespace:
-    1: 'uno', 
+    1: 'uno',
640-643: Simplify boolean comparison and dictionary iteration

Similar style improvements as suggested for the Catalan file.

Apply this diff to improve code style:
-    if short_scale==True:
-        hundreds = [_SHORT_SCALE_ES[n] for n in _SHORT_SCALE_ES.keys()]
+    if short_scale:
+        hundreds = [_SHORT_SCALE_ES[n] for n in _SHORT_SCALE_ES]
     else:
-        hundreds = [_LONG_SCALE_ES[n] for n in _LONG_SCALE_ES.keys()]
+        hundreds = [_LONG_SCALE_ES[n] for n in _LONG_SCALE_ES]
753-756: Consider combining nested if statements

The nested if statements can be combined for better readability.

Apply this diff to combine the conditions:
     big_nums = [_LONG_SCALE_ES[a] for a in _LONG_SCALE_ES]
-    if result in big_nums:
-        
-        if result[-4:] == "rdos" or result[-4:] == "ones":
-            result = "un " + result[:-1]
+    if result in big_nums and (result[-4:] == "rdos" or result[-4:] == "ones"):
+        result = "un " + result[:-1]

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro


📥 Commits

Reviewing files that changed from the base of the PR and between 0fe22f7 and 382d155.

📒 Files selected for processing (2)

ovos_number_parser/numbers_ca.py (3 hunks)
ovos_number_parser/numbers_es.py (4 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

ovos_number_parser/numbers_es.py (2)
- ovos_number_parser/numbers_ca.py (4): _sub_thousand (402-425), _short_scale (427-444), _split_by (446-452), _long_scale (454-471)
- ovos_number_parser/numbers_it.py (4): _sub_thousand (819-841), _short_scale (843-858), _split_by (860-866), _long_scale (868-885)

ovos_number_parser/numbers_ca.py (1)
- ovos_number_parser/numbers_es.py (4): _sub_thousand (650-688), _short_scale (697-715), _split_by (717-723), _long_scale (725-745)

🪛 Ruff (0.12.2)

ovos_number_parser/numbers_es.py

- 640-640: Avoid equality comparisons to True; use short_scale: for truth checks. Replace with short_scale. (E712)
- 641-641: Use key in dict instead of key in dict.keys(). Remove .keys(). (SIM118)
- 643-643: Use key in dict instead of key in dict.keys(). Remove .keys(). (SIM118)
- 701-701: Yoda condition detected. Rewrite as n >= 0. (SIM300)
- 718-718: Yoda condition detected. Rewrite as n >= 0. (SIM300)
- 729-729: Yoda condition detected. Rewrite as n >= 0. (SIM300)
- 753-755: Use a single if statement instead of nested if statements. Combine if statements using and. (SIM102)

ovos_number_parser/numbers_ca.py

- 392-392: Avoid equality comparisons to True; use short_scale: for truth checks. Replace with short_scale. (E712)
- 393-393: Use key in dict instead of key in dict.keys(). Remove .keys(). (SIM118)
- 395-395: Use key in dict instead of key in dict.keys(). Remove .keys(). (SIM118)
- 419-422: Use ternary operator _partial = "cent" if q == 1 else digits[q] + "-cents" instead of if-else-block. (SIM108)
- 431-431: Yoda condition detected. Rewrite as n >= 0. (SIM300)
- 447-447: Yoda condition detected. Rewrite as n >= 0. (SIM300)
- 458-458: Yoda condition detected. Rewrite as n >= 0. (SIM300)

🔇 Additional comments (9)

ovos_number_parser/numbers_ca.py (5)

345-356: Clean implementation of pluralization logic

The revised pluralization rule correctly handles Catalan morphology for words ending with "è", converting them to "ens" for plural forms, and applying standard "s" suffix otherwise.
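
The rule as described here is small enough to sketch on its own. The function name `pluralize_ca` is hypothetical and the code is only an illustration of the stated rule (final "è" becomes "ens", otherwise append "s"); the PR's own function may handle further cases.

```python
def pluralize_ca(word: str) -> str:
    """Pluralize a Catalan word using the two-case rule described above.

    Hypothetical helper: words ending in "è" swap that ending for "ens";
    every other word simply takes an "s".
    """
    if word.endswith("è"):
        return word[:-1] + "ens"
    return word + "s"


print(pluralize_ca("cinquè"))  # cinquens
print(pluralize_ca("quart"))   # quarts
```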

360-360: Function signature updated to support short and long scale pronunciations

The addition of short_scale and scientific parameters enables proper handling of different number scaling conventions. Good API design choice.

376-379: Proper handling of infinity values

Good addition of support for positive and negative infinity values with appropriate Catalan translations.
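
As a rough illustration of this check, assuming the phrases "infinit" and "menys infinit" shown in the sequence diagram, the infinity case can be isolated as below. `infinity_phrase_ca` is a hypothetical name, not a function from the PR.

```python
import math
from typing import Optional


def infinity_phrase_ca(number: float) -> Optional[str]:
    """Return the Catalan phrase for an infinite value, or None otherwise.

    Hypothetical helper: a caller would fall through to the normal
    pronunciation logic when None is returned.
    """
    if math.isinf(number):
        return "infinit" if number > 0 else "menys infinit"
    return None


print(infinity_phrase_ca(float("inf")))   # infinit
print(infinity_phrase_ca(float("-inf")))  # menys infinit
```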

402-425: Well-structured helper function for numbers 0-999

The _sub_thousand function correctly handles Catalan number formation rules, including the special case for "vint-i-" pattern and proper hundred forms.

463-463: No action required: _long_scale recursive calls intentionally use short_scale=True

I’ve verified the pattern across all language modules (numbers_ca.py, numbers_az.py, numbers_cs.py, numbers_en.py, numbers_it.py, numbers_ru.py, etc.): each _long_scale implementation calls its pronounce_number_* function with the short_scale argument hard-coded to True, so that the segments produced by the long-scale split are rendered via the short-scale logic within a long-scale context. This is consistent and appears to be the intended design, so you can resolve this comment.

ovos_number_parser/numbers_es.py (4)

611-611: Well-designed API change for scale support

The addition of the short_scale parameter enables proper support for both numbering conventions used in Spanish-speaking regions.

650-688: Excellent implementation of Spanish number formation rules

The _sub_thousand function correctly handles all the special cases in Spanish number pronunciation, including:

Special forms for 22, 23, 26 in the twenties

Irregular hundreds (100, 500, 700, 900)

Proper concatenation with "y" for compound numbers

690-695: Proper handling of "uno/un" agreement

The _un_uno helper function correctly handles the masculine singular form adjustments required in Spanish when numbers precede scale words.

734-734: Verification complete: recursive calls consistently use short_scale=True across implementations

I’ve confirmed that in the long-scale routines for Spanish, Catalan, and Italian, the recursive calls deliberately pass True for the short_scale flag to format sub-million segments using short-scale naming:

• In numbers_es.py (line 734):
number = pronounce_number_es(z, places, True)

• In numbers_ca.py (line 463):
number = pronounce_number_ca(z, places, True, scientific)

• In numbers_it.py (line 877):
number = pronounce_number_it(z, places, True, scientific)

This matches the pattern in the Catalan and Italian implementations, confirming the design choice is intentional and consistent. No changes needed.


coderabbitai · 2025-08-22T19:01:40Z

Note

Unit test generation is an Early Access feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Generating unit tests... This may take up to 20 minutes.

coderabbitai · 2025-08-22T19:18:40Z

✅ UTG Post-Process Complete

No new issues were detected in the generated code and all check runs have completed. The unit test generation process has completed successfully.

coderabbitai · 2025-08-22T19:18:43Z

Creating a PR to put the unit tests in...

The changes have been created in this pull request: View PR

JarbasAl · 2025-10-04T14:41:25Z

@c-armentano can you check the 2 comments from the review bot and also add a couple unittests?

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

numbers in spanish and catalan over 100 + small fixes

382d155

coderabbitai bot reviewed Aug 21, 2025


coderabbitai bot mentioned this pull request Aug 22, 2025

CodeRabbit Generated Unit Tests: Add tests for Catalan and Spanish number parsers #32

Open

JarbasAl and others added 3 commits October 12, 2025 17:56

Update ovos_number_parser/numbers_ca.py

4e5bece

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

Update ovos_number_parser/numbers_es.py

62014fc

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

Update ovos_number_parser/numbers_ca.py

05f14ba

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

JarbasAl merged commit ca4dfb7 into OpenVoiceOS:dev Oct 12, 2025
3 of 4 checks passed
