nciric (Contributor) commented Apr 30, 2025

This implements dictionary lookup from the Serbian lexicon.

Some words fail to inflect properly even though they should match existing inflection rules, e.g.:

  1. уранак should inflect by group f (see the word пропланак)
  2. игроказ should inflect by group 29 (see the word путоказ)
  3. пашњак (possibly a special case, as it inserts an additional a into the suffix)

Those examples are currently commented out.

nciric marked this pull request as draft April 30, 2025 20:07
nciric requested a review from grhoten April 30, 2025 20:07
nciric marked this pull request as ready for review April 30, 2025 21:27
nciric (Contributor, Author) commented Apr 30, 2025

George replied offline with:

If the word is not in the dictionary, then you need to build a heuristic for handling such situations. See EnGrammarSynthesizer_EnDisplayFunction::guessPluralInflection for an example way to guess words.

Your current code only works for words in the lexical dictionary, and the code can take some ambiguity into account when doing the inflection table lookup.
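
One way to read the suggestion above is as a two-step lookup: try the dictionary first, then fall back to a guess based on the longest word ending shared with a known lemma. The following is only a rough sketch under stated assumptions, not the project's implementation and not a port of guessPluralInflection. The names `LEXICON`, `common_suffix_len`, `guess_inflection_group`, and `min_suffix` are hypothetical, and the toy lexicon holds only the two reference words and group labels mentioned in this PR.

```python
# Hypothetical toy lexicon: lemma -> inflection group label.
# The labels "f" and "29" come from the PR description; the data
# structure and everything else here are assumptions for illustration.
LEXICON = {
    "пропланак": "f",
    "путоказ": "29",
}


def common_suffix_len(a: str, b: str) -> int:
    """Return the number of trailing characters that a and b share."""
    n = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        n += 1
    return n


def guess_inflection_group(word, lexicon=LEXICON, min_suffix=3):
    """Return the group for a word, guessing by suffix if it is unknown.

    Known words use the dictionary entry directly. Unknown words borrow
    the group of the lemma sharing the longest ending, provided that
    ending is at least min_suffix characters long; otherwise None.
    """
    if word in lexicon:
        return lexicon[word]

    best_group, best_len = None, 0
    for lemma, group in lexicon.items():
        n = common_suffix_len(word, lemma)
        if n > best_len:
            best_group, best_len = group, n

    return best_group if best_len >= min_suffix else None


for w in ("уранак", "игроказ", "пашњак"):
    print(w, guess_inflection_group(w))
```

With this toy data, уранак shares the ending "анак" with пропланак and игроказ shares "оказ" with путоказ, so both borrow the reference word's group. пашњак shares only "ак" with any entry, which is below the assumed threshold, so the sketch returns None rather than guessing. A real heuristic would need a much larger lexicon and some way to handle ambiguity between groups.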

nciric merged commit 53b14ee into main May 1, 2025
3 checks passed
nciric deleted the cira-sr branch May 1, 2025 16:47
nciric mentioned this pull request May 1, 2025