Add server component for batched alignment calls by robinp · Pull Request #25 · robertostling/eflomal

robinp · 2025-07-15T19:21:05Z

The commits below add an eflomal-server binary that can be started after activating the virtualenv (or after packaging and installing). By default it looks for a server_config.json file that describes a list of aligners, each determined by its prior file.

A JSON call can then be made to a specified aligner (see the shell script in the devscripts directory for an example), passing one or more sentence pairs. A sentence can be either a string or a list of tokens.

The endpoint also takes optional parameters: for example, scoring can be disabled, model updates can be disabled, and the number of iterations per model can be specified.
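As a rough illustration of the request shape described above, here is a minimal client-side sketch. The field names ("aligner", "pairs", "src", "trg") and the aligner name "en-sv" are hypothetical and are not taken from the server code; only trust_sents is named in this PR. Splitting strings on whitespace is also an assumption about how a string sentence might be tokenized. See the shell script in devscripts for the real call.

```python
import json


def normalize_sentence(sentence):
    """Return a token list for a sentence given as a string or a token list.

    Whitespace splitting is an assumption made for this sketch.
    """
    if isinstance(sentence, str):
        return sentence.split()
    return list(sentence)


def build_request(aligner, pairs, **options):
    """Build a JSON request body. All field names here are hypothetical."""
    body = {
        "aligner": aligner,
        "pairs": [
            {"src": normalize_sentence(src), "trg": normalize_sentence(trg)}
            for src, trg in pairs
        ],
    }
    # Optional parameters (e.g. trust_sents) are passed through unchanged.
    body.update(options)
    return json.dumps(body)


# One pair mixing both accepted sentence forms: a string and a token list.
payload = build_request(
    "en-sv",
    [("the house is red", ["huset", "är", "rött"])],
    trust_sents=False,
)
```

Batching several pairs into one request is what amortizes the per-call overhead, so a client would typically collect pairs before calling the server.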

The server mode preloads and preprocesses the priors, so subsequent API calls are reasonably fast. The calls still operate as eflomal binary executions, but the exec overhead is not significant compared to the alignment computation itself. NULL priors are not supported, but comments are left where they could be added.

Other notable changes:

In the eflomal C binary, allow treating zero (that is, no) sentences as clean for model updates. The previous default of zero, which meant all sentences are clean, is now -1, and zero really means zero.
The eflomal C binary debug printouts now also show which pass is running (forward/reverse).
eflomal-align can skip N input lines and limit processing to M input lines, which allows aligning windows from a larger input file.
Bump Python to >=3.12, probably for NamedTemporaryFile safety (a new parameter is available from that version).
Some clarifying comments.
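The skip/limit behaviour for aligning a window of a larger input can be sketched as follows. This is not the eflomal-align implementation; the function name window and its parameters are made up for illustration, and it only shows the intended semantics of skipping N lines and then taking at most M.

```python
from itertools import islice


def window(lines, skip=0, limit=None):
    """Yield at most `limit` items after skipping the first `skip` items.

    limit=None means no upper bound (read to the end of the input).
    """
    stop = None if limit is None else skip + limit
    return islice(lines, skip, stop)


lines = [f"sentence {i}" for i in range(10)]

# Skip 3 lines, then process the next 4.
selected = list(window(lines, skip=3, limit=4))
```

Because islice is lazy, the same approach works on an open file object without reading the whole file into memory.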

Bugfix (?): calculate_priors in reverse mode didn't reverse the pairs; this is now fixed.

Not sure, but sounds logical.

Passing trust_sents=False will set n_clean=0, which (after this change) means no sentences are trusted for statistics, so prior updates don't happen. Useful for batched sending of sentences of dubious quality.
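The n_clean semantics after this change can be summarized in a small sketch. The helper names are hypothetical. Two points are assumptions rather than statements from this PR: that a positive n_clean means the first n_clean sentences are trusted, and that trust_sents=True maps to the -1 default.

```python
def trusted_count(n_clean, n_sentences):
    """Number of sentences treated as clean for statistics.

    -1 (the new default) trusts all sentences; 0 trusts none.
    Assumption: a positive value trusts the first n_clean sentences.
    """
    if n_clean < 0:
        return n_sentences
    return min(n_clean, n_sentences)


def n_clean_from_trust(trust_sents):
    """Map the server's trust_sents option to n_clean.

    trust_sents=False -> 0 is from the PR description.
    trust_sents=True -> -1 is an assumption (the new "all clean" default).
    """
    return -1 if trust_sents else 0


# With trust_sents=False, no sentence in a batch of 5 feeds the statistics.
untrusted = trusted_count(n_clean_from_trust(False), 5)
```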

Robin Palotai added 8 commits May 12, 2025 11:05

Small comments and diagnosis.

12668a0

Add more config params to server call.

8974e0b

Distinguish forward/reverse pass in logs.

a46c127

Add options to skip/limit processed lines.

8dc4c5c

Fix reverse prior LEX direction?

1c3ddae

Not sure, but sounds logical.

Server option trust_sents to disable n_clean.

ed23970

Passing trust_sents=False will set n_clean=0, which (after this change) means no sentences are trusted for statistics, so prior updates don't happen. Useful for batched sending of sentences of dubious quality.

Support returning scores through server.

ea15c41

Add small test server data example.

a77b95e

robinp wants to merge 8 commits into robertostling:master from robinp:server
