fix: enforce backend consistency and expand tests #282
adrianardv wants to merge 10 commits into team-decent:main from
Conversation
Route FedAvg local updates through the preserved cost abstraction so empirical costs and empirical wrappers keep mini-batch behavior while generic costs, regularizers, and zero costs continue to use full gradients. Add FedAvg-specific batching tests into a dedicated module.
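A minimal sketch of the routing described above, using hypothetical toy cost classes (the real `Cost` API may differ): costs that expose a `batch_size` keep mini-batch behavior over one shuffled epoch, while everything else takes a single full-gradient step.

```python
import numpy as np

class ToyEmpiricalCost:
    """Toy empirical cost, mean of 0.5 * (x - a_i)^2, to illustrate batching."""
    def __init__(self, data, batch_size):
        self.data = np.asarray(data, dtype=float)
        self.batch_size = batch_size
        self.num_samples = len(self.data)
    def gradient(self, x, indices=None):
        a = self.data if indices is None else self.data[indices]
        return np.mean(x - a)

class ToyGenericCost:
    """Toy generic cost 0.5 * x^2 with no batching support."""
    def gradient(self, x):
        return x

def local_update(cost, x, step_size, rng):
    # Empirical costs (those exposing batch_size) keep mini-batch behavior;
    # generic costs, regularizers, and zero costs use the full gradient.
    if getattr(cost, "batch_size", None):
        indices = rng.permutation(cost.num_samples)  # one full pass, random order
        for start in range(0, cost.num_samples, cost.batch_size):
            batch = indices[start : start + cost.batch_size]
            x = x - step_size * cost.gradient(x, indices=batch)
    else:
        x = x - step_size * cost.gradient(x)
    return x

rng = np.random.default_rng(0)
x_emp = local_update(ToyEmpiricalCost([1.0, 2.0, 3.0, 4.0], batch_size=2), 0.0, 0.1, rng)
x_gen = local_update(ToyGenericCost(), 1.0, 0.1, rng)
```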
Document how to combine PyTorchCost with built-in regularizers using matching framework and device settings. Add tests that verify empirical behavior is preserved for compatible PyTorch objectives and that mismatched NumPy regularizers raise an error.
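A minimal sketch of the composition invariant, with hypothetical toy classes standing in for `PyTorchCost`, a built-in regularizer, and a `SumCost`-like wrapper: gradients add, and construction fails when framework or device settings do not match.

```python
class ToyQuadratic:
    framework, device = "numpy", "cpu"
    def gradient(self, x):
        return x  # gradient of 0.5 * x^2

class ToyL2Regularizer:
    framework, device = "numpy", "cpu"
    def __init__(self, weight):
        self.weight = weight
    def gradient(self, x):
        return self.weight * x  # gradient of 0.5 * weight * x^2

class ToyTorchCost:
    framework, device = "pytorch", "cuda"
    def gradient(self, x):
        return x

class ToySum:
    """Sum of costs; rejects mixed framework or device at construction."""
    def __init__(self, *costs):
        frameworks = {c.framework for c in costs}
        devices = {c.device for c in costs}
        if len(frameworks) > 1 or len(devices) > 1:
            raise ValueError(
                f"mixed backends in composition: {sorted(frameworks)}, {sorted(devices)}"
            )
        self.costs = costs
    def gradient(self, x):
        return sum(c.gradient(x) for c in self.costs)

composed = ToySum(ToyQuadratic(), ToyL2Regularizer(0.1))
```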
Move server-to-client synchronization into FedAlgorithm and rename FedAvg local update helpers to reflect their batching and full-gradient roles more clearly. Add focused federated routing and aggregation tests to cover the preserved FedAvg behavior with composed costs.
Add focused tests for concrete cost implementations and network communication behavior. Cover invalid graphs, inactive receivers, message buffer lifecycle, and scheme integration, and add direct checks for linear, logistic, quadratic, zero, and PyTorch costs.
…sitions Validate each agent's cost shape, framework, and device when constructing networks so mixed-backend configurations fail early. Validate cost composition on framework and device by default, and enforce the same invariant in SumCost construction so mixed-backend composite costs are rejected consistently. Add regression tests covering mixed framework/device rejection for networks and generic cost composition.
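The network-construction check above can be sketched as follows; `validate_network_costs` and `ToyCost` are hypothetical names, not the library's actual API.

```python
from dataclasses import dataclass

@dataclass
class ToyCost:
    """Stand-in for an agent's cost, exposing the attributes being validated."""
    shape: tuple
    framework: str
    device: str

def validate_network_costs(costs):
    # Fail early when any agent's cost disagrees with the first agent's
    # on shape, framework, or device (mixed-backend configuration).
    ref = costs[0]
    for i, c in enumerate(costs[1:], start=1):
        for attr in ("shape", "framework", "device"):
            if getattr(c, attr) != getattr(ref, attr):
                raise ValueError(
                    f"agent {i}: {attr} {getattr(c, attr)!r} != {getattr(ref, attr)!r}"
                )

ok = [ToyCost((3,), "numpy", "cpu"), ToyCost((3,), "numpy", "cpu")]
bad = [ToyCost((3,), "numpy", "cpu"), ToyCost((3,), "pytorch", "cuda")]
```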
looks good, thank you! I'll review the details soon. One idea I had: currently the checks are made using Cost's method. I would say you can set this PR to close #275, and we address the …
nicola-bastianello
left a comment
looks good, thanks! just one comment
```python
def _cleanup_agents(self, network: FedNetwork) -> Iterable["Agent"]:
    return [network.server(), *network.clients()]


def _sync_server_to_clients(self, network: FedNetwork, selected_clients: Sequence["Agent"]) -> None:
```
If we want to make this a method in FedAlgorithm, maybe it should be public, so that users implementing new algorithms can benefit from this util. If we want to keep it private, then it should probably be in the subclass.
If made public, it could be renamed to something like server_broadcast (for a shorter name).
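The suggested public helper might look roughly like this; `ToyAgent` and the `state` attribute are placeholders for illustration, not the library's actual Agent API.

```python
class ToyAgent:
    """Minimal stand-in for an Agent holding a model iterate."""
    def __init__(self, state=None):
        self.state = state

def server_broadcast(server, selected_clients):
    # Push the server's current iterate to each selected client
    # before their local updates begin.
    for client in selected_clients:
        client.state = server.state

server = ToyAgent(state=[1.0, 2.0])
clients = [ToyAgent(), ToyAgent()]
server_broadcast(server, clients)
```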
Simpag
left a comment
Two minor comments; you don't have to do anything if you don't want to / don't have the time, otherwise LGTM.
```python
batch_indices = indices[start : start + cost.batch_size]
grad = cost.gradient(local_x, indices=batch_indices)
local_x -= self.step_size * grad
return local_x
```
While this assures that a full epoch through the dataset is performed, I've updated empirical risk cost sampling so that it iterates through the entire dataset (in random order) before it reuses data points. Therefore you could simplify this a bit by removing the indices parameter, but it's not a big deal.
One side effect: if you're using PyTorchCost with a dataloader, the indices parameter bypasses the dataloader and gathers data manually. From my experience dataloaders are slower when running on the CPU, so it's not commonly used, but it might slow things down depending on the model and dataset size.
```diff
 def compress(self, msg: Array) -> Array:  # noqa: D102
-    res = np.vectorize(lambda x: float(f"%.{self.n_significant_digits - 1}e" % x))(iop.to_numpy(msg))  # noqa: RUF073
+    res = np.vectorize(lambda x: float(format(x, f".{self.n_significant_digits - 1}e")))(iop.to_numpy(msg))
     return iop.to_array_like(res, msg)
```
I feel like there has to be a more efficient way of performing quantization than doing to_numpy -> float -> string -> float -> back to framework. If you feel like you have the time, please check if there are any better ways of doing this; otherwise I'll just create an issue for this at some point, no problem.
Update: you don't have to worry about this. This is insanely inefficient; I have made an update to it and will include it in my bigger update within 1-2 weeks. Some simple math made this at least 10x more efficient.
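One vectorized alternative to the string round-trip (the "simple math" is not shown in this thread, so this is only a plausible sketch): round each entry to `n_digits` significant figures using a log10-based scale instead of formatting through strings.

```python
import numpy as np

def round_significant(a, n_digits):
    """Round each entry of a to n_digits significant figures, vectorized."""
    a = np.asarray(a, dtype=float)
    out = np.zeros_like(a)
    nonzero = a != 0  # log10 is undefined at 0; zeros stay zero
    exp = np.floor(np.log10(np.abs(a[nonzero])))  # decimal exponent per entry
    scale = 10.0 ** (n_digits - 1 - exp)
    out[nonzero] = np.round(a[nonzero] * scale) / scale
    return out

out = round_significant([123.456, 0.00123456, -0.5, 0.0], 3)
```

This avoids `np.vectorize` (which is essentially a Python-level loop) as well as the float-to-string-to-float conversion, at the cost of one `log10` pass over the array.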
Summary
This PR adds a set of small enhancements and test coverage improvements across costs, networks, and federated behavior.
The main functional change is to enforce backend consistency more strictly.
This addresses the main issue in #275. The only remaining point discussed there is globally setting framework/device through IOP.
What changed
- SumCost
- PyTorchCost + built-in regularizer

Closes #275