This repository was archived by the owner on Aug 4, 2023. It is now read-only.

Conversation

@mfeurer mfeurer commented May 16, 2022

Hi everybody,

Here's a draft PR on how to pass custom ensemble classes to Auto-sklearn and how to use multi-objective ensembles. It is currently in the private fork because we don't want to make it public yet. As before, I'll add unit tests and docstrings as soon as folks are happy with the API.

Concretely, this PR adds:

  1. A new interface to pass in custom ensemble strategies (see the sketch after this list).
  2. Full support for multi-objective ensemble building.
  3. New functionality to retrieve the identifiers and weights of an ensemble.
  4. Because of this, improved type definitions for the ensemble building submodule. More concretely, no ensemble files are exempt from type checking anymore.
  5. Three new ensemble classes:
    1. A naive wrapper class that repeatedly calls a single-objective ensemble with a weighted metric function.
    2. An efficient implementation of Agnostic Bayesian Learning of Ensembles that can be used instead of the traditional ensemble selection.
    3. An efficient multi-objective implementation of ABLE (which would be a novel methodology).
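
For illustration, here is a minimal sketch of how the new interface might be used. The ensemble_class/ensemble_kwargs arguments are taken from the signature shown in the review below; the import paths and concrete values are assumptions, not code from this PR:

from autosklearn.classification import AutoSklearnClassifier
from autosklearn.ensembles.ensemble_selection import EnsembleSelection

automl = AutoSklearnClassifier(
    time_left_for_this_task=120,
    # Any AbstractEnsemble subclass can be plugged in here; EnsembleSelection
    # remains the default strategy.
    ensemble_class=EnsembleSelection,
    # Constructor arguments for the chosen ensemble class, e.g. the former
    # ensemble_size parameter.
    ensemble_kwargs={"ensemble_size": 50},
)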

Open TODOs:

  • Weighting the metrics in the case of a generic ensemble
  • Weighting the metrics in the case of multi-objective ABLE
  • Docstrings
  • Unit tests

@mfeurer mfeurer requested a review from eddiebergman May 16, 2022 18:43

mfeurer commented May 18, 2022

This PR is ready from my side except for the missing documentation, unit tests and ... weighting. While I know how to do the first two, I'm not 100% sure how to weight the metrics properly.

The ParEGO paper states:

ParEGO begins by normalizing the k cost functions with respect to the known (or estimated) limits of the cost space, so that each cost function lies in the range [0,1].

but gives no directions on how to estimate such limits. For bounded metrics (accuracy, AUC) we could use the known bounds. However, these might be loose (the ensemble will never observe anything worse than 0.5 AUC because the ensemble builder drops such models beforehand) or unknown (what's the upper bound of RMSE?).

Here's a potential remedy:

  1. We pass the dummy score to the MO wrapper
  2. We compute the score for each model prior to running the ensemble (or pass it from the ensemble builder)
  3. We use these scores as the minimum and maximum to scale each metric between zero and one

What do you all think? After writing this, I am considering passing the scores computed in the ensemble builder to the ensemble method as an additional argument.
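
For concreteness, here is a small sketch (illustration only, not code from this PR) of min-max normalizing each metric's losses across the candidate models and then combining them with a ParEGO-style augmented Tchebycheff weighting:

import numpy as np

def parego_scalarize(losses: np.ndarray, weights: np.ndarray, rho: float = 0.05) -> np.ndarray:
    """losses: (n_models, n_metrics) raw losses; weights: (n_metrics,), summing to one."""
    # Use the observed per-metric minimum/maximum as the "estimated limits of
    # the cost space" (steps 2 and 3 of the remedy above).
    lo = losses.min(axis=0)
    hi = losses.max(axis=0)
    normalized = (losses - lo) / np.where(hi > lo, hi - lo, 1.0)  # each column now in [0, 1]
    weighted = normalized * weights
    # Augmented Tchebycheff scalarization as in the ParEGO paper.
    return weighted.max(axis=1) + rho * weighted.sum(axis=1)

The dummy score from step 1 could simply be appended as an extra row of losses before computing the per-metric minimum and maximum.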

* Load single best model as fallback

* Update estimators.py

* Improve comment in code.

* Fix meta-data generation test, potentially improve model loading
@eddiebergman eddiebergman left a comment

Some minor changes but mostly it looks all good. As you mentioned, some doc and test needed.

The only major discussion point is what to do with the fact that the API for specifying ensembling has now changed. ensemble_size has been removed, and I feel like this is a parameter quite a few people will have specified. If we can keep it so that the default behaviour is maintained when ensemble_size is specified, I would be happy with that.

Comment on lines +806 to 811
if self._ensemble_class is not None:
    raise ValueError(
        "Not starting ensemble builder because there "
        "is no time left. Try increasing the value "
        "of time_left_for_this_task."
    )


I didn't really notice this before, but does this mean we can end up in a situation where a user calls fit and, because of bad time management on auto-sklearn's part, gets a ValueError? It would make more sense to make this a RuntimeError, but maybe it's not worth changing at this point.

mfeurer (Author)

No, they should get an ensemble with a DummyEstimator.

Comment on lines +47 to +48
ensemble_class: Type[AbstractEnsemble] | None = EnsembleSelection,
ensemble_kwargs: Dict[str, Any] | None = None,


This is the main API-breaking change. I imagine most power users will have ensemble_size specified, so maybe we should deprecate it, maintaining the old default behaviour by inserting it into ensemble_kwargs? We can issue a warning:

# In the function definition
    ensemble_size: int | None = None


# User specified `ensemble_size` explicitly, warn them about the deprecation
if ensemble_size is not None:
    warnings.warn(
        "`ensemble_size` has been deprecated, please use "
        f"`ensemble_kwargs={{'ensemble_size': {ensemble_size}}}` instead. "
        "Inserting `ensemble_size` into `ensemble_kwargs` for now.",
        DeprecationWarning,
    )
    # Keep consistent behaviour
    if ensemble_kwargs is None:
        ensemble_kwargs = {"ensemble_size": ensemble_size}
    else:
        ensemble_kwargs["ensemble_size"] = ensemble_size
else:
    # Old default behaviour; no need to warn here as they haven't set `ensemble_size`
    if ensemble_kwargs is None:
        ensemble_kwargs = {"ensemble_size": 50}

Probably not the cleanest solution, but hopefully it gets the point across.

mfeurer (Author)

As mentioned above, I'll deprecate this here so that automl.py will only accept ensemble_kwargs.

mfeurer (Author)

Updated this.

Comment on lines -1 to -35
import numpy as np

from autosklearn.automl import AutoML

import pytest
from pytest_cases import filters as ft
from pytest_cases import parametrize, parametrize_with_cases

import test.test_automl.cases as cases


@parametrize("ensemble_size", [-10, -1, 0])
@parametrize_with_cases("automl", cases=cases, filter=~ft.has_tag("fitted"))
def test_non_positive_ensemble_size_raises(
    tmp_dir: str,
    automl: AutoML,
    ensemble_size: int,
) -> None:
    """
    Parameters
    ----------
    automl: AutoML
        The AutoML object to test

    ensemble_size : int
        The ensemble size to use

    Expects
    -------
    * Can't fit ensemble with non-positive ensemble size
    """
    dummy_data = np.array([1, 1, 1])

    with pytest.raises(ValueError):
        automl.fit_ensemble(dummy_data, ensemble_size=ensemble_size)


This should probably be replaced by its new counterpart behaviour, i.e. that passing no ensemble_class will raise.
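
A rough sketch of what that counterpart could look like (hypothetical; the exact signature and raising behaviour are assumptions, not code from this PR):

import numpy as np
import pytest

from autosklearn.automl import AutoML


def test_no_ensemble_class_raises(automl: AutoML) -> None:
    """Counterpart sketch: fitting an ensemble without an ensemble class raises."""
    dummy_data = np.array([1, 1, 1])

    # Assumes an AutoML instance that was constructed with ensemble_class=None.
    with pytest.raises(ValueError):
        automl.fit_ensemble(dummy_data, ensemble_class=None)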

mfeurer (Author)

I can do so, but I'm not 100% sure it makes a lot of sense. The user would have to set ensemble_class=None in both the constructor and fit_ensemble.

@eddiebergman

I think SMAC does the min-max scaling and, having thought about it, I don't see another good alternative.

@eddiebergman

I've disabled docs and, while I was at it, dist-check and pre-commit, and kept only one set of unit tests running (Ubuntu, source, 3.7). This is because we could accidentally burn through all our free minutes with private repos and then have no automated testing available. I figure these changes make sense, but I'm willing to change them if needed.

We can always re-enable them at the end

by=lambda r: has_metrics(r) and tangible_losses(r),
)
all_discarded.update(discarded)
print("all_discarded", all_discarded)
mfeurer (Author)

This print should not be here, I guess.


Nope, twas some debugging, good spot :)

@mfeurer mfeurer requested a review from eddiebergman May 23, 2022 16:38