This repository was archived by the owner on Aug 4, 2023. It is now read-only.

Conversation

@mfeurer mfeurer commented May 16, 2022

Hi everybody,

Here's a draft PR on how to pass custom ensemble classes to Auto-sklearn and how to use multi-objective ensembles. It is currently in the private fork because we don't want to make it public yet. As before, I'll add unit tests and docstrings as soon as folks are happy with the API.

Concretely, this PR adds:

  1. A new interface to pass in custom ensemble strategies (see the sketch after this list).
  2. Full support for multi-objective ensemble building.
  3. New functionality to retrieve the identifiers and weights of an ensemble.
  4. Because of this, improved type definitions for the ensemble building submodule. More concretely, no ensemble files are exempt from type checking anymore.
  5. Three new ensemble classes:
    1. A naive wrapper class that repeatedly calls a single-objective ensemble with a weighted metric function.
    2. An efficient implementation of Agnostic Bayesian Learning of Ensembles that can be used instead of the traditional ensemble selection.
    3. An efficient multi-objective implementation of ABLE (which would be a novel methodology).
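
For illustration, here is a minimal sketch of how the new interface might be used. The ensemble_class/ensemble_kwargs arguments are taken from the signature shown in the review below; the import paths and concrete values are assumptions, not code from this PR:

from autosklearn.classification import AutoSklearnClassifier
from autosklearn.ensembles.ensemble_selection import EnsembleSelection

automl = AutoSklearnClassifier(
    time_left_for_this_task=120,
    # Any AbstractEnsemble subclass can be plugged in here; EnsembleSelection
    # remains the default strategy.
    ensemble_class=EnsembleSelection,
    # Constructor arguments for the chosen ensemble class, e.g. the former
    # ensemble_size parameter.
    ensemble_kwargs={"ensemble_size": 50},
)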

Open TODOs:

  • Weighting the metrics in the case of a generic ensemble
  • Weighting the metrics in the case of multi-objective ABLE
  • Docstrings
  • Unit tests

@mfeurer mfeurer requested a review from eddiebergman May 16, 2022 18:43

mfeurer commented May 18, 2022

This PR is ready from my side except for the missing documentation, unit tests and ... weighting. While I know how to do the first two, I'm not 100% sure how to weight the metrics properly.

The ParEGO paper states:

ParEGO begins by normalizing the k cost functions with respect to the known (or estimated) limits of the cost space, so that each cost function lies in the range [0,1].

but gives no directions on how to estimate such limits. For bounded metrics (accuracy, AUC) we could use the known bounds. However, these might be loose (the ensemble will never observe anything worse than 0.5 AUC because the ensemble builder drops such models beforehand) or unknown (what's the upper bound of RMSE?).

Here's a potential remedy:

  1. We pass the dummy score to the MO wrapper
  2. We compute the score for each model prior to running the ensemble (or pass it from the ensemble builder)
  3. We use these scores as the minimum and maximum to scale each metric between zero and one

What do you all think? After writing this, I am considering passing the scores computed in the ensemble builder to the ensemble method as an additional argument.
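
For concreteness, here is a small sketch (illustration only, not code from this PR) of min-max normalizing each metric's losses across the candidate models and then combining them with a ParEGO-style augmented Tchebycheff weighting:

import numpy as np

def parego_scalarize(losses: np.ndarray, weights: np.ndarray, rho: float = 0.05) -> np.ndarray:
    """losses: (n_models, n_metrics) raw losses; weights: (n_metrics,), summing to one."""
    # Use the observed per-metric minimum/maximum as the "estimated limits of
    # the cost space" (steps 2 and 3 of the remedy above).
    lo = losses.min(axis=0)
    hi = losses.max(axis=0)
    normalized = (losses - lo) / np.where(hi > lo, hi - lo, 1.0)  # each column now in [0, 1]
    weighted = normalized * weights
    # Augmented Tchebycheff scalarization as in the ParEGO paper.
    return weighted.max(axis=1) + rho * weighted.sum(axis=1)

The dummy score from step 1 could simply be appended as an extra row of losses before computing the per-metric minimum and maximum.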

* Load single best model as fallback

* Update estimators.py

* Improve comment in code.

* Fix meta-data generation test, potentially improve model loading
@eddiebergman eddiebergman left a comment

Some minor changes but mostly it looks all good. As you mentioned, some doc and test needed.

The only major discussion point is what to do with the fact that the API for specifying ensembling has now changed. ensemble_size has been removed, and I feel like this is a parameter quite a few people will have specified. If we can keep it so that the default behaviour is maintained when ensemble_size is specified, I would be happy with that.

Comment on lines +806 to 811
if self._ensemble_class is not None:
    raise ValueError(
        "Not starting ensemble builder because there "
        "is no time left. Try increasing the value "
        "of time_left_for_this_task."
    )


I didn't really notice this before, but does this mean we can end up in a situation where a user calls fit and, because of bad time management on auto-sklearn's part, gets a ValueError? It would make more sense to make this a RuntimeError, but maybe it's not worth changing at this point.

mfeurer (Author)

No, they should get an ensemble with a DummyEstimator.

Comment on lines +47 to +48
ensemble_class: Type[AbstractEnsemble] | None = EnsembleSelection,
ensemble_kwargs: Dict[str, Any] | None = None,


This is the main API-breaking change. I imagine most power users will have ensemble_size specified, so maybe we should deprecate it, maintaining the old default behaviour by inserting it into ensemble_kwargs? We can issue a warning:

# In the function definition
    ensemble_size: int | None = None


# User specified `ensemble_size` explicitly, warn them about the deprecation
if ensemble_size is not None:
    warnings.warn(
        "`ensemble_size` has been deprecated, please use "
        f"`ensemble_kwargs={{'ensemble_size': {ensemble_size}}}` instead. "
        "Inserting `ensemble_size` into `ensemble_kwargs` for now.",
        DeprecationWarning,
    )
    # Keep consistent behaviour
    if ensemble_kwargs is None:
        ensemble_kwargs = {"ensemble_size": ensemble_size}
    else:
        ensemble_kwargs["ensemble_size"] = ensemble_size
else:
    # Old default behaviour; no need to warn here as they haven't set `ensemble_size`
    if ensemble_kwargs is None:
        ensemble_kwargs = {"ensemble_size": 50}

Probably not the cleanest solution, but hopefully it gets the point across.

mfeurer (Author)

As mentioned above, I'll deprecate this here so that automl.py will only accept ensemble_kwargs.

mfeurer (Author)

Updated this.

Comment on lines -1 to -35
import numpy as np

from autosklearn.automl import AutoML

import pytest
from pytest_cases import filters as ft
from pytest_cases import parametrize, parametrize_with_cases

import test.test_automl.cases as cases


@parametrize("ensemble_size", [-10, -1, 0])
@parametrize_with_cases("automl", cases=cases, filter=~ft.has_tag("fitted"))
def test_non_positive_ensemble_size_raises(
    tmp_dir: str,
    automl: AutoML,
    ensemble_size: int,
) -> None:
    """
    Parameters
    ----------
    automl: AutoML
        The AutoML object to test

    ensemble_size : int
        The ensemble size to use

    Expects
    -------
    * Can't fit ensemble with non-positive ensemble size
    """
    dummy_data = np.array([1, 1, 1])

    with pytest.raises(ValueError):
        automl.fit_ensemble(dummy_data, ensemble_size=ensemble_size)


This should probably be replaced by its new counterpart behaviour, i.e. that passing no ensemble_class will raise.
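
A rough sketch of what that counterpart could look like (hypothetical; the exact signature and raising behaviour are assumptions, not code from this PR):

import numpy as np
import pytest

from autosklearn.automl import AutoML


def test_no_ensemble_class_raises(automl: AutoML) -> None:
    """Counterpart sketch: fitting an ensemble without an ensemble class raises."""
    dummy_data = np.array([1, 1, 1])

    # Assumes an AutoML instance that was constructed with ensemble_class=None.
    with pytest.raises(ValueError):
        automl.fit_ensemble(dummy_data, ensemble_class=None)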

mfeurer (Author)

I can do so, but I'm not 100% sure it makes a lot of sense. The user would have to set ensemble_class=None in both the constructor and fit_ensemble.

@eddiebergman

I think SMAC does the min-max scaling and, having thought about it, I don't see another good alternative.

@eddiebergman

I've disabled docs and, while I was at it, dist-check and pre-commit, and kept only one set of unit tests running (Ubuntu, source, 3.7). This is because we could accidentally burn through all our free minutes with private repos and then have no automated testing available. I figure these changes make sense, but I'm willing to change them if needed.

We can always re-enable them at the end

by=lambda r: has_metrics(r) and tangible_losses(r),
)
all_discarded.update(discarded)
print("all_discarded", all_discarded)
mfeurer (Author)

This print should not be here, I guess.


Nope, twas some debugging, good spot :)

@mfeurer mfeurer requested a review from eddiebergman May 23, 2022 16:38