First draft of multi-objective optimization #1455
Conversation
Co-authored-by: Katharina Eggensperger <[email protected]>
eddiebergman left a comment:
No need to change this, just if you see it and get a chance :) I think it looks a little cleaner with as few square brackets [] from Optional[] and Union[] as possible, but I'd merge without it. It also translates better to the types in the Sphinx docs if you use the same notation in the docstrings.
autosklearn/automl.py
Outdated
    smac_scenario_args: Optional[Mapping] = None,
    logging_config: Optional[Mapping] = None,
-   metric: Optional[Scorer] = None,
+   metric: Optional[Scorer | Sequence[Scorer]] = None,
Not necessary, just something to know: Optional[X] == Union[X, None] == X | None,
i.e. you could write Scorer | Sequence[Scorer] | None = None
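For reference, a minimal illustration of that equivalence (not from the PR; the Scorer class here is just a stand-in for autosklearn.metrics.Scorer, and the bare X | Y syntax needs Python 3.10+ unless the annotation is quoted):

```python
from typing import Optional, Sequence, Union


class Scorer:  # stand-in for autosklearn.metrics.Scorer, for illustration only
    ...


# All three spellings mean "a Scorer or None":
def f(metric: Optional[Scorer] = None) -> None: ...
def g(metric: Union[Scorer, None] = None) -> None: ...
def h(metric: "Scorer | None" = None) -> None: ...  # quoted, so it also works before 3.10


# The suggested spelling for the new multi-objective parameter:
def fit(metric: "Scorer | Sequence[Scorer] | None" = None) -> None: ...
```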
That looks nice, will do.
autosklearn/evaluation/__init__.py
Outdated
    def get_cost_of_crash(
-       metric: Union[Scorer, List[Scorer], Tuple[Scorer]]
+       metric: Union[Scorer | Sequence[Scorer]],
Likewise here, Union[X | Y] == X | Y; the | is essentially just the infix operator for Union, in the same way you have x + y instead of add(x, y).
i.e. metric: Scorer | Sequence[Scorer]
Thanks for catching.
eddiebergman left a comment:
Sorry, more files I didn't see. I'm starting to wonder if it makes sense to have something like a MetricGroup class? A lot of the code changes seem to just be handling the case of one, many, or no metrics (a rough sketch of the idea follows).
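For illustration, a rough sketch of what such a MetricGroup could look like (hypothetical, not code from this PR; "Scorer" refers to autosklearn.metrics.Scorer):

```python
from collections.abc import Sequence as SequenceABC
from typing import Optional, Sequence, Union


class MetricGroup:
    """Normalize the "one, many, or no metrics" cases so callers can always iterate."""

    def __init__(self, metric: Optional[Union["Scorer", Sequence["Scorer"]]] = None) -> None:
        if metric is None:
            self.metrics = []            # nothing given; caller falls back to a default
        elif isinstance(metric, SequenceABC):
            self.metrics = list(metric)  # already a sequence of Scorers
        else:
            self.metrics = [metric]      # a single Scorer

    @property
    def is_multi_objective(self) -> bool:
        return len(self.metrics) > 1

    @property
    def primary(self) -> "Scorer":
        # The first metric acts as the tie-breaker, matching the fallback
        # discussed below for _load_best_individual_model().
        return self.metrics[0]
```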
    val_score = metric._optimum - (metric._sign * run_value.cost)
    cost = run_value.cost
    if not isinstance(self._metric, Scorer):
        cost = cost[0]
I assume this is a point of API conflict? It would be good to know about all the metrics for a model, but at the end of the day we currently only support one, so we choose the first?
Yes, it would be good to know about all the metrics. I will look into returning multiple metrics here (should be possible).
Please see my comment wrt this in the PR comment at the top.
Codecov Report

@@             Coverage Diff              @@
##           development     #1455   +/-   ##
=============================================
  Coverage        84.31%    84.32%
=============================================
  Files              147       147
  Lines            11284     11397    +113
  Branches          1934      1986     +52
=============================================
+ Hits              9514      9610     +96
- Misses            1256      1263      +7
- Partials           514       524     +10
Alright, I think the functionality of this PR is complete for now. I will add unit tests after another round of feedback.
Replies to the list of items in the PR first.

- Sounds good, we need to know what to do with metrics in the ensemble first.
- An extended version of the first solution,
- Same answer: without a method to present choices, we have the requirement to select for the user. I would go with the above solution.
- Seems good, we could even extract the metrics name through
- Sure
- In theory there's nothing that prevents us from making a selection at

      [
          ((cost_0, cost_1, ...), ens0),
          ((cost_0, cost_1, ...), ens1),
          ((cost_0, cost_1, ...), ens2),
      ]

  (a selection over such a list is sketched below)

I'll do a review now.
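As a small, hypothetical sketch (none of these names come from the PR), a selection over such a list of ((cost_0, cost_1, ...), ensemble) pairs could simply keep the non-dominated entries, assuming all costs are to be minimized:

```python
from typing import Any, List, Sequence, Tuple

Candidate = Tuple[Sequence[float], Any]  # ((cost_0, cost_1, ...), ensemble)


def pareto_front(candidates: Sequence[Candidate]) -> List[Candidate]:
    """Keep only non-dominated (costs, ensemble) pairs; all costs are minimized."""
    front = []
    for costs, ens in candidates:
        dominated = any(
            all(o <= c for o, c in zip(other, costs))
            and any(o < c for o, c in zip(other, costs))
            for other, _ in candidates
        )
        if not dominated:
            front.append((costs, ens))
    return front


# Toy example with three dummy "ensembles":
candidates = [((0.10, 0.30), "ens0"), ((0.20, 0.10), "ens1"), ((0.25, 0.35), "ens2")]
print(pareto_front(candidates))  # ens2 is dominated by ens0, so only ens0 and ens1 remain
```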
    solution=y_true,
    prediction=y_pred,
    task_type=BINARY_CLASSIFICATION,
    metrics=[autosklearn.metrics.accuracy, autosklearn.metrics.accuracy],
[Non-Blocking, Q]
Using the same metric twice seems like it should raise an error to me. That said, given that no error is raised, the output is still good. Should we keep it as is?
Do you mean we should allow the same metric to be present in both metrics and scoring_functions, but not the same metric twice in one of them?
I guess so, I didn't really think about the same one twice, once in metrics and once in scoring_functions.
What you stated seems reasonable to me, but it's not a cause to block the PR if it's difficult. I can't imagine a scenario where you would want the same metric, e.g. acc, twice in metrics, but I could imagine a scenario where you have acc in metrics and acc in scoring_functions.
The scenario where I see this being used is purely so that getting scores out of autosklearn happens in one place, i.e. you specify acc and metric_z for the optimization with metrics, and you specify acc, balanced_acc and f1_score for the scoring_functions when you later want to evaluate the autosklearn run.
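For illustration, a hypothetical validation helper along these lines (not code from this PR; it assumes a Scorer exposes a name attribute): reject duplicates within metrics while allowing overlap with scoring_functions.

```python
from typing import Sequence


def check_no_duplicate_metrics(metrics: Sequence, scoring_functions: Sequence = ()) -> None:
    """Reject the same metric twice in `metrics`; overlap with `scoring_functions` is allowed."""
    names = [m.name for m in metrics]  # assumes each Scorer exposes a `name` attribute
    duplicates = {n for n in names if names.count(n) > 1}
    if duplicates:
        raise ValueError(
            f"Each metric may only appear once in `metrics`, got duplicates: {sorted(duplicates)}"
        )
    # Overlap between `metrics` and `scoring_functions` is intentionally allowed,
    # e.g. accuracy used for the optimization and also reported for evaluation.
```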
This is the very first implementation of multi-objective optimization in Auto-sklearn, addressing issue #1317. It allows users to pass in a list of metrics instead of a single metric. Auto-sklearn then solves a multi-objective optimization problem using SMAC's new multi-objective capabilities.
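As a minimal usage sketch of the interface described above, assuming the estimator's metric argument accepts a sequence of Scorers as the automl.py diff suggests (this is a draft PR, so the exact behavior may still change; dataset and settings are illustrative only):

```python
import autosklearn.classification
import autosklearn.metrics
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=120,
    # Two objectives: SMAC's multi-objective support optimizes both.
    metric=[autosklearn.metrics.precision, autosklearn.metrics.recall],
)
automl.fit(X_train, y_train)
# Per the limitations listed below, score() currently follows the first metric.
print(automl.score(X_test, y_test))
```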
This PR has several known limitations, some of which can/will be fixed in a future PR, and some for which it is unclear how to approach them in the first place:
- This will be addressed in a future PR.
- _load_best_individual_model() returns the best model according to the first metric: It is unclear how to make this multi-objective, as this is a clear fallback solution. My only two suggestions would be 1) to always use the first metric and document that the first metric is the tie-breaking metric (currently implemented), or 2) to use the Copula mixing approach demonstrated in Figure 6 of http://proceedings.mlr.press/v119/salinas20a/salinas20a.pdf to return a scalarized solution. However, this can lead to unpredictable selections.
- score(): This function suffers from the same issue as load_best_individual_model, as there is no clear definition of which model is the best.
- _get_runhistory_models_performance(): This function suffers from the same issue as load_best_individual_model, as there is no clear definition of which model is the best.
- sprint_statistics(): This function suffers from the same issue as load_best_individual_model, as there is no clear definition of which model is the best.
- refit(): This function suffers from the same issue as load_best_individual_model, as there is no clear definition of which model is the best.
- Updated to follow the output of scikit-learn, see Attributes in here.
- It is so far completely unclear how to achieve this. Picking a model from the Pareto front would make Auto-sklearn non-interactive. Therefore, it might be a good idea to add a function to return all ensembles on the Pareto front as "raw" ensemble objects that can be further used by the user, or to load different models one after the other as the main model.
TODOs: