
Conversation


@KEggensperger KEggensperger commented May 13, 2022

This PR enables ASKL to work with a wider range of metrics and scorers. Concretely, it does the following:

  • Allows a metric to require xdata
  • Stores xdata and passes it to the metric if required
  • Updates the automl-common backend to store and load xdata
  • Fixes some typos in the contribution guide
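The shape of such a metric can be sketched with plain NumPy (a minimal sketch: the function name and toy data are hypothetical, and the wrapping call mentioned in the final comment is an assumption based on this PR, not verified API):

```python
import numpy as np

def error_with_x(solution, prediction, x_data):
    # Hypothetical metric: a plain error rate that can also inspect the
    # feature matrix (x_data) behind the predictions.
    assert x_data is not None and len(x_data) == len(solution)
    return float(np.mean(solution != prediction))

# Toy data standing in for a validation split.
X = np.array([[0.0, 1.0], [2.0, np.nan], [3.0, 4.0], [np.nan, 5.0]])
y_true = np.array([0, 1, 1, 0])
y_pred = np.array([0, 1, 0, 0])

print(error_with_x(y_true, y_pred, X))  # 0.25

# In auto-sklearn such a function would then be wrapped with
# autosklearn.metrics.make_scorer so the evaluator forwards the stored
# xdata (the exact flag name for opting in is assumed from this PR).
```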


codecov bot commented May 13, 2022

Codecov Report

Merging #1475 (26ff4e5) into development (becbd07) will decrease coverage by 0.05%.
The diff coverage is 69.56%.

@@               Coverage Diff               @@
##           development    #1475      +/-   ##
===============================================
- Coverage        84.33%   84.27%   -0.06%     
===============================================
  Files              151      151              
  Lines            11437    11488      +51     
  Branches          1984     1994      +10     
===============================================
+ Hits              9645     9682      +37     
- Misses            1265     1275      +10     
- Partials           527      531       +4     


Contributor

@mfeurer mfeurer left a comment

This looks good in general, but it would be great if it were accompanied by some unit tests, for example for the train evaluator and the metrics module.

def make_scorer(
    name: str,
    score_func: Callable,
    *,
Contributor

Nice

def metric_which_needs_x(solution, prediction, x_data, extra_argument):
    # custom function defining accuracy
    assert x_data is not None
    return np.mean(solution[extra_argument, :] == prediction[extra_argument, :])
Contributor

Can we illustrate the example with some concrete use of x_data, e.g. making rows with NaNs count twice as much?
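The suggested weighting can be sketched as follows (a sketch of the reviewer's idea only; the function name is hypothetical, and this is not necessarily the metric that landed in the example):

```python
import numpy as np

def nan_weighted_accuracy(solution, prediction, x_data):
    # Rows whose features contain at least one NaN get weight 2,
    # all other rows weight 1.
    weights = np.where(np.isnan(x_data).any(axis=1), 2.0, 1.0)
    correct = (solution == prediction).astype(float)
    return float(np.sum(weights * correct) / np.sum(weights))

X = np.array([[1.0, 2.0], [np.nan, 0.0], [3.0, 4.0]])
y_true = np.array([1, 0, 1])
y_pred = np.array([1, 0, 0])

# Row 1 (the NaN row) is correct and counts twice:
# (1*1 + 2*1 + 1*0) / (1 + 2 + 1) = 0.75
print(nan_weighted_accuracy(y_true, y_pred, X))  # 0.75
```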

Contributor Author

I updated the example with a more meaningful metric.

Contributor

@mfeurer mfeurer left a comment

This all looks good; it would be great to also have a new unit test in test_train_evaluator.py that ensures the data is correctly written to disk, though.
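The core of the requested check is a round trip of xdata through the backend. A minimal standalone sketch (the pickle format and file name are assumptions for illustration, not auto-sklearn's actual backend layout):

```python
import os
import pickle
import tempfile

import numpy as np

# xdata with a NaN, since missing values must survive the round trip.
x_data = np.array([[1.0, np.nan], [2.0, 3.0]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "x_data.pkl")  # hypothetical file name
    with open(path, "wb") as fh:
        pickle.dump(x_data, fh)
    with open(path, "rb") as fh:
        loaded = pickle.load(fh)

# assert_array_equal treats NaN == NaN as equal, so this verifies
# the array was written and read back unchanged.
np.testing.assert_array_equal(x_data, loaded)
print("round-trip ok")
```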

@eddiebergman
Contributor

I will run the benchmark on this, it's hard to implement this feature into the ensemble_builder until this is merged into dev.

@eddiebergman eddiebergman merged commit 4b21134 into development May 24, 2022
github-actions bot pushed a commit that referenced this pull request May 24, 2022
eddiebergman pushed a commit that referenced this pull request Aug 18, 2022
* FIX minor

* update submodule

* update submodule

* ADD pass xdata to metric

* update submodule

* Fix tests

* update submodule

* ADD example

* UPDATE example

* ADD extra method to concat data

* RM note

* Fix minor

* ADD more types for xdata

* FIX unittests

* FIX variable naming bug

* change varible name; fix docstring

* Rename variable

* FIX example

* Update examples/40_advanced/example_metrics.py

Co-authored-by: Matthias Feurer <[email protected]>



Development

Successfully merging this pull request may close these issues.

Custom Metric---passing independent variable during cross validation

4 participants