Conversation

@albertvillanova
Member

Currently, multilabel metrics are not supported because predictions and references are defined as Value("int32").

This PR creates a new feature type OptionalSequence which can act as either Value("int32") or Sequence(Value("int32")), depending on the data passed.

Close #2554.
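For illustration, this is roughly how a metric could declare its features with the new type, so that either a single label or a list of labels is accepted per example. This is only a sketch based on the description above, not the final implementation:

import datasets as ds

# Today, classification metrics declare scalar labels, e.g. Value("int32").
# With OptionalSequence, the same feature would also accept lists of labels.
# (Sketch only: behavior follows the PR description, not the merged code.)
features = ds.Features({
    "predictions": ds.features.OptionalSequence(ds.Value("int32")),
    "references": ds.features.OptionalSequence(ds.Value("int32")),
})

# Single-label input: one int per example.
single_label = {"predictions": [0, 1], "references": [0, 1]}
# Multilabel input: one list of ints per example.
multi_label = {"predictions": [[0, 1], [1, 0]], "references": [[0, 1], [0, 0]]}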

@albertvillanova albertvillanova marked this pull request as draft July 5, 2021 08:30
@albertvillanova albertvillanova marked this pull request as ready for review July 5, 2021 09:29
@lhoestq
Member

lhoestq commented Jul 6, 2021

Hi! Thanks for the fix :)

If I understand correctly, unlike the other feature types, OptionalSequence doesn't have an associated Arrow type that we know in advance, because it depends on the type of the examples.

For example, I tested this and it raises an error:

import datasets as ds
import pyarrow as pa

features = ds.Features({"a": ds.features.OptionalSequence(ds.Value("int32"))})
batch = {"a": [[0]]}

writer = ds.ArrowWriter(features=features, stream=pa.BufferOutputStream())
writer.write_batch(batch)
# ArrowInvalid: Could not convert [0] with type list: tried to convert to int

This error happens because features.type is StructType(struct<a: int32>).

Another way to add support for multilabel would be to have several configurations for these metrics: by default it would set the features without sequences, and for the multilabel configuration it would use features with sequences. Let me know what you think.
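As a rough sketch of that idea (names and structure are illustrative, not the merged implementation), a metric's _info() could switch its features on the config name:

import datasets

class F1(datasets.Metric):
    def _info(self):
        # "multilabel" config: labels are sequences of ints per example.
        # Default config: labels are single ints per example.
        if self.config_name == "multilabel":
            features = datasets.Features({
                "predictions": datasets.Sequence(datasets.Value("int32")),
                "references": datasets.Sequence(datasets.Value("int32")),
            })
        else:
            features = datasets.Features({
                "predictions": datasets.Value("int32"),
                "references": datasets.Value("int32"),
            })
        return datasets.MetricInfo(
            description="F1 (sketch)",
            citation="",
            inputs_description="",
            features=features,
        )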

@albertvillanova
Member Author

Hi @lhoestq, thanks for your feedback :)

Definitely, your suggested approach is simpler. I am going to refactor my PR accordingly, unless we can envision some other use case where an OptionalSequence might be convenient, but for now I can't think of any...


@lhoestq lhoestq left a comment


Great, thanks :)

@lhoestq lhoestq merged commit 8dcf377 into huggingface:master Jul 8, 2021
@albertvillanova albertvillanova added this to the 1.10 milestone Jul 12, 2021
@fcakyon

fcakyon commented Jul 29, 2022

@albertvillanova @lhoestq I couldn't find the related docs in the F1 metric card: https://huggingface.co/spaces/evaluate-metric/f1

How do I perform multilabel F1 evaluation using the evaluate package?

@albertvillanova
Member Author

I was going to transfer your question to the evaluate GitHub repository, but I saw you have already done it (and even opened a PR).

Thanks, @fcakyon.

@fcakyon

fcakyon commented Jul 29, 2022

Sorry to bomb you on multiple channels 😅 @albertvillanova, I have solved my problems, and opened a PR so that others also don't get confused 👍
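For anyone landing here with the same question: with the configuration-based approach merged in this PR, multilabel F1 is selected via a config name. Whether the evaluate package exposes the exact same "multilabel" config is an assumption here, so treat this as a sketch and check the metric card and the PR mentioned above for the documented usage:

import evaluate

# Assumption: the "multilabel" config from the datasets metrics is also
# available through evaluate.load; verify against the F1 metric card.
f1 = evaluate.load("f1", "multilabel")

results = f1.compute(
    predictions=[[0, 1, 1], [1, 1, 0]],
    references=[[0, 1, 1], [0, 1, 0]],
    average="macro",  # sklearn-style averaging over labels
)
print(results)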
