Add KLUE-MRC metrics #6002

ingyuseong · 2023-07-03T12:11:10Z

Metrics for KLUE-MRC (Korean Language Understanding Evaluation — Machine Reading Comprehension)

This PR adds metrics for KLUE-MRC.
KLUE-MRC is very similar to SQuAD 2.0, but its format differs slightly, so it needs its own metrics.

Specifically, LM Eval Harness uses the SQuAD scoring script to evaluate SQuAD 2.0 and KorQuAD. That script isn't suitable for KLUE-MRC because of the format differences, so I added a dedicated scoring script for KLUE-MRC.
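
For readers unfamiliar with the two metrics, here is a minimal sketch of SQuAD 2.0-style exact match (EM) and token-level F1, including the convention that an empty gold list means the question is unanswerable. It is not the script from this PR: the function names (`normalize`, `exact_match`, `f1`, `score`), the whitespace tokenization, and the simplified normalization are all assumptions made for illustration, and the KLUE-MRC-specific format handling is not reproduced because it is not shown in this thread.

```python
import string
from collections import Counter

_PUNCT = set(string.punctuation)


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace (simplified)."""
    text = "".join(ch for ch in text.lower() if ch not in _PUNCT)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> int:
    """1 if the normalized strings are identical, else 0."""
    return int(normalize(prediction) == normalize(gold))


def f1(prediction: str, gold: str) -> float:
    """Token-level F1 over whitespace tokens (an assumption; Korean text
    may call for character- or morpheme-level tokens instead)."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    # If either side is empty, the score is 1.0 only when both are empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def score(prediction: str, golds: list[str]) -> dict:
    """Best EM and F1 over all gold answers; an empty gold list is treated
    as an unanswerable question whose correct prediction is ""."""
    golds = golds or [""]
    return {
        "exact_match": max(exact_match(prediction, g) for g in golds),
        "f1": max(f1(prediction, g) for g in golds),
    }
```

Taking the maximum over gold answers mirrors how SQuAD-style scoring handles questions with several acceptable answers. For example, `score("서울 특별시", ["서울", "서울 특별시"])` scores against each gold answer and keeps the best result.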

All tests passed
Added a metric card (based on the metric card for SQuAD 2.0)
Compatibility test with LM Eval Harness passed

mariosasko · 2023-07-03T15:34:17Z

The metrics API in datasets is deprecated as of version 2.0, and evaluate is our new library for metrics. You can add a new metric to it by following these steps.

ingyuseong added 4 commits July 2, 2023 15:59

Add metrics for KLUE-MRC (F1 and EM)

5938f5f

Add README.md for KLUE-MRC metric

4a696a0

Update metrics for KLUE-MRC

3533bb7

Update README.md for KLUE-MRC metric

91a293d

ingyuseong closed this Jul 9, 2023
