
Conversation

@ingyuseong

Metrics for KLUE-MRC (Korean Language Understanding Evaluation — Machine Reading Comprehension)

Adding metrics for KLUE-MRC.
KLUE-MRC is very similar to SQuAD 2.0 but uses a slightly different format, which is why I added dedicated metrics for it.

Specifically, LM Eval Harness reuses the SQuAD scoring script to evaluate both SQuAD 2.0 and KorQuAD. That script isn't suitable for KLUE-MRC, however, because KLUE-MRC's format differs slightly from SQuAD 2.0's. This is why I added a separate scoring script for KLUE-MRC.
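This is not the exact script from the PR, but a minimal sketch of SQuAD-style EM/F1 scoring illustrating the kind of adaptation involved: the official SQuAD normalization strips English articles ("a", "an", "the"), which has no counterpart in Korean, so a KLUE-MRC variant would plausibly drop that step. Function names and the exact normalization are assumptions.

```python
import string
from collections import Counter


def normalize_answer(s: str) -> str:
    """Lower-case, strip ASCII punctuation, and collapse whitespace.

    Unlike the SQuAD script, this does NOT remove English articles
    ("a", "an", "the"), since they have no Korean counterpart.
    Stripping only ASCII punctuation is a simplification here.
    """
    s = s.lower()
    s = "".join(ch for ch in s if ch not in set(string.punctuation))
    return " ".join(s.split())


def exact_match(prediction: str, gold: str) -> int:
    """1 if the normalized strings match exactly, else 0."""
    return int(normalize_answer(prediction) == normalize_answer(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Token-overlap F1 between a prediction and one gold answer."""
    pred_toks = normalize_answer(prediction).split()
    gold_toks = normalize_answer(gold).split()
    # Unanswerable case: both empty counts as a match.
    if not pred_toks or not gold_toks:
        return float(pred_toks == gold_toks)
    common = Counter(pred_toks) & Counter(gold_toks)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_toks)
    recall = num_same / len(gold_toks)
    return 2 * precision * recall / (precision + recall)
```

As in the SQuAD scripts, a full evaluation would take the max EM/F1 over all gold answers per question and average over the dataset.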

  • All tests passed
  • Added a metric card (based on the SQuAD 2.0 metric card)
  • Compatibility test with LM Eval Harness passed

@mariosasko
Collaborator

The metrics API in datasets is deprecated as of version 2.0, and evaluate is our new library for metrics. You can add a new metric to it by following these steps.

@ingyuseong ingyuseong closed this Jul 9, 2023
