Support HELMET by howard-yen · Pull Request #182 · opea-project/GenAIEval

howard-yen · 2024-10-29T15:16:14Z

Description

Added HELMET to the evaluation suite. HELMET is a comprehensive benchmark for evaluating long-context language models on application-centric tasks. It covers seven categories: Retrieval-augmented Generation, Passage Re-ranking, Generation with Citations, In-context Learning (ICL), Long-document QA, Summarization, and Synthetic Recall. The datasets support controllable input lengths and robust evaluation.

Details and instructions for running the benchmark are in HELMET/README.md.

Paper link.

Issues

n/a

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds new functionality)
Breaking change (fix or feature that would break existing design and interface)

Dependencies

n/a

Tests

To test the HELMET benchmark, we ran python eval.py --config configs/icl.yaml in the HELMET directory.
We also ran the whole suite with bash scripts/run_eval.sh.
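For anyone who wants to script the single-config test, here is a minimal sketch of how the command above could be assembled and launched from Python. The helper names `build_eval_command` and `run_eval` are hypothetical and not part of HELMET; only `eval.py`, the `--config` flag, `configs/icl.yaml`, and the HELMET directory come from this PR description.

```python
import shlex
import subprocess


def build_eval_command(config_path, python_bin="python"):
    """Return the argv list for one evaluation run (hypothetical helper)."""
    # Mirrors the command quoted in the PR: python eval.py --config <config>
    return [python_bin, "eval.py", "--config", config_path]


def run_eval(config_path, helmet_dir="HELMET"):
    """Launch one evaluation run from inside the HELMET directory.

    Assumes eval.py lives in helmet_dir, as the PR description states.
    Raises subprocess.CalledProcessError if the run exits non-zero.
    """
    return subprocess.run(build_eval_command(config_path), cwd=helmet_dir, check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to execute anywhere.
    print(shlex.join(build_eval_command("configs/icl.yaml")))
```

Passing the command as a list avoids shell quoting issues, and `check=True` surfaces a failed run as an exception rather than a silently ignored exit code.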

minmin-intel

LGTM

lkk12014402 · 2024-11-01T07:24:25Z

please fix the pre-commit.ci issues

* add longbench
  Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
* refine readme
  Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci

---------

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Howard Yen <hyen@princeton.edu>

Support HELMET

41d4d92

howard-yen requested a review from lkk12014402 as a code owner October 29, 2024 15:16

[pre-commit.ci] auto fixes from pre-commit.com hooks

66269c7

for more information, see https://pre-commit.ci

minmin-intel approved these changes Oct 29, 2024

lkk12014402 approved these changes Nov 1, 2024

howard-yen and others added 13 commits November 1, 2024 13:50

fix error messages

e7e5ec9

Signed-off-by: Howard Yen <hyen@princeton.edu>

merge

ccc75c7

Signed-off-by: Howard Yen <hyen@princeton.edu>

[pre-commit.ci] auto fixes from pre-commit.com hooks

19e84e2

for more information, see https://pre-commit.ci

Support HELMET

a633a12

Signed-off-by: Howard Yen <hyen@princeton.edu>

fix error messages

6f263fe

Signed-off-by: Howard Yen <hyen@princeton.edu>

[pre-commit.ci] auto fixes from pre-commit.com hooks

c077f69

for more information, see https://pre-commit.ci

merge utils

db0fcbc

Signed-off-by: Howard Yen <hyen@princeton.edu>

update alce

ad8f381

[pre-commit.ci] auto fixes from pre-commit.com hooks

14b0fe0

for more information, see https://pre-commit.ci

update spelling

c985e4a

update spelling

0e81b47

Signed-off-by: Howard Yen <hyen@princeton.edu>

update

ee6462b

Signed-off-by: Howard Yen <hyen@princeton.edu>

minmin-intel merged commit 4c8f048 into opea-project:main Nov 1, 2024

joshuayao added this to the v1.1 milestone Nov 7, 2024

joshuayao added the r1.1 v1.1 release label Nov 11, 2024

ashahba mentioned this pull request Nov 22, 2024

v1.1 release notes opea-project/docs#257

Merged

minmin-intel merged 15 commits into opea-project:main from howard-yen:main

5 participants
