Add Tag-bench in agent_eval #230

Merged
lvliang-intel merged 12 commits into opea-project:main from minmin-intel:tag-bench-dev
Apr 3, 2025
@minmin-intel
Collaborator

Description

  1. Add TAG-bench for evaluating SQL agents.
  2. Reorganize the agent_eval folder now that it contains two benchmarks: extract common elements, such as how to launch vllm-gaudi, and add a new README.
  3. Update crag_eval to match the recent changes in the OPEA agent code.

Issues

#224

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)

Dependencies

No new dependencies.

Tests

All commands in readme were tested locally.

Signed-off-by: minmin-intel <minmin.hou@intel.com>
Collaborator

@lkk12014402 left a comment

LGTM

@lvliang-intel lvliang-intel merged commit 69e018a into opea-project:main Apr 3, 2025

Development

Successfully merging this pull request may close these issues.

[Feature] add TAG-bench for agent_eval, publish TAG-bench results for sql agent llama

3 participants