Make the metric call function async #758

ZiTao-Li · 2025-09-12T04:16:57Z

AgentScope Version

1.0.2

Description

Change the __call__ function of metrics to an async function to support more general usage (e.g., LLM-as-a-judge metrics).

Update related modules
Update tutorial
Add a test for the evaluators in the evaluate module
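
To illustrate the motivation, here is a minimal sketch of what an async metric could look like. It is not the AgentScope implementation: `MetricSketch`, `JudgeMetric`, and `fake_judge` are hypothetical names, and the judge is a stub standing in for a real asynchronous LLM request.

```python
import asyncio
from abc import ABC, abstractmethod


class MetricSketch(ABC):
    """Hypothetical stand-in for a metric base class (not the AgentScope API)."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    async def __call__(self, solution: str) -> float:
        """Score a solution; subclasses may await I/O here."""


async def fake_judge(prompt: str) -> str:
    """Stub for an async LLM request; a real judge would call a model API."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return "1.0" if "4" in prompt else "0.0"


class JudgeMetric(MetricSketch):
    """LLM-as-a-judge style metric that awaits the (stubbed) judge."""

    async def __call__(self, solution: str) -> float:
        verdict = await fake_judge(f"Is this answer correct? {solution}")
        return float(verdict)


async def main() -> float:
    metric = JudgeMetric("judge")
    # Calling the instance returns a coroutine, so the caller must await it.
    return await metric(solution="2 + 2 = 4")


score = asyncio.run(main())
```

With a synchronous `__call__`, the metric would have to block on the judge request or manage its own event loop; making `__call__` a coroutine lets the caller decide how to schedule it.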

Checklist

Please check the following items before code is ready to be reviewed.

Code has been formatted with the pre-commit run --all-files command
All tests are passing
Docstrings are in Google style
Related documentation has been updated (e.g. links, examples, etc.)
Code is ready for review

DavdGao

Please see the inline comments; everything else looks good to me.

src/agentscope/evaluate/_evaluator/_ray_evaluator.py

Copilot

Pull Request Overview

This PR changes the metric __call__ function from synchronous to asynchronous to support more general usage patterns, particularly LLM-as-a-judge metrics that require async operations.

Key changes:

Made the MetricBase.__call__ method abstract and async
Updated all evaluator classes to handle async metric calls
Added comprehensive test coverage for the evaluation module
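
As a rough illustration of what "handling async metric calls" can mean on the evaluator side, the sketch below awaits several metrics concurrently with `asyncio.gather`. The `evaluate` function and both metrics are hypothetical and are not taken from the AgentScope evaluators.

```python
import asyncio
from typing import Awaitable, Callable, Dict

# An async metric here is any callable that returns an awaitable score.
AsyncMetric = Callable[[str], Awaitable[float]]


async def exact_match(solution: str) -> float:
    """Hypothetical metric: full score only for the expected answer."""
    return 1.0 if solution.strip() == "42" else 0.0


async def length_penalty(solution: str) -> float:
    """Hypothetical metric: penalize answers longer than 10 characters."""
    await asyncio.sleep(0)  # stands in for awaited I/O
    return 1.0 if len(solution) <= 10 else 0.5


async def evaluate(
    solution: str,
    metrics: Dict[str, AsyncMetric],
) -> Dict[str, float]:
    """Run all metrics concurrently and map each name to its score."""
    scores = await asyncio.gather(*(m(solution) for m in metrics.values()))
    return dict(zip(metrics.keys(), scores))


results = asyncio.run(
    evaluate(
        "42",
        {"exact_match": exact_match, "length_penalty": length_penalty},
    ),
)
```

`asyncio.gather` returns results in the order the awaitables were passed, which is why zipping the scores back onto the metric names is safe.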

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| tests/evaluation_test.py | Added comprehensive test suite for evaluators with async metric support |
| src/agentscope/evaluate/_task.py | Made task evaluation method async to support async metric calls |
| src/agentscope/evaluate/_metric_base.py | Changed abstract metric `__call__` method to async |
| src/agentscope/evaluate/_evaluator/_ray_evaluator.py | Refactored Ray evaluator to use async actors and proper async handling |
| src/agentscope/evaluate/_evaluator/_general_evaluator.py | Updated general evaluator to handle async metric evaluation |
| src/agentscope/evaluate/_ace_benchmark/_ace_metric.py | Made ACE benchmark metrics async-compatible |
| docs/tutorial/zh_CN/src/task_eval.py | Updated Chinese tutorial example with async metric and corrected name |
| docs/tutorial/en/src/task_eval.py | Updated English tutorial example with async metric and corrected name |
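
For readers wondering how tests for async metrics can be written, one option in the standard library is `unittest.IsolatedAsyncioTestCase`, which runs `async def` test methods on an event loop. The sketch below is only an assumption about the general pattern; it does not reproduce tests/evaluation_test.py, and `always_one` is a hypothetical metric.

```python
import unittest


async def always_one(solution: str) -> float:
    """Hypothetical async metric used only for this test sketch."""
    return 1.0


class AsyncMetricTest(unittest.IsolatedAsyncioTestCase):
    """Test case whose async test methods are awaited by the test runner."""

    async def test_metric_is_awaitable(self) -> None:
        score = await always_one("any solution")
        self.assertEqual(score, 1.0)


# Load and run the test case programmatically so the sketch is self-contained.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AsyncMetricTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```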

Comments suppressed due to low confidence (1)

src/agentscope/evaluate/_evaluator/_ray_evaluator.py:1

Using __file__ in py_modules for Ray runtime_env is incorrect. __file__ refers to the current Python file being executed, but py_modules expects module names or paths to modules that should be made available to Ray workers. This will likely cause import errors in Ray workers.

# -*- coding: utf-8 -*-

DavdGao

LGTM

ZiTao-Li added 3 commits September 11, 2025 15:47

let metric call function be async

d6af053

update

3e6244a

Merge branch 'refs/heads/main' into zitao/async_run_eval

24b0929

ZiTao-Li requested a review from DavdGao September 12, 2025 04:22

ZiTao-Li added the Ready for Review and Evaluation labels Sep 12, 2025

DavdGao reviewed Sep 15, 2025

src/agentscope/evaluate/_evaluator/_ray_evaluator.py (resolved)

DavdGao requested a review from Copilot September 15, 2025 01:47

Copilot AI reviewed Sep 15, 2025

DavdGao approved these changes Sep 16, 2025

DavdGao merged commit 811fb28 into agentscope-ai:main Sep 16, 2025
10 checks passed
