Docs for EvaluationSuite #340
Conversation
The documentation is not available anymore as the PR was closed or merged.
Force-pushed from e64cd83 to f8da2a6
lhoestq left a comment
Cool, thanks! I think you could also mention it in the quick tour :)
docs/source/evaluation_suite.mdx (Outdated)

```python
{'glue/cola': {'accuracy': 0.0, 'total_time_in_seconds': 0.9766696180449799, 'samples_per_second': 10.238876909079256, 'latency_in_seconds': 0.09766696180449798},
 'glue/sst2': {'accuracy': 0.5, 'total_time_in_seconds': 1.1422595420153812, 'samples_per_second': 8.754577775166744, 'latency_in_seconds': 0.11422595420153811},
 'glue/qqp': {'accuracy': 0.6, 'total_time_in_seconds': 1.3553926559980027, 'samples_per_second': 7.377935800188323, 'latency_in_seconds': 0.13553926559980026},
 'glue/mrpc': {'accuracy': 0.6, 'total_time_in_seconds': 2.021696529001929, 'samples_per_second': 4.946340786832532, 'latency_in_seconds': 0.2021696529001929},
 'glue/mnli': {'accuracy': 0.2, 'total_time_in_seconds': 2.0380110969999805, 'samples_per_second': 4.9067446270142145, 'latency_in_seconds': 0.20380110969999807},
 'glue/qnli': {'accuracy': 0.3, 'total_time_in_seconds': 2.082032073987648, 'samples_per_second': 4.802999975330509, 'latency_in_seconds': 0.20820320739876477},
 'glue/rte': {'accuracy': 0.7, 'total_time_in_seconds': 2.8592985830036923, 'samples_per_second': 3.4973612267855576, 'latency_in_seconds': 0.2859298583003692},
 'glue/wnli': {'accuracy': 0.5, 'total_time_in_seconds': 1.5406486629508436, 'samples_per_second': 6.490772517107661, 'latency_in_seconds': 0.15406486629508437}}
```
(nit) Would be nice to show it as a pandas DataFrame for readability
Good call, the result is now a list of dicts so it can be easily transformed into a dataframe. I've added that to the example 😄
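The conversion being discussed is direct, since a list of dicts is exactly the shape the `pandas.DataFrame` constructor accepts. A minimal sketch, with illustrative result values rather than numbers from a real run:

```python
import pandas as pd

# Illustrative results in the list-of-dicts shape discussed above:
# one dict per sub-task, with the metric and timing fields.
results = [
    {"task_name": "glue/cola", "accuracy": 0.0, "latency_in_seconds": 0.098},
    {"task_name": "glue/sst2", "accuracy": 0.5, "latency_in_seconds": 0.114},
    {"task_name": "glue/rte", "accuracy": 0.7, "latency_in_seconds": 0.286},
]

# Each dict becomes a row and each key becomes a column, so the
# suite's output renders as a readable table in docs and notebooks.
df = pd.DataFrame(results)
print(df.to_string(index=False))
```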
```python
self.preprocessor = lambda x: {"text": x["text"].lower()}
self.suite = [
    SubTask(
        task_type="text-classification",
```
Can you list the available task types, maybe? Or redirect to their docs?
I've added a link to the supported tasks on the Evaluator docs so we don't have to maintain the list in two places!
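For context, the truncated snippet above comes from a class-based suite definition. A fuller sketch, assuming the `evaluate` library's `EvaluationSuite`/`SubTask` pattern with illustrative dataset, subset, and column choices (not the exact contents of this PR):

```python
import evaluate
from evaluate.evaluation_suite import SubTask


class Suite(evaluate.EvaluationSuite):
    def __init__(self, name):
        super().__init__(name)
        # Preprocessing applied to every example before evaluation.
        self.preprocessor = lambda x: {"text": x["text"].lower()}
        # One SubTask per dataset; task_type must be one of the task
        # types supported by the Evaluator (see the Evaluator docs).
        self.suite = [
            SubTask(
                task_type="text-classification",
                data="glue",              # illustrative dataset name
                subset="sst2",            # illustrative subset
                split="validation[:10]",  # small slice for a quick run
                args_for_task={
                    "metric": "accuracy",
                    "input_column": "sentence",
                    "label_column": "label",
                },
            )
        ]
```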
lvwerra left a comment
Hi @mathemakitten, this is great, thanks for working on it. I left a few comments; happy to discuss further if you want.
Co-authored-by: Leandro von Werra <[email protected]>
Force-pushed from 0527ff4 to 7da77ed
lvwerra left a comment
Just a few minor comments, then we can merge 🚀
```python
>>> suite = EvaluationSuite.load('mathemakitten/glue-evaluation-suite')
>>> results = suite.run("gpt2")
| accuracy | total_time_in_seconds | samples_per_second | latency_in_seconds | task_name |
```
Would do the same here and remove the table from the codeblock so it's actually rendered as a nice table.
Co-authored-by: Leandro von Werra <[email protected]>
Adding docs for EvaluationSuite.