[`Trainer`] does not automatically evaluate model performance during training. You'll need to pass [`Trainer`] a function to compute and report metrics. The [🤗 Evaluate](https://huggingface.co/docs/evaluate/index) library provides a simple [`accuracy`](https://huggingface.co/spaces/evaluate-metric/accuracy) function you can load with [`evaluate.load`] (see this [quicktour](https://huggingface.co/docs/evaluate/a_quick_tour) for more information):
```py
>>> import numpy as np
>>> import evaluate
>>> metric = evaluate.load("accuracy")
```
Call [`~evaluate.compute`] on `metric` to calculate the accuracy of your predictions. Before passing your predictions to `compute`, you need to convert the logits to predictions (remember all 🤗 Transformers models return logits):
```py
>>> def compute_metrics(eval_pred):
...     logits, labels = eval_pred
...     predictions = np.argmax(logits, axis=-1)
...     return metric.compute(predictions=predictions, references=labels)
```
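To wire this function into training, pass it to [`Trainer`] through its `compute_metrics` parameter. A minimal sketch, assuming `training_args`, `small_train_dataset`, and `small_eval_dataset` were defined earlier in the tutorial:

```py
>>> from transformers import Trainer

>>> trainer = Trainer(
...     model=model,
...     args=training_args,  # a TrainingArguments instance (assumed defined earlier)
...     train_dataset=small_train_dataset,  # assumed tokenized train/eval splits
...     eval_dataset=small_eval_dataset,
...     compute_metrics=compute_metrics,  # called on each evaluation pass
... )
```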
To keep track of your training progress, use the [tqdm](https://tqdm.github.io/) library to add a progress bar over the number of training steps:

```py
...         progress_bar.update(1)
```
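For orientation, here is a minimal sketch of the manual training loop that fragment belongs to, assuming the `train_dataloader`, `num_epochs`, `num_training_steps`, `optimizer`, and `lr_scheduler` objects set up earlier in the tutorial:

```py
>>> from tqdm.auto import tqdm

>>> progress_bar = tqdm(range(num_training_steps))

>>> model.train()
>>> for epoch in range(num_epochs):
...     for batch in train_dataloader:
...         batch = {k: v.to(device) for k, v in batch.items()}
...         outputs = model(**batch)
...         loss = outputs.loss
...         loss.backward()
...         optimizer.step()
...         lr_scheduler.step()
...         optimizer.zero_grad()
...         progress_bar.update(1)  # advance the bar once per optimization step
```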
### Evaluate
Just as you added an evaluation function to [`Trainer`], you need to do the same when you write your own training loop. But instead of calculating and reporting the metric at the end of each epoch, this time you'll accumulate all the batches with [`~evaluate.add_batch`] and calculate the metric at the very end.
```py
>>> import evaluate

>>> metric = evaluate.load("accuracy")
>>> model.eval()
>>> for batch in eval_dataloader:
...     batch = {k: v.to(device) for k, v in batch.items()}
...     with torch.no_grad():
...         outputs = model(**batch)

...     logits = outputs.logits
...     predictions = torch.argmax(logits, dim=-1)
...     metric.add_batch(predictions=predictions, references=batch["labels"])

>>> metric.compute()
```
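[`~evaluate.compute`] returns a dictionary keyed by metric name, so the final call yields something like `{'accuracy': 0.83}` (the exact value shown here is illustrative).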