
Commit fdd9c95

stevhliu authored and amyeroberts committed
📝 update metric with evaluate (huggingface#18535)
1 parent a25b1b3 · commit fdd9c95

1 file changed: +10 -8 lines changed

docs/source/en/training.mdx

Lines changed: 10 additions & 8 deletions
@@ -98,18 +98,18 @@ Specify where to save the checkpoints from your training:
 >>> training_args = TrainingArguments(output_dir="test_trainer")
 ```
 
-### Metrics
+### Evaluate
 
-[`Trainer`] does not automatically evaluate model performance during training. You will need to pass [`Trainer`] a function to compute and report metrics. The 🤗 Datasets library provides a simple [`accuracy`](https://huggingface.co/metrics/accuracy) function you can load with `load_metric` (see this [tutorial](https://huggingface.co/docs/datasets/metrics.html) for more information):
+[`Trainer`] does not automatically evaluate model performance during training. You'll need to pass [`Trainer`] a function to compute and report metrics. The [🤗 Evaluate](https://huggingface.co/docs/evaluate/index) library provides a simple [`accuracy`](https://huggingface.co/spaces/evaluate-metric/accuracy) function you can load with [`evaluate.load`] (see this [quicktour](https://huggingface.co/docs/evaluate/a_quick_tour) for more information):
 
 ```py
 >>> import numpy as np
->>> from datasets import load_metric
+>>> import evaluate
 
->>> metric = load_metric("accuracy")
+>>> metric = evaluate.load("accuracy")
 ```
 
-Call `compute` on `metric` to calculate the accuracy of your predictions. Before passing your predictions to `compute`, you need to convert the logits to predictions (remember all 🤗 Transformers models return logits):
+Call [`~evaluate.compute`] on `metric` to calculate the accuracy of your predictions. Before passing your predictions to `compute`, you need to convert the logits to predictions (remember all 🤗 Transformers models return logits):
 
 ```py
 >>> def compute_metrics(eval_pred):
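The hunk above ends at the `compute_metrics` signature; the function body is outside the diff context. As a rough, illustrative sketch (not a quote of `training.mdx`) of how such a function typically looks with 🤗 Evaluate, converting logits to class predictions before calling `compute`:

```py
>>> import numpy as np
>>> import evaluate

>>> metric = evaluate.load("accuracy")

>>> def compute_metrics(eval_pred):
...     # eval_pred is a (logits, labels) pair produced during evaluation
...     logits, labels = eval_pred
...     # models return logits, so take the argmax to get class predictions
...     predictions = np.argmax(logits, axis=-1)
...     return metric.compute(predictions=predictions, references=labels)
```

A function like this is then handed to [`Trainer`] through its `compute_metrics` argument so it runs at each evaluation.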
@@ -341,12 +341,14 @@ To keep track of your training progress, use the [tqdm](https://tqdm.github.io/)
 ...     progress_bar.update(1)
 ```
 
-### Metrics
+### Evaluate
 
-Just like how you need to add an evaluation function to [`Trainer`], you need to do the same when you write your own training loop. But instead of calculating and reporting the metric at the end of each epoch, this time you will accumulate all the batches with [`add_batch`](https://huggingface.co/docs/datasets/package_reference/main_classes.html?highlight=add_batch#datasets.Metric.add_batch) and calculate the metric at the very end.
+Just like how you added an evaluation function to [`Trainer`], you need to do the same when you write your own training loop. But instead of calculating and reporting the metric at the end of each epoch, this time you'll accumulate all the batches with [`~evaluate.add_batch`] and calculate the metric at the very end.
 
 ```py
->>> metric = load_metric("accuracy")
+>>> import evaluate
+
+>>> metric = evaluate.load("accuracy")
 >>> model.eval()
 >>> for batch in eval_dataloader:
 ...     batch = {k: v.to(device) for k, v in batch.items()}
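This second hunk also ends partway into the evaluation loop. For orientation, here is a minimal sketch of the accumulate-then-compute pattern the new text describes, assuming `model`, `eval_dataloader`, and `device` are defined earlier in the guide; the exact loop body in `training.mdx` is not shown in this diff:

```py
>>> import evaluate
>>> import torch

>>> metric = evaluate.load("accuracy")
>>> model.eval()
>>> for batch in eval_dataloader:
...     batch = {k: v.to(device) for k, v in batch.items()}
...     with torch.no_grad():
...         outputs = model(**batch)
...     # accumulate this batch's predictions instead of computing the metric per batch
...     predictions = torch.argmax(outputs.logits, dim=-1)
...     metric.add_batch(predictions=predictions, references=batch["labels"])

>>> # compute the metric once, over everything accumulated with add_batch
>>> metric.compute()
```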
