Merged
55 changes: 30 additions & 25 deletions metrics/bertscore/bertscore.py
@@ -30,40 +30,45 @@
"""

_DESCRIPTION = """\
BERTScore leverages the pre-trained contextual embeddings from BERT and matches words in candidate and reference
sentences by cosine similarity.
It has been shown to correlate with human judgment on sentence-level and system-level evaluation.
Moreover, BERTScore computes precision, recall, and F1 measure, which can be useful for evaluating different language
generation tasks.

See the `README.md` file at [https://github.com/Tiiiger/bert_score](https://github.com/Tiiiger/bert_score) for more
information.
"""

_KWARGS_DESCRIPTION = """
BERTScore metrics, along with the hashcode of the configuration, computed for predictions against one or more references.

Args:
predictions (list of str): Prediction/candidate sentences.
references (list of str or list of list of str): Reference sentences.
    lang (str): Language of the sentences (e.g. 'en'); required unless
        `model_type` is specified.
    model_type (str): BERT model specification; defaults to the model
        suggested for the target language. At least one of `model_type`
        or `lang` must be specified.
    num_layers (int): The layer of representation to use; defaults to
        the number of layers tuned on WMT16 correlation data.
    verbose (bool): Turn on intermediate status updates.
    idf (bool or dict): Use idf weighting; can also be a precomputed idf_dict.
    device (str): Device on which the contextual embedding model will be
        allocated. If None, the model lives on cuda:0 if CUDA is available.
    nthreads (int): Number of threads.
    batch_size (int): BERTScore processing batch size.
    rescale_with_baseline (bool): Rescale BERTScore with the pre-computed
        baseline; `lang` must be specified when `rescale_with_baseline` is True.
baseline_path (str): Customized baseline file.
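For intuition on `rescale_with_baseline`: raw BERTScore values cluster in a narrow high band, and the BERTScore repository describes rescaling them linearly against a pre-computed baseline `b` so that scores spread over a more readable range. A minimal sketch of that formula, assuming the standard linear rescaling (the function name is ours):

```python
def rescale(score: float, baseline: float) -> float:
    """Linearly rescale a raw score against a pre-computed baseline:
    the baseline maps to 0.0 and a perfect score of 1.0 stays 1.0."""
    return (score - baseline) / (1.0 - baseline)
```

For example, a raw F1 of 0.95 against a baseline of 0.8 rescales to 0.75; scores below the baseline become negative.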

Returns:
precision: Precision.
recall: Recall.
f1: F1 score.
hashcode: Hashcode of the library.

Examples:

>>> predictions = ["hello there", "general kenobi"]