Calculation of mean log probability (GPT-3)

Hello Wenlong,

I think there might be an error in how the mean log probability is calculated when using GPT-3. The main issue is that the GPT-3 response does not contain only the generated text: the `token_logprobs` field of `logprobs` also covers tokens beyond the returned text. Therefore, to calculate the mean log probability, we cannot simply use
```
# calculate mean log prob across tokens
mean_log_probs = [np.mean(response['choices'][i]['logprobs']['token_logprobs']) for i in range(sampling_params['n'])]
```
Instead, we should stop counting once the stop sequence is reached, so that only the tokens of the returned text contribute to the mean.

For example, here is a response with a stop sequence of "\n". The generated text is "Walk to kitchen" (3 tokens), but `logprobs` contains 8 tokens, including the stop token and four tokens after it:
```
response: {
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": {
        "text_offset": [
          317,
          322,
          325,
          333,
          333,
          333,
          333,
          333
        ],
        "token_logprobs": [
          -0.2976162,
          -0.00012346054,
          -0.5069456,
          -0.0011470452,
          -0.0060894582,
          -0.00028055036,
          -6.838237e-05,
          -0.054386232
        ],
        "tokens": [
          " Walk",
          " to",
          " kitchen",
          "\n",
          "Step",
          " 2",
          ":",
          " Walk"
        ],
        "top_logprobs": [
          {
            " Get": -3.9821253,
            " Go": -3.5860093,
            " Make": -3.1428235,
            " Wake": -2.513738,
            " Walk": -0.2976162
          },
          {
            " To": -12.335158,
            " in": -11.411637,
            " into": -9.384543,
            " to": -0.00012346054,
            " upstairs": -12.2138815
          },
          {
            " bedroom": -5.3587174,
            " dining": -1.0860167,
            " kitchen": -0.5069456,
            " living": -4.34434,
            " the": -3.2986841
          },
          {
            "\n": -0.0011470452,
            " ": -7.6692185,
            " table": -9.372099,
            ".": -8.122213,
            "ette": -9.167303
          },
          {
            "\n": -5.1904135,
            " Step": -7.8304586,
            "Step": -0.0060894582,
            "Task": -9.905375,
            "step": -10.6300955
          },
          {
            " 1": -10.295448,
            " 2": -0.00028055036,
            " 3": -11.589857,
            " 4": -12.77457,
            "2": -8.387781
          },
          {
            "\n": -11.062581,
            " :": -11.94543,
            ",": -12.268325,
            ".": -10.367215,
            ":": -6.838237e-05
          },
          {
            " Find": -3.783928,
            " Open": -4.0909195,
            " Turn": -5.903181,
            " Walk": -0.054386232,
            "Walk": -5.14835
          }
        ]
      },
      "text": " Walk to kitchen"
    }
  ],
  "model": "text-davinci-001",
  "object": "text_completion",
  "usage": {
    "completion_tokens": 3,
    "prompt_tokens": 94,
    "total_tokens": 97
  }
}
```

The current calculation averages over all 8 tokens and gives `-0.10833211608375`, whereas it should be `mean(-0.2976162, -0.00012346054, -0.5069456) = -0.26822842018`.
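
One possible fix, as a minimal sketch rather than a definitive patch: keep only the leading tokens whose combined length covers `choice['text']`, and average their log probabilities. The helper name `mean_logprob_of_returned_text` is hypothetical, and the sketch assumes the returned text is exactly the concatenation of the leading entries of `tokens` (true for the response above, but possibly not for byte-level tokens).

```python
def mean_logprob_of_returned_text(choice):
    """Average token_logprobs over only the tokens that make up choice['text'].

    Hypothetical helper; assumes choice['text'] equals the concatenation
    of the leading entries of choice['logprobs']['tokens'].
    """
    logprobs = choice['logprobs']
    text_len = len(choice['text'])
    kept = []      # log probs of tokens that belong to the returned text
    consumed = 0   # number of characters of the returned text covered so far
    for token, logprob in zip(logprobs['tokens'], logprobs['token_logprobs']):
        if consumed >= text_len:
            break  # the stop token and everything after it are excluded
        kept.append(logprob)
        consumed += len(token)
    # An empty completion has no tokens to average over.
    return sum(kept) / len(kept) if kept else float('nan')


# The relevant fields of the response shown above.
choice = {
    'text': ' Walk to kitchen',
    'logprobs': {
        'tokens': [' Walk', ' to', ' kitchen', '\n', 'Step', ' 2', ':', ' Walk'],
        'token_logprobs': [-0.2976162, -0.00012346054, -0.5069456, -0.0011470452,
                           -0.0060894582, -0.00028055036, -6.838237e-05, -0.054386232],
    },
}

all_logprobs = choice['logprobs']['token_logprobs']
naive = sum(all_logprobs) / len(all_logprobs)      # averages all 8 tokens
truncated = mean_logprob_of_returned_text(choice)  # averages the first 3 tokens
print(naive, truncated)
```

With a helper like this, the list comprehension could become `[mean_logprob_of_returned_text(c) for c in response['choices']]`. Whether the stop token itself should count toward the mean is a design choice; this sketch excludes it, matching the expected value above.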

Please let me know what you think. Great work!

Cheers,
Kaixian
