Evals

Contribution by Adam Owada

Please see OpenAI's original README (openai_README.md in this repository).

This eval was designed to test ChatGPT's ability to track its position on a grid after a series of up, down, left, and right moves. The Pull Request to OpenAI's repository has been closed and can be found here.
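To make the task concrete, here is a minimal sketch of how a ground-truth position could be computed for a sequence of moves. It is not taken from the eval's code: the `final_position` helper, the (0, 0) starting point, and the convention that "up" increases y and "right" increases x are all assumptions made for illustration.

```python
# Hypothetical helper, not part of the eval itself.
# Assumed conventions: start at (0, 0); "up" is +y, "right" is +x.
MOVES = {
    "up": (0, 1),
    "down": (0, -1),
    "left": (-1, 0),
    "right": (1, 0),
}


def final_position(moves, start=(0, 0)):
    """Return the (x, y) position after applying each move in order."""
    x, y = start
    for move in moves:
        dx, dy = MOVES[move.lower()]
        x, y = x + dx, y + dy
    return (x, y)


print(final_position(["up", "up", "left", "down", "right", "right"]))  # (1, 1)
```

A reference answer computed this way can then be compared against the position ChatGPT reports.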

Description

Although the original Pull Request has been closed, the new aim of this project is to test and quantify the efficacy of different prompting techniques for ChatGPT.

So far, I have found that appending the instruction "Explain your thought process." to an existing prompt can significantly improve the accuracy of ChatGPT's answers. (TODO: Add metrics)
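One way such a comparison might be quantified is sketched below. Everything here is hypothetical: `ask` stands for any callable that sends a prompt to the model and returns its reply as a string, and `fake_model` is a stub used only to exercise the code. It does not reproduce any measured result.

```python
SUFFIX = "Explain your thought process."


def with_explanation(prompt):
    """Append the extra instruction to an existing prompt."""
    return f"{prompt.rstrip()} {SUFFIX}"


def accuracy(ask, cases, transform=lambda p: p):
    """Fraction of cases whose reply contains the expected answer.

    `ask` is a hypothetical callable: prompt string in, reply string out.
    `cases` is a list of (prompt, expected_answer) pairs.
    """
    if not cases:
        return 0.0
    hits = sum(expected in ask(transform(prompt)) for prompt, expected in cases)
    return hits / len(cases)


def fake_model(prompt):
    # Stub standing in for a real API call; its behavior is made up
    # purely so both code paths can be run. Not a real result.
    return "I end at (1, 1)." if SUFFIX in prompt else "I end at (0, 0)."


cases = [("Start at (0, 0). Move up, then right. Where are you?", "(1, 1)")]
baseline = accuracy(fake_model, cases)
explained = accuracy(fake_model, cases, with_explanation)
print(baseline, explained)
```

Substring matching is a deliberately simple scoring rule. A real run would swap `fake_model` for an actual model call and would likely need stricter answer parsing.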

For example, after these 25 steps the correct final position is (-4, 5). The difference in accuracy is shown in these two screenshots:

Original:

Adding "Explain your thought process.":

