
Conversation

@CasBex
Contributor

@CasBex CasBex commented Jun 2, 2023

Fixes #895

PR Checklist

  • Refactor old NFQ code
  • Categorise NFQ
  • Verify NFQ with example
  • Code review

@CasBex
Contributor Author

CasBex commented Jun 2, 2023

5e6f09d was functional in a separate Julia project where ReinforcementLearning.jl was simply imported (for reference). The refactor in c423048 is untested.

NFQ is meant to be used together with QBasedPolicy. During an episode, action selection happens exactly as in any Q-based policy; the sole difference is that the optimise! step should never happen during an episode, only after each episode / at the start of a new episode.

  1. I saw no test code anywhere in the repo. Am I supposed to add tests to the repo or keep them separate? Is there an example somewhere that uses the refactored library?
  2. I don't know what the appropriate location is since it's not quite a DQN algorithm but also not fully offline, feel free to move the files as appropriate.

PS: I am not very familiar with Flux, so any comments on that side are welcome as well.

@CasBex CasBex marked this pull request as draft June 2, 2023 13:43
@CasBex CasBex changed the title from "Draft: NFQ" to "NFQ" Jun 2, 2023
@codecov

codecov bot commented Jun 2, 2023

Codecov Report

Merging #897 (c43f37a) into main (6de371f) will decrease coverage by 24.36%.
The diff coverage is 0.00%.


@@            Coverage Diff             @@
##             main    #897       +/-   ##
==========================================
- Coverage   24.36%   0.01%   -24.36%     
==========================================
  Files         219     209       -10     
  Lines        7711    7412      -299     
==========================================
- Hits         1879       1     -1878     
- Misses       5832    7411     +1579     
Impacted Files                                           Coverage Δ
...xperiments/experiments/DQN/JuliaRL_NFQ_CartPole.jl    0.00% <0.00%> (ø)
...xperiments/src/ReinforcementLearningExperiments.jl    0.00% <ø> (-100.00%) ⬇️
...einforcementLearningZoo/src/algorithms/dqns/NFQ.jl    0.00% <0.00%> (ø)

... and 63 files with indirect coverage changes

@HenriDeh
Member

HenriDeh commented Jun 3, 2023

  1. I saw no test code anywhere in the repo. Am I supposed to add tests to the repo or keep them separate? Is there an example somewhere that uses the refactored library?

Algorithms are tested by implementing a functioning example experiment in the RLExperiments package. This is also a great way to document your algorithm, as people can see how to build a working agent with the correct trajectory; a skeleton is sketched below. If you create some functions such as a loss, you can also add tests to check that they behave as expected.

I don't know what the appropriate location is since it's not quite a DQN algorithm but also not fully offline, feel free to move the files as appropriate.

I think the dqns directory is fine.

I am not very familiar with Flux so any comments on that side are welcome as well

I can review your implementation; just ping me when it is ready for review, or if you have a question.

@CasBex CasBex marked this pull request as ready for review June 16, 2023 13:32
@CasBex
Copy link
Contributor Author

CasBex commented Jun 16, 2023

As far as I'm concerned, this PR is ready for review. The results of the test are attached.

The provided example performs 500 random steps and then stops exploration. Contrary to the setup in the original paper [1], the example retrains NFQ at the end of every episode.

I noticed while tuning the example that NFQ is somewhat sensitive to the batch size. If the batch size is too small, performance oscillates (though in some episodes NFQ still obtains the maximum score). I think this is caused by 'bad luck' with the sample, since a single optimise! call can drastically change NFQ's behavior. A larger batch size results in more consistent performance, so I chose a batch size equal to the number of stored steps (sketched below).

[1] Riedmiller, M. (2005). Neural Fitted Q Iteration – First Experiences with a Data Efficient Neural Reinforcement Learning Method. In: Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds) Machine Learning: ECML 2005. Lecture Notes in Computer Science, vol. 3720. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564096_32

[Attached figure: JuliaRL_NFQ_CartPole test results]

Member

@HenriDeh HenriDeh left a comment


Just that last little change and then we're good to merge, I think. Thanks for the PR; it helped us clean up the learner shenanigans a bit.

@CasBex
Contributor Author

CasBex commented Jun 23, 2023

I found that the NFQ docs were also a bit outdated, so I'm updating those as well. Give me some time to get this in order, because my git history is a bit messed up.

@CasBex
Contributor Author

CasBex commented Jun 23, 2023

@HenriDeh It should be done now; thanks for helping me with this.

HenriDeh previously approved these changes Jun 23, 2023
@HenriDeh
Member

I'm letting you merge in case you have a last-minute tweak to do.

@CasBex
Contributor Author

CasBex commented Jun 26, 2023

I don't have anything more to add; this branch is ready. I don't have write permissions on main, so I'll let you do the honors.

@HenriDeh HenriDeh self-requested a review June 26, 2023 12:46
@HenriDeh HenriDeh enabled auto-merge (squash) June 26, 2023 13:12
@HenriDeh HenriDeh merged commit 72d6766 into JuliaReinforcementLearning:main Jun 26, 2023
@CasBex CasBex mentioned this pull request Sep 25, 2023