SPSA improvements [RFC] #535

@ppigazzini

Issue opened to collect info about possible future SPSA improvements.

SPSA references

SPSA is a fairly simple algorithm to be used for local optimization (not global optimization).
The wiki now has simple documentation explaining the SPSA implementation in fishtest.
Here is other documentation:

SPSA implementation problems/improvements

  • we ask for "c_k_end" and "r_k_end" (the final parameter values), but IMO we should ask for "c" and "r" (the starting values): if those are too big, the SPSA diverges
  • we use "r_k = a_k / c_k^2" instead of "r_k = a_k" (I searched the SPSA papers for a reference, without success)
  • we set "c_k_end" and "r_k_end" separately for every variable to be optimized (the original SPSA uses global values): this makes sense to account for the different sensitivities of the variables, but IMO this should be handled by an internal normalization of the variable values based on the starting values and the bounds
  • one iteration should be a 2-game match, but our worker code cannot support this, so we set one iteration to a match of 2*N_cores games
  • compute an averaged SP gradient per iteration to lower the noise
  • we have experimental code (special rounding and clipping) that nobody uses: I'm afraid it is theoretically correct but not very useful given the rough way we use SPSA
  • the "A" parameter should be computed from the number of games
  • the worker passes rounded values to cutechess-cli: we should normalize the variable values so that all variables have the same resolution
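
To make the points above concrete, here is a minimal Python sketch of one SPSA iteration with per-variable "c_k_end" and "r_k_end" and with "r_k = a_k / c_k^2". It is only an illustration under stated assumptions, not the fishtest code: the exponents 0.602 and 0.101 are the values commonly recommended in the SPSA literature, the ratio A = 0.1 * num_iter is an assumed choice, and `gains`, `clamp`, `spsa_step` and the `play_match` callback are hypothetical names.

```python
import random

ALPHA, GAMMA = 0.602, 0.101  # exponents commonly recommended in the SPSA literature


def gains(k, c_end, r_end, num_iter, a_ratio=0.1):
    """Return (c_k, r_k) for iteration k (1-based), given the end values."""
    big_a = a_ratio * num_iter               # "A" tied to the run length (assumed ratio)
    c = c_end * num_iter ** GAMMA            # starting "c": c_k reaches c_end at k = num_iter
    a_end = r_end * c_end ** 2               # from r_k = a_k / c_k^2
    a = a_end * (big_a + num_iter) ** ALPHA  # starting "a": a_k reaches a_end at k = num_iter
    c_k = c / k ** GAMMA
    a_k = a / (big_a + k) ** ALPHA
    return c_k, a_k / c_k ** 2


def clamp(x, lo, hi):
    """Keep a value inside its bounds."""
    return max(lo, min(hi, x))


def spsa_step(params, k, num_iter, play_match, rng=random):
    """Run one iteration in place.

    params: list of dicts with keys theta, lo, hi, c_end, r_end.
    play_match(plus, minus): hypothetical callback that plays the match and
    returns wins - losses from the point of view of the "plus" values.
    """
    probes = []
    for p in params:
        c_k, r_k = gains(k, p["c_end"], p["r_end"], num_iter)
        flip = rng.choice((-1, 1))           # random +/-1 perturbation direction
        probes.append((c_k, r_k, flip))

    plus = [clamp(p["theta"] + c_k * flip, p["lo"], p["hi"])
            for p, (c_k, _, flip) in zip(params, probes)]
    minus = [clamp(p["theta"] - c_k * flip, p["lo"], p["hi"])
             for p, (c_k, _, flip) in zip(params, probes)]

    result = play_match(plus, minus)

    for p, (c_k, r_k, flip) in zip(params, probes):
        # r_k * c_k = a_k / c_k, i.e. the usual SPSA step up to a constant factor
        p["theta"] = clamp(p["theta"] + r_k * c_k * result * flip, p["lo"], p["hi"])
```

In this sketch the starting "c" and "a" follow from the end values and the number of iterations, so asking the user for one pair or the other is equivalent once the number of iterations is fixed; what changes is which end of the schedule the user has to reason about.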

SPSA testing process (aka Time Control)


EDIT_000
this paragraph is outdated, I kept it to avoid disrupting the chain of posts:

  • read the wiki for an SPSA description: https://github.com/glinscott/fishtest/wiki/Creating-my-first-test#tuning-with-spsa
  • the experience of the last few years has shown that a very short time control on fishtest does not work:
    • with NNUE, workers running on dual-CPU machines lose games on time at ultra short time control (USTC)
    • SPSA at LTC or even ULTC has a high signal-to-noise ratio, which helps convergence. A ULTC match is very drawish, so in SPSA one side wins a pair of games only if the random parameter increments are somewhat aligned with the gradient direction

I suggest this process to optimize the developer time and the framework CPU.

  • first steps: run some SPSAs at ultra short TC (e.g. 1+0.01) to find good "c_k_end" and "r_k_end" values and good starting values for the variables. This can be done either locally on a recent CPU or in fishtest.
  • last step: run a final SPSA in fishtest to optimize the variables for a longer TC (e.g. STC, 20+0.2, LTC etc.)

I took an SPSA from fishtest and ran it locally, changing only the TC; the results are similar:

  • 20+0.2: [image: 20+02]
  • 2+0.02: [image: 2+002]
  • 1+0.01: [image: 1+001]
  • 0.5+0.01: [image: 05+001]
