Skip to content

Pymbar 4 compatibility#261

Closed
mattwthompson wants to merge 3 commits intoleeping:masterfrom
mattwthompson:pymbar-4
Closed

Pymbar 4 compatibility#261
mattwthompson wants to merge 3 commits intoleeping:masterfrom
mattwthompson:pymbar-4

Conversation

@mattwthompson
Copy link
Copy Markdown
Contributor

Splitting out some of #259 to tackle these updates one at a time

src/liquid.py Outdated
Comment on lines +938 to +969
mmbar = pymbar.MBAR(mU_kln, mN_k, verbose=False, relative_tolerance=5.0e-8, method='self-consistent-iteration')
mW1 = mmbar.getWeights()
mmbar = pymbar.MBAR(mU_kln, mN_k, verbose=False, relative_tolerance=5.0e-8)
mW1 = mmbar.weights()
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea what impact the selection of this method entails. It's not referenced in migration guide or clearly described elsewhere in the docstrings. It's not actually clear to me what it was doing before; it's not a named argument in the initializer yet I don't see in any logs the warning about it being an unrecognized keyword argument ... this bit in the current version implies to me this is currently the default behavior. I'm not really sure.

A comment a few lines up leads me to believe this isn't very important ... https://github.com/leeping/forcebalance/pull/261/files#diff-60185d7a8e64e95075d00867f23a65af80149090d8f4f2465d1317f0f1177ed6R923

Copy link
Copy Markdown
Contributor Author

@mattwthompson mattwthompson Jul 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing the argument gets different/bad behavior.

I filed a ticket with the pymbar developers: choderalab/pymbar#472

@leeping
Copy link
Copy Markdown
Owner

leeping commented Jul 29, 2022

Hi Matt,

Thanks a lot for your help on this. One of my concerns was that the major use cases for pymbar involved running a large number of long simulations, which could not be covered in the unit tests out of necessity. (About 50 simulations of water boxes at different temperatures with 8 ns each). It might be possible to design a unit test by storing a large 3D matrix of energies extracted from one of these jobs. Do you think I could add this as a test, with the expected result provided by pymbar 2, and make the result a requirement for proceeding with the newer versions?

  • Lee-Ping

@mattwthompson
Copy link
Copy Markdown
Contributor Author

Sure! My objective here is to update the code to work with the new API, and improved tests would help there. If you write up that test that works with version 3 in a separate PR I can incorporate it into here - I think that would be the path with the minimum friction for you?

That the test is running and getting to the assertion step indicates to me that these changes might be correct - but maybe the method/solver protocol is misbehaving because of insufficient simulation data? I'm a bit over my head in really understanding what the existing test does.

@leeping
Copy link
Copy Markdown
Owner

leeping commented Jul 29, 2022

Hi Matt,

I just took a closer look at the tests, and I think that it should be a good enough to determine the correctness of pymbar 4.

The test of pymbar happens in the folder src/tests/files/test_liquid. A single point FB calculation is run in this folder, and it reads stored simulation results from the folder single.tmp (there are 6 different simulations with 80 energies each). The MBAR calculation is based on these simulation results, and the input should be a 6 x 6 x 80 tensor.

In short, I don't think we need to add another test. However, it is worth thinking about whether we could check for regressions in performance, in terms of wall time or number of cycles taken to reach convergence. If I remember correctly, pymbar 3 was slower than pymbar 2 because some C++ helper code was removed, but the slowdown was still acceptable. If the default convergence algorithm is changed in pymbar 4, then the number of iterations required to converge might increase as well. I'd like to avoid a future situation where a future user finds out that pymbar took ~1 hour to complete a step that took minutes several years ago.

  • Lee-Ping

@mattwthompson
Copy link
Copy Markdown
Contributor Author

Hi @leeping,

I agree with your concerns about performance but I think it's something we should tackle in a future PR, for a few reasons:

  • It's not obvious what could be done in the source code here to keep the MBAR-reliant paths performant and in order to report regressions to the pymbar developers we'd probably want a some sort of standalone test or set of benchmarks, not something tied into in the test suite here
  • This at least gets things compatible with the most recent version (assuming the macOS builds don't surprise me) which is an improvement over needing to pin to old versions, even if there is a slowdown

In principle I could add some sort of a check to make sure the water study test takes less than some time to run, but I think that's undesirable because it doesn't really test the code here and is prone to flakiness due to the number of factors (particularly upstream) that can cause performance to go up or down.

@mattwthompson mattwthompson marked this pull request as ready for review August 1, 2022 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants