Reading lockfiles if specified in the runcard #818
|
Shouldn't we get the rules in the lockfile when doing something like this?
pdf: NNPDF31_nlo_as_0118
theoryid: 52
use_cuts: internal
dataset_input:
    dataset: ATLASTTBARTOT
actions_:
    - plot_fancy |
|
Intuitively |
|
Because that runcard doesn't explicitly override the rules, the rule reading is done by nnpdf/validphys2/src/validphys/filters.py (line 45 in 6c10562), which isn't written into the lockfile under the current implementation, right? |
|
I guess I am saying it should be written in the runcard. The idea is that we can change the rules that are hardcoded in the source (so the "internal" cuts) while having a way to reproduce the runcard as it was when it was run. |
|
The thing is if If it's not parsed, then the So the result of and thereby the whole lockfiles tool chain is avoided!! We skip straight to the production rule which doesn't handle any lockfile writing... Honestly, I think the lockfiles mechanism needs some work... It doesn't seem possible to do what we're trying to do with the current implementation. |
8649edb to
fe48936
|
This needs to be merged for lockfiles to work correctly |
wilsonmr left a comment
There was a problem hiding this comment.
Other than not recording the defaults if we don't specify a lockfile, looks like it's doing what we want!
elif default_filter_rules is not None:
    # For when a rules spec is specified
    filter_rules = self.load_default_default_filter_rules(default_filter_rules)
else:
    filter_rules = default_filter_rules_input()
There was a problem hiding this comment.
Am I right in thinking that these filter rules in validphys.cuts.filters.yaml are exactly the same as in validphys.cuts.lockfiles.31_filters.lock.yaml?
If so, I really think this block should be (please double check the indents are correct, editing on GitHub is horrible):
- elif default_filter_rules is not None:
-     # For when a rules spec is specified
-     filter_rules = self.load_default_default_filter_rules(default_filter_rules)
- else:
-     filter_rules = default_filter_rules_input()
+ else:
+     if default_filter_rules is None:
+         # no defaults file specified, use <whatever>
+         default_filter_rules = self.parse_default_filter_rules("<whatever>")
+     # save loaded defaults to lockfile.
+     filter_rules = self.load_default_default_filter_rules(default_filter_rules)
If the defaults are not the same then make a legacy defaults lockfile and read that instead. That way if the defaults ever get changed then the lockfile still contains a record of what they were at the time. Also in the future we can change the default to be 40 or whatever we have come up with by then and old lockfiles still remain valid.
There was a problem hiding this comment.
I mean validphys.cuts.filters.yaml at the moment is identical to validphys.cuts.lockfiles.31_filters.lock.yaml, but it need not be in general.
The way I thought it was meant to work was that we have a copy of the 3.1 rules, but the current iteration of the rules is always in validphys.cuts.filters.yaml and we make a new lock file at every release, e.g. 4.0 or 4.1, etc.
There was a problem hiding this comment.
Is it? I did a hasty print and they looked slightly different but I didn't bother working out how (the prints were different lengths).
I don't see the advantage of having some defaults loaded outside of the lockfile system when you have implemented the lockfile system. If I run a report where I use cuts internal but don't specify the set of defaults to use then it falls back on this. But those defaults never get saved to the lockfile for that particular report. So in the future if anybody ever edits validphys.cuts.filters.yaml then the lockfile for my report is invalidated.
Also if you use lockfiles to load the fallback option then you can even change what the fallback option is (now it's 3.1 maybe but in the future could be something else) but all lockfiles will still tell you exactly what set of rules were used at the time.
There was a problem hiding this comment.
Oh yes they're actually different because some datasets were being added as we moved towards 4.0.
Okay, I'll commit this suggestion. My head's in a bit of a spin with the number of defaults I'm seeing.
There was a problem hiding this comment.
If validphys.cuts.filters.yaml has been edited whilst also being the fallback default then do you see how that demonstrates why I think it should be a lockfile?
There was a problem hiding this comment.
Yeah makes sense. Where should we specify the fallback default then? Some sort of global variable?
There was a problem hiding this comment.
Also, when would validphys.cuts.filters.yaml ever be used then?
There was a problem hiding this comment.
Ugh this quickly gets messy and confusing and my suggestion above still isn't the right way to do things. I think the best demonstration of how to do the fallback is with the data_grouping mechanism. That's all handled in its own production rule but you can apply the same ordering and logic in this function.
- So first you check if filter_rules is None; if not, you can use that (which is already done).
- Else check if default_filter_rules is None; if it is, then the user didn't specify it and we must set it such that the fallback defaults are loaded. I'm never good with names but probably that should be called current or something better - see the note on this at the end of the next point though.
- Check if default_filter_rules_recorded_spec_ is None; if not, then we are running from a lockfile and we return the defaults recorded at the time. The point is that we need to have a value for default_filter_rules because we use: filter_rules = default_filter_rules_recorded_spec_[default_filter_rules]. If the lockfile was produced from a fallback then we need to set default_filter_rules to be the fallback key. Also this highlights that the fallback key must never change or else old lockfiles will be broken; in that sense I don't think the value of the fallback default_filter_rules should be global, and perhaps there should be a comment next to it saying don't change this.
- Else we load the defaults as per filter_rules = self.load_default_default_filter_rules(default_filter_rules), which ensures they get saved to the lockfile.
Just to reiterate, the defaults associated with the fallback value of default_filter_rules can be changed (i.e the contents of the yaml file) however the actual value must never change i.e if default_filter_rules falls back to current now then it basically always has to. In the same way as the data grouping must now always fallback to standard_report
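To make the ordering above concrete, here is a minimal standalone sketch. It is not the validphys implementation: resolve_filter_rules, SHIPPED_DEFAULTS, FALLBACK_KEY and the "recorded_spec" lockfile entry are hypothetical stand-ins, and the rules themselves are placeholders. It only illustrates that the contents behind the fallback key can change while a report re-run from its lockfile still sees the rules recorded at the time.

```python
import copy

# All names below are hypothetical stand-ins, not the validphys code.
FALLBACK_KEY = "current"  # never change this value: old lockfiles depend on it

# Stand-in for the defaults files shipped with the code, keyed by name.
SHIPPED_DEFAULTS = {
    "current": [{"dataset": "ATLASTTBARTOT", "rule": "placeholder rule"}],
}


def resolve_filter_rules(
    filter_rules=None,
    default_filter_rules=None,
    recorded_spec=None,
    lockfile=None,
):
    """Pick the filter rules following the ordering described above."""
    # 1. Rules given explicitly in the runcard win.
    if filter_rules is not None:
        return filter_rules
    # 2. No key requested by the user: use the fallback key.
    if default_filter_rules is None:
        default_filter_rules = FALLBACK_KEY
    # 3. Running from a lockfile: return what was recorded at the time.
    if recorded_spec is not None:
        return recorded_spec[default_filter_rules]
    # 4. Otherwise load the shipped defaults and record them in the lockfile.
    rules = copy.deepcopy(SHIPPED_DEFAULTS[default_filter_rules])
    if lockfile is not None:
        lockfile.setdefault("recorded_spec", {})[default_filter_rules] = rules
    return rules


lockfile = {}
first_run = resolve_filter_rules(lockfile=lockfile)

# Someone later edits the contents behind the fallback key.
SHIPPED_DEFAULTS["current"] = [{"dataset": "ATLASTTBARTOT", "rule": "edited rule"}]

# Re-running from the lockfile still gives the rules used originally.
rerun = resolve_filter_rules(recorded_spec=lockfile["recorded_spec"])
```

Note that step 3 only works because the key is resolved in step 2 before the lookup, which is why the fallback key itself has to stay fixed.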
A minor comment is that the files from which the defaults are being loaded from i.e validphys.cuts.lockfiles.31_filters.lock.yaml are not lockfiles, they are just the defaults. The lockfiles are where those defaults get saved.
There was a problem hiding this comment.
validphys.cuts.filters.yaml would be moved to validphys.cuts.lockfiles.<whatever the fallback default_filter_rules value is>_filters.lock.yaml and would get loaded whenever default_filter_rules is not specified.
elif default_filter_settings is not None:
    # If the user requests to read from a pre existing settings lockfile
    filter_defaults = self.load_default_default_filter_settings(default_filter_settings)
    defaults_loaded = True
elif not filter_defaults:
    filter_defaults = default_filter_settings_input()
    defaults_loaded = True
There was a problem hiding this comment.
Likewise. Let's keep a record of all the defaults.
validphys2/src/validphys/config.py
Outdated
try:
    theory_parameters = theoryid.get_description()
except AttributeError:
    raise ConfigError("Missing theoryid for processing rules")
There was a problem hiding this comment.
When is this supposed to fire?
There was a problem hiding this comment.
In [1]: from validphys.api import API
In [2]: API.rules(use_cuts="internal")
...
ConfigError: Missing theoryid for processing rules
There was a problem hiding this comment.
I didn't quite follow the discussion and don't see why theoryid shouldn't be required rather than optional. Also, the test really should be theoryid is None, which is way more explicit.
There was a problem hiding this comment.
Either theoryid should be optional or rules should be later on. If I have use_cuts: fromfit or use_cuts: nocuts I shouldn't have to supply a theoryid
There was a problem hiding this comment.
But since this function already returns None when cuts are not internal, I thought the easiest change was to just move where we get the theory description and only require it in the case that cuts are internal.
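A rough sketch of that change, combined with the explicit theoryid is None test suggested earlier in this thread. Everything here is a stand-in: produce_rules, StubTheoryID and the local ConfigError are hypothetical, use_cuts is treated as a plain string, and the return value is a placeholder for whatever the real production rule builds.

```python
class ConfigError(Exception):
    """Stand-in for the real ConfigError."""


class StubTheoryID:
    """Hypothetical stand-in exposing only get_description()."""

    def get_description(self):
        return {"ID": 52}


def produce_rules(use_cuts, theoryid=None):
    # Rules are only needed for internal cuts; otherwise nothing to do,
    # so no theoryid is required for fromfit or nocuts.
    if use_cuts != "internal":
        return None
    # Explicit check instead of catching AttributeError.
    if theoryid is None:
        raise ConfigError("Missing theoryid for processing rules")
    theory_parameters = theoryid.get_description()
    # Placeholder for the actual rule construction.
    return {"theory_parameters": theory_parameters}
```

With this ordering, use_cuts: nocuts never touches theoryid, while use_cuts: internal without a theoryid fails with the same message as before.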
There was a problem hiding this comment.
But it seems much more reasonable to make rules optional wherever it is actually optional, no?
There was a problem hiding this comment.
Perhaps yes, that would be much easier if we had the part of #1057 which improves massively the way dataset and experiment (EDIT: and data) are produced since they both take rules as well as the cuts production rule.
ee39132 to
9105785
|
(I had started the message by saying that this is all overcomplicated and I am changing it to say I am going to propose something even more complicated now). Mostly thinking aloud so far. First of all we should decide what we want the semantics to be. Do we want that the "NNPDF40" cuts are completely immutable, so we may end up writing With that in mind, let us say that there are certain keys that are "defaultable". These can involve relatively complex settings (filter rules, default cfactors, variants and the like) or simpler settings (the default family for the cuts "NNPDF40"). We would like to not write those choices explicitly to the runcard but have them in the lockfile so we do not need to rely on them staying immutable in the code. E.g. we don't really want to change what you need to write in the runcard every time we find a minor bug in some dataset, yet we want to be able to tell whether a given report was produced with the bug or not (ideally in an automated way, but let's go step by step). It seems to me a lot of the complication comes from the fact that we now have "keyed" defaults. Indeed Michael's point 3 is where I start getting confused. We declare that a defaultable can take no parameters whatsoever conceptually. In the context of filters, let's call that The there is The logic is then: If that key, So in the end we have something like: PS: This is indeed confusing. Writing this comment took me more than an hour and I'm still not sure I got it right. |
|
I think in the example above one can get away without defaults for filter_version because all the specification gets written explicitly. But I had to think about it. |
|
hmm ok so (I think) you're suggesting that the defaults get held in a single mappable and absolutely all of them get dumped to the lockfile every time. Then I think we would no longer require Then I think I agree with everything. My only question (which I was trying to address in 3. with the fallback) is when you do
and if
    if filter_epoch is None:
        filter_epoch = "NNPDF4.0" # for now
then that line would undoubtedly want to be changed in the future, but that would invalidate the lockfile ran before. i.e. consider this: it's the year 2030, NNPDF10.0 has just been released and Professor @siranipour is tired of the defaults still being for "NNPDF4.0" so changes it to
    if filter_epoch is None:
        filter_epoch = "NNPDF10.0" # finally!
but then if you run on the lockfile which was produced before those defaults were added to Finally every release you would get the settings for |
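One way to picture the concern is the sketch below, which assumes the resolved filter_epoch is itself written to the lockfile and read back before the hardcoded fallback is consulted. This is only one possible reading of the discussion; resolve_filter_epoch, the "lockfile" runcard entry and the fallback argument are all hypothetical names.

```python
# Hypothetical sketch: names and layout are illustrative only.
FALLBACK_EPOCH = "NNPDF4.0"  # for now


def resolve_filter_epoch(runcard, fallback=FALLBACK_EPOCH):
    """Return the epoch to use and the lockfile entry that records it."""
    recorded = runcard.get("lockfile", {})
    if runcard.get("filter_epoch") is not None:
        # Explicit choice in the runcard.
        epoch = runcard["filter_epoch"]
    elif "filter_epoch" in recorded:
        # Running from a lockfile: reuse what was resolved at the time.
        epoch = recorded["filter_epoch"]
    else:
        # Nothing specified anywhere: use the hardcoded fallback.
        epoch = fallback
    return epoch, {"filter_epoch": epoch}


# Today: nothing specified, so the fallback is used and recorded.
old_epoch, old_lockfile = resolve_filter_epoch({})

# 2030: the hardcoded fallback has been changed.
new_epoch, _ = resolve_filter_epoch({}, fallback="NNPDF10.0")

# Re-running the old lockfile is unaffected by the new fallback.
rerun_epoch, _ = resolve_filter_epoch({"lockfile": old_lockfile}, fallback="NNPDF10.0")
```

Under this assumption the fallback line can be edited freely, since old lockfiles never reach it.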
|
I was trying to address that with the functions below it, but didn't do it very well. The answer is that |
|
Overall I think we made it a bit too general in reportengine and added a few levels of indirection that then we can't really understand ourselves (admittedly that was always with the hope that those things may be used as building blocks for more obvious approaches). I think it is sensible to just declare defaults are stateless (also because anything else leads to headaches when considering the interaction with namespaces). |
I think the first time I did this I forgot that if you specify the lockfile in the runcard then you should load in the rules from the lockfile.