Implement specialized Hurdle distribution#7810
Conversation
77ed668 to
9c65ca1
Compare
3d5772c to
bdb3f12
Compare
ac73b55 to
9a65487
Compare
9a65487 to
0917343
Compare
| dist | ||
| for dist in dists | ||
| if ( | ||
| getattr(dist, "rv_type", None) is not None |
There was a problem hiding this comment.
This was too restrictive, a subclass also inherits the dispatch function, and need not be in the registry explicitly
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #7810 +/- ##
=======================================
Coverage 92.88% 92.88%
=======================================
Files 107 107
Lines 18377 18389 +12
=======================================
+ Hits 17069 17081 +12
Misses 1308 1308
🚀 New features to boost your workflow:
|
@ricardoV94 This is amazing. Thank you so much for this! |
0917343 to
3f7efa1
Compare
zaxtax
left a comment
There was a problem hiding this comment.
LGTM! We should probably add a test similar to one that motivate this PR in the first class
We have the pre-existing hurdlesl tests, in a sense this is just a refactor/optimization. Can't think of anything |
What caused the error originally reported here? pymc-devs/nutpie#163 Does that have a test already? |
That was fixed sometime ago in PyTensor: pymc-devs/pytensor#1137 The performance question when in numba is addressed by pymc-devs/pytensor#1445 Neither is PyMC specific |
Feel free to merge whenever you feel comfortable! I think it's good to go |
lucianopaz
left a comment
There was a problem hiding this comment.
Looks good to me. I just left two very minor comments
| ) | ||
|
|
||
| return mix_logp | ||
| mix_support_point = pt.sum(weights * support_point_components, axis=mix_axis) |
There was a problem hiding this comment.
Why not use the logsumexp and have log scale weights here? Is it because the weights are already in the 0-1 range and taking the log won’t help with precision?
There was a problem hiding this comment.
We're not computing any log quantities nor starting with any log quantities so I don't think it would help. Also the initial point is not so critical?
3944eb3 to
3e0ab8c
Compare
This does not require the arbitrary truncation of continuous distribution in the logp/logcdf
3e0ab8c to
1e38719
Compare
It indirectly addresses the issue reported in in pymc-devs/nutpie#163
The new objects have a logp that handles the discrete + continuous process correctly, without requiring the arbitrary truncation of the latter at epsilon. This provides a cheaper and more stable logp / logcdf.
For discrete variables we keep using a truncation
Also added special logic to truncate a Hurdle distribution which solves bambinos/bambi#768, this is not the desired behavior, reverted itCC @zwelitunyiswa
📚 Documentation preview 📚: https://pymc--7810.org.readthedocs.build/en/7810/