-
Notifications
You must be signed in to change notification settings - Fork 184
Improve dynamic module recognition #357
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve dynamic module recognition #357
Conversation
Codecov Report
@@ Coverage Diff @@
## master #357 +/- ##
==========================================
- Coverage 92.95% 92.91% -0.04%
==========================================
Files 2 2
Lines 809 805 -4
Branches 164 164
==========================================
- Hits 752 748 -4
Misses 29 29
Partials 28 28
Continue to review full report at Codecov.
|
|
@henryiii this should hopefully fix the bug you reported. On my machine I get: In [1]: import boost_histogram._core.hist as h
In [2]: h
Out[2]: <module 'boost_histogram._core.hist'>
In [3]: from cloudpickle import loads, dumps
In [4]: loads(dumps(h))
Out[4]: <module 'boost_histogram._core.hist'>
In [5]: assert loads(dumps(h)) is hIf you can, feel free to checkout this PR locally to ensure this fixes your bug. |
|
I'll check this within a few hours, thank you! |
|
Looks good to me, import boost_histogram as bh
import cloudpickle
h = bh.Histogram(bh.axis.Regular(50, 0, 20))
h.fill([1 ,2 , 3, 4, 5])
cloudpickle.dumps(h)works on this branch! |
ogrisel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick comment in passing:
ogrisel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass of comments and suggestions for improvements.
|
mmh, I just realised the importability via module attribute lookupassociated semantic importability using module-specific import machinery:associated semantic: This PR actually confuses the two: |
|
Indeed, it would be better to have the code make the distinction between the 2 notions and only check for the one needed where appropriate. |
|
What's the status of this? I could put in a hack in boost-histogram that would get this working, but I'd rather it be fixed properly for everyone. |
fb6d34a to
da60d2d
Compare
|
rebased. |
|
By iterating a bit more and adding some test cases, my current opinion on this PR has a little bit changed. If #354 appeared in the first place, it is because As of now The more I think about this, the more I believe WDYT @ogrisel? FYI: I tried having |
|
The |
|
pinging downstream libraries developers: @suquark @mrocklin @HyukjinKwon @ogrisel - I would like to have your opinion before merging this: This PR proposes to simplify Does the assumption |
|
I personally don't have much experience here. I don't often run into cases
where people try to serialize dynamically generated modules.
…On Mon, May 25, 2020 at 7:17 AM Pierre Glaser ***@***.***> wrote:
pinging downstream libraries developers: @suquark
<https://github.com/suquark> @mrocklin <https://github.com/mrocklin>
@HyukjinKwon <https://github.com/HyukjinKwon> @ogrisel
<https://github.com/ogrisel> - I would like to have your opinion before
merging this:
This PR proposes to simplify module pickling; when cloudpickle is asked
to pickle a module, it checks if this module is importable (and not
dynamically created) by looking for it inside sys.modules, and if so,
cloudpickle pickle the said module by writing a sequence of instruction
summing up to import <module> in the pickle string.
Previously, cloudpickle module's importability check was way more
complicated than simply looking inside sys.modules. But my opinion is
that looking inside sys.modules should be simpler, and more failure-proof.
Does the assumption module in sys.modules.values() <=> module is
importable sound OK to you? Or is there any use case of yours where this
assumption break (for instance knowingly removing an importable module from
sys.modules)
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#357 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AACKZTDJUMDQQ2XYXBOLJLDRTJ4YRANCNFSM4LZA5QUA>
.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I agree that just checking sys,modules to known whether a module should be importable or not should be a good enough heuristic.
Code that generates non-importable dynamic modules registered in sys.modules is probably pathological and I don't think we should try to support it.
| return subimport, (obj.__name__,) | ||
| else: | ||
| obj.__dict__.pop('__builtins__', None) | ||
| return dynamic_subimport, (obj.__name__, vars(obj)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could probably be named to _make_dynamic_module as we actually do not import the module at all.
We should still keep dynamic_subimport as an alias for _make_dynamic_module to preserve backward compat with old pickle files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The subimport function could also be renamed to _import_module to be more explicit. But we would also need to keep an alias named subimport pointing to _import_module for backward compat as well.
|
#377 confirms that the failure observed in the distributed tests ("OSError: [Errno 101] Network is unreachable") is not related to the changes in this PR. |
|
@pierreglaser I put a comment on #357 (comment) but as this name was not introduced in this PR, I am fine with not addressing it here. Could you please add an entry to the changelog before merging this? |
ogrisel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's merge!
This makes sense. I investigated a related issue recently where modules are imported from a source file. If you strictly follow the importlib package doc to import a source file you add the module to |
Fixes #354
To emulate package capabilities while being a single file, extension modules (for instance a
mod.sofile) can dynamically create a module object (most likely using the*package_name*.*parent_module_name*.*submodule_name*naming convention to be compatible withimportsemantics).Internally, they will use the
PyImport_AddModule(submodule_qualified_name), which creates a module object and adds it tosys.package. Such a module is a dynamic module, but is also importable, which shows that, despite whatcloudpickle's logic seems to imply, those are not mutually exclusive attributes.This PR adds support to treat these dynamic modules as importable. We probably should rename
_is_dynamicto_is_importable, as the latter quality is really what matters here.