Fix error: cannot allocate memory in static TLS block #4776

Merged
typhoonzero merged 1 commit into PaddlePaddle:develop from helinwang:tag
Oct 13, 2017

@helinwang (Contributor) commented Oct 13, 2017

Fixes: #4358 #4775

The error message is:

ImportError: /usr/local/lib/python2.7/dist-packages/py_paddle/_swig_paddle.so:
cannot allocate memory in static TLS block

The error is complaining that there is not enough thread-local storage (TLS). The likely cause is a glog commit that vastly increased the size of its thread-local data. When WITH_GOLANG is ON, glog is compiled twice (once for paddle, once for the Go static library), which makes the thread-local storage requirement blow up.

The fix is to use a release tag of glog that does not contain that commit.

@dzhwinter (Contributor) left a comment

LGTM.
Please merge this fix ASAP.

@typhoonzero typhoonzero merged commit ce91f85 into PaddlePaddle:develop Oct 13, 2017
@dzhwinter (Contributor)

If the TLS table blow-up is caused by glog, many other packages may depend on upstream glog without pinning a specific tag, which could easily cause the same error in the future.

BTW, I found some documents saying that the compiler may limit the TLS table size. Is it possible to add a compile option?

https://gcc.gnu.org/onlinedocs/gcc/Thread-Local.html#Thread-Local
https://stackoverflow.com/questions/5450694/thread-local-storage-overhead

Finally, thread-local storage is not really meant to store large numbers of variables per thread (there are compiler-dependent limits on the size of the TLS table) but this is something you can easily work around: put many variables into a struct and make a pointer to the struct thread-local.
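
The workaround described in the quoted answer can be sketched roughly as follows. This is only an illustration under stated assumptions: `PerThreadState`, `tls_state`, `GetState`, and `BumpCalls` are hypothetical names, not code from Paddle or glog. The idea is that only a pointer-sized object sits in the TLS block, while the bulky per-thread data is allocated on the heap the first time a thread touches it.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical bundle of per-thread variables (names are illustrative only).
struct PerThreadState {
  char scratch[64 * 1024] = {};  // large buffer kept out of the TLS block
  std::size_t calls = 0;         // example per-thread counter
};

// Only this smart pointer is thread-local; the struct itself lives on the heap.
thread_local std::unique_ptr<PerThreadState> tls_state;

// The TLS footprint is the size of the pointer wrapper, not of the struct.
static_assert(sizeof(std::unique_ptr<PerThreadState>) < sizeof(PerThreadState),
              "the thread-local handle should be much smaller than the state");

// Lazily allocate the calling thread's state on first use.
PerThreadState& GetState() {
  if (!tls_state) {
    tls_state.reset(new PerThreadState());
  }
  return *tls_state;
}

// Example accessor: increments and returns the calling thread's counter.
std::size_t BumpCalls() { return ++GetState().calls; }
```

Each thread gets its own `PerThreadState` on first call, and the `unique_ptr` releases it when the thread exits. Whether this would help in this PR's case is unclear, since the thread-local data in question belongs to glog rather than to Paddle's own code.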

@helinwang (Contributor, Author) commented Oct 13, 2017

From my limited understanding, the TLS size is a libc (often glibc) limit, which is an implementation detail. If we compile a non-static binary, we don't have control over the runtime libc.

@wangkuiyi (Collaborator)

Thanks for this fix. It is a hacker's work!


Successfully merging this pull request may close these issues.

ImportError: dlopen: cannot load any more object with static TLS

4 participants