Describe the issue
The exception translator installed by the open3d pybind11 module crashes the process when other pybind11 modules
are loaded together with open3d and an exception from one of those modules reaches the open3d exception
translator first.
It's complicated, so let's look at an example.
Consider this simple example with two pybind11 modules loaded (open3d and gtsam), which leads to a crash on a Mac M1 system:
import gtsam
import open3d
vals = gtsam.Values()
vals.insert(1, 1)
vals.insert(2, 2)
for v in vals.keys():
    print("v = ", v)
This crashes with the error:
libc++abi: terminating with uncaught exception of type pybind11::stop_iteration:
As a workaround, one needs to load open3d first and then gtsam, i.e.:
import open3d
import gtsam
...
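The import-order workaround can be made explicit with a small helper. This is only a sketch: `is_apple_silicon` and `import_in_order` are hypothetical names that are not part of open3d, gtsam, or pybind11, and the helper can only control the order if none of the listed modules has been imported yet.

```python
import importlib
import platform


def is_apple_silicon() -> bool:
    """Return True when running natively on macOS arm64 (e.g. Mac M1)."""
    return platform.system() == "Darwin" and platform.machine() == "arm64"


def import_in_order(*names):
    """Import the named modules in the given order and return them as a tuple.

    Hypothetical helper: it only fixes the load order for modules that
    have not been imported earlier in the process.
    """
    return tuple(importlib.import_module(name) for name in names)


# Hypothetical usage for the case in this issue (open3d first, gtsam second):
# open3d, gtsam = import_in_order("open3d", "gtsam")
```

Calling the helper at the very top of the entry-point script keeps the order in one place instead of relying on the import order of every file.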
What's going on here!?
I spent some time in lldb with another custom pybind11 module of mine that I have debug symbols for, and it led me to the chain of exception translators that pybind11 builds up as each new module is loaded.
https://github.com/pybind/pybind11/blob/master/include/pybind11/detail/internals.h#L460
Then, when an exception is thrown in any native pybind11 module, it goes through the chain, and the exception translators are tried in reverse order of their loading.
https://github.com/pybind/pybind11/blob/master/include/pybind11/pybind11.h#L1000
So in the crash above, the open3d exception translator is tried first, and instead of silently passing through and rethrowing the exception to give the other translators a chance to handle it, the process crashes with the error above.
I don't have deep knowledge of C++ runtimes, but it seems that some odd translation of the original exception happens, so that it slips past the catch clause of the py::detail::apply_exception_translators() function and propagates unhandled further up the runtime.
With pybind11 2.8.0+ one can install local_exception_translators, which are always tried first for the local module. With them we were able to hack around the open3d translator crash on Mac M1, but this does not help with other modules that are already built: they still can't be used with open3d if open3d is loaded last.
UPDATE: By the way, this does not happen on Linux, Windows, or x64 Mac systems; the only place where this weird behavior shows up is Mac M1. Thanks!
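To make the expected pass-through behavior concrete, here is a pure-Python model of the translator chain as I understand it. It is not pybind11 code: `make_translator`, `apply_exception_translators`, `GtsamError` and `Open3dError` are hypothetical stand-ins, and the real chain works on C++ exception pointers.

```python
class GtsamError(Exception):
    """Stand-in for an exception type owned by the first module."""


class Open3dError(Exception):
    """Stand-in for an exception type owned by the second module."""


def make_translator(name, handled_type):
    """Build a translator that handles only `handled_type`."""
    def translator(exc):
        try:
            raise exc  # models rethrowing the current exception
        except handled_type:
            return f"{name} handled {type(exc).__name__}"
        # Any other exception type propagates out of the translator.
    return translator


def apply_exception_translators(exc, chain):
    """Try each translator in order; move on when one rethrows."""
    last = exc
    for translator in chain:
        try:
            return translator(last)
        except Exception as unhandled:  # models a catch-all clause
            last = unhandled
    raise last  # no translator handled the exception


chain = []
chain.insert(0, make_translator("gtsam", GtsamError))    # loaded first
chain.insert(0, make_translator("open3d", Open3dError))  # loaded last, tried first

result = apply_exception_translators(GtsamError("boom"), chain)
print(result)
```

In this model the translator registered last sees the exception first, rethrows it because the type does not match, and the next translator handles it. The crash in this issue corresponds to the rethrown exception escaping the catch-all step instead of being passed along.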
Steps to reproduce the bug
import gtsam
import open3d
vals = gtsam.Values()
vals.insert(1, 1)
vals.insert(2, 2)
for v in vals.keys():
    print("v = ", v)
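Since the crash takes down the whole interpreter, it can be convenient to run the reproduction in a child process and inspect how it exited. A minimal sketch, assuming a POSIX system; `run_isolated` and `killed_by_signal` are hypothetical helper names, and `REPRO` is just the script above stored as a string.

```python
import subprocess
import sys

# The reproduction script from this issue, kept as a string so that it
# only runs inside a child interpreter.
REPRO = """\
import gtsam
import open3d
vals = gtsam.Values()
vals.insert(1, 1)
vals.insert(2, 2)
for v in vals.keys():
    print("v = ", v)
"""


def run_isolated(code: str, timeout: float = 60.0) -> subprocess.CompletedProcess:
    """Run `code` in a fresh Python interpreter and capture its output."""
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def killed_by_signal(proc: subprocess.CompletedProcess) -> bool:
    """On POSIX, a negative return code means the child died from a signal."""
    return proc.returncode < 0


# Hypothetical usage (requires gtsam and open3d to be installed):
# proc = run_isolated(REPRO)
# print(proc.returncode, killed_by_signal(proc), proc.stderr)
```

An uncaught C++ exception ends in an abort, so on the affected system I would expect `killed_by_signal` to return True and the libc++abi message to show up in `proc.stderr`.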
Error message
libc++abi: terminating with uncaught exception of type pybind11::stop_iteration:
Expected behavior
no crash
Open3D, Python and System information
- Operating system: macOS 11.6 (20G165)
- Python version: Python 3.9.9 (main, Nov 21 2021, 03:16:13), [Clang 13.0.0 (clang-1300.0.29.3)]
- Open3D version: 0.14.1 (installed into fresh venv with pip)
- System architecture: arm64
- Is this a remote workstation?: no
- How did you install Open3D?: pip
- Compiler version (if built from source): (was used to debug the issue with `lldb` and my other code)
cc --version
Apple clang version 12.0.5 (clang-1205.0.22.9)
Target: arm64-apple-darwin20.6.0
Thread model: posix
Additional information
The same problem does not happen when open3d is swapped for some other pybind11-based module. For example, after pip install ouster-sdk, import ouster.client as client plays nicely with gtsam, and all exception translators pass through and rethrow exceptions without ABI crashes.