Add emulation for CPU without FMA instruction set #59
lnuic wants to merge 1 commit into mitsuba-renderer:master
Conversation
fmul+fadd has different rounding behavior than a fused multiply-add (two roundings instead of one), which could lead to subtle platform-dependent inconsistencies. I am thinking that it might be easier to just error out in this case. For example, our pip wheels depend on Haswell IIRC. So even if the LLVM codegen is adjusted, they will still fail with illegal-instruction errors on older hardware.
I see. I double-checked: the pip wheels currently target Ivy Bridge. So we can either error out (and bump the pip wheel architecture) or add a fix for Ivy Bridge specifically. I'm fine with dropping Ivy Bridge support.
Thank you, @njroussel, for bringing the existence of fmuladd to my attention. I experimented with it, and it appears to work well; the code can also be simplified that way. The challenges related to precision and architecture support remain.
Follow-up here: #60
Description
The FMA instruction set was introduced with AMD Piledriver (2012) and Intel Haswell (2013), but earlier architectures already supported the AVX and SSE4.2 instruction sets. Dr. Jit doesn't currently verify whether the CPU supports the FMA instruction set. As a result, LLVM generates a global offset table when FMA is not available, which leads to a critical compiler failure in Dr. Jit.
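For illustration only, here is a minimal sketch of what a host-side FMA check could look like on Linux. It is not how Dr. Jit queries CPU features; the helper names `cpu_flags` and `host_supports_fma` are hypothetical, and the sketch assumes an x86 kernel that lists features on a `flags` line in `/proc/cpuinfo`.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect the feature flags from /proc/cpuinfo-style text (x86 'flags' lines)."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Flags are whitespace-separated tokens, e.g. "sse4_2 avx fma".
            flags.update(value.split())
    return flags


def host_supports_fma(cpuinfo_path: str = "/proc/cpuinfo") -> bool:
    """Return True if the host advertises the 'fma' (FMA3) flag.

    Assumption: Linux on x86. If the file cannot be read, report False.
    """
    try:
        text = Path(cpuinfo_path).read_text()
    except OSError:
        return False
    # Membership in a set of whole tokens, so "fma4" does not count as "fma".
    return "fma" in cpu_flags(text)
```

A caller could use such a result either to raise an error up front or to select a fallback code path, which are the two options discussed in the conversation.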
To address this issue, this PR emulates the FMA instruction using the existing fmul and fadd instructions. This lets the code run on CPUs that do not have native FMA support and avoids the failure caused by the missing instruction set.
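The rounding concern raised in the conversation can be reproduced outside of Dr. Jit. The sketch below is not the PR's code: it compares an unfused multiply-then-add in double precision against a single-rounding reference computed with exact rational arithmetic. The function names are hypothetical, and the inputs are chosen so that the product `a * b` falls exactly halfway between two doubles.

```python
from fractions import Fraction


def unfused_muladd(a: float, b: float, c: float) -> float:
    """fmul followed by fadd: the product is rounded, then the sum is rounded."""
    return a * b + c


def fused_reference(a: float, b: float, c: float) -> float:
    """Reference for a fused multiply-add: evaluate a*b+c exactly, round once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # Fraction -> float conversion is correctly rounded


# a * b equals 1 - 2**-54 exactly, which rounds to 1.0 as a double.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

print(unfused_muladd(a, b, c))   # 0.0
print(fused_reference(a, b, c))  # -2**-54, about -5.55e-17
```

With these inputs the emulated path returns zero while a true FMA returns a small nonzero value, which is the kind of platform-dependent difference the emulation would introduce.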
This PR should also fix mitsuba-renderer/drjit#46.
Code to reproduce:
Error message: