Skip to content
Draft
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 11 additions & 3 deletions test/core/relaxed-simd/i32x4_relaxed_trunc.wast
Original file line number Diff line number Diff line change
Expand Up @@ -50,13 +50,19 @@
(v128.const f32x4 0 -1.0 4294967040.0 4294967296.0))
;; out of range -> saturate or UINT32_MAX
(either (v128.const i32x4 0 0 4294967040 0xffffffff)
(v128.const i32x4 0 0xffffffff 4294967040 0xffffffff)))
(v128.const i32x4 0 0xffffffff 4294967040 0xffffffff)
;; output generated by the algorithm in V8 and SpiderMonkey
(v128.const i32x4 0 0xffffffff 4294967040 0)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like this case was covered by #144?


(assert_return (invoke "i32x4.relaxed_trunc_f32x4_u"
(v128.const f32x4 nan -nan nan:0x444444 -nan:0x444444))
;; nans -> 0 or UINT32_MAX
(either (v128.const i32x4 0 0 0 0)
(v128.const i32x4 0xffffffff 0xffffffff 0xffffffff 0xffffffff)))
(v128.const i32x4 0xffffffff 0xffffffff 0xffffffff 0xffffffff)
;; output generated by the algorithm in V8
(v128.const i32x4 0xc0000000 0xc0000000 0xc4444400 0xc4444400)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't covered by #144 either.

;; output generated by the algorithm in SpiderMonkey
(v128.const i32x4 0x80000000 0x80000000 0x80000000 0x80000000)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0x800000 comes from VCVTTPS2UDQ?
Think will be nice to comment here to say which cases lead to which results.
I should have documented it more properly in the first place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that but I cannot refer the algorithm/recipe by name, or issue that defines these. I'll just refer SM and V8 as source.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually it is different on V8, it looks like c0000000, c0000000, c4444400, c4444400. SM has 0x8000000s

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


(assert_return (invoke "i32x4.relaxed_trunc_f64x2_s_zero"
(v128.const f64x2 -2147483904.0 2147483904.0))
Expand All @@ -73,7 +79,9 @@
(v128.const f64x2 -1.0 4294967296.0))
;; out of range -> saturate or UINT32_MAX
(either (v128.const i32x4 0 0xffffffff 0 0)
(v128.const i32x4 0xffffffff 0xffffffff 0 0)))
(v128.const i32x4 0xffffffff 0xffffffff 0 0)
;; output generated by the algorithm in V8 and SpiderMonkey
(v128.const i32x4 0xfffffffe 0 0 0)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we get 0xfffffffe? This isn't valid, it should only be UINT32_MAX or saturate (based on #21)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC when I was stepping in the debugger: 0xfffffffe comes from addpd after -1.0 + 4503599627370496.0

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the addpd from implementing this relaxed instruction using the simd128 one? (i32x4.trunc_sat_f64x2_s_zero)

Copy link
Contributor Author

@yurydelendik yurydelendik Mar 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

V8 (turbofan?) generates:

0x9923509f796    16  c4e37909c00b         vroundpd xmm0,xmm0,0xb
0x9923509f79c    1c  49ba90e7fa2001000000 REX.W movq r10,0x120fae790
0x9923509f7a6    26  c4c1795802           vaddpd xmm0,xmm0,[r10]
0x9923509f7ab    2b  c4c178c6c788         vshufps xmm0,xmm0,xmm15,0x88

(notice that liftoff generates trunc_sat_f64x2_s_zero code, not relaxed)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that implementation comes from WebAssembly/simd#383 "x86/x86-64 processors with AVX instruction set".
That codegen should return a 0. If it doesn't then maybe there's a bug in the generated code i think.

Using the trunc_sat instruction to implement the relaxed instruction is always correct (for all the relaxed instructions). This is so that in deterministic mode, we can fall back to SIMD128 instructions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This discussion seems to conclude that 0xfffffffe is a bug, yet it made it into #144. I'm confused.


(assert_return (invoke "i32x4.relaxed_trunc_f64x2_u_zero"
(v128.const f64x2 nan -nan))
Expand Down