Skip to content

Commit 944302d

Browse files
authored
add __atomic foreign convention and improve related flonum unboxing (#976)
The `__atomic` convention promises that a foreign function will not call back into Scheme, such as through a foreign callable, before it returns. That promise enables a more direct call to the foreign function. For example, on my machine, calling ``` (foreign-procedure __atomic "(cs)s_errno" () int)) ``` takes about 80% the time of calling ``` (foreign-procedure "(cs)s_errno" () int)) ``` When a foreign function has arguments or results that are represented as flonums, using `__atomic` allows the call to be exposed enough in intermediate compiler passes so that it can partitcapte in flonum-unboxing optimizations. In principle, unboxing optimizations could also apply to non-`__atomic` foreign functions, but the current calling convention for Scheme functions and the protocol between Scheme code and the kernel (especially the garbage collector) make that difficult. Further improvements there are possible, but functions that fit the `__atomic` constraint can be handled relatively easily. Probably `__atomic` is wanted for the best performance, anyway, when unboxing is relevant. Meanwhile, various problems were interfering with flonum unboxing around `ftype-ref` and `ftype-set!` on `double` and `float` fields, and those problems are now fixed. Closes #921 and uses parts of the example there by @williewillus in tests.
1 parent 5aed282 commit 944302d

23 files changed

+432
-170
lines changed

boot/pb/equates.h

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
/* equates.h for Chez Scheme Version 10.3.0-pre-release.4 */
1+
/* equates.h for Chez Scheme Version 10.3.0-pre-release.5 */
22

33
/* Do not edit this file. It is automatically generated and */
44
/* specifically tailored to the version of Chez Scheme named */
@@ -1015,7 +1015,7 @@ typedef uint64_t U64;
10151015
#define rtd_sealed 0x4
10161016
#define sbwp (ptr)0x4E
10171017
#define scaled_shot_1_shot_flag -0x8
1018-
#define scheme_version 0xA030004
1018+
#define scheme_version 0xA030005
10191019
#define seginfo_generation_disp 0x1
10201020
#define seginfo_list_bits_disp 0x8
10211021
#define seginfo_space_disp 0x0

boot/pb/petite.boot

-1.97 KB
Binary file not shown.

boot/pb/scheme.boot

5.6 KB
Binary file not shown.

boot/pb/scheme.h

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
/* scheme.h for Chez Scheme Version 10.3.0-pre-release.4 (pb) */
1+
/* scheme.h for Chez Scheme Version 10.3.0-pre-release.5 (pb) */
22

33
/* Do not edit this file. It is automatically generated and */
44
/* specifically tailored to the version of Chez Scheme named */
@@ -40,7 +40,7 @@
4040
#endif
4141

4242
/* Chez Scheme Version and machine type */
43-
#define VERSION "10.3.0-pre-release.4"
43+
#define VERSION "10.3.0-pre-release.5"
4444
#define MACHINE_TYPE "pb"
4545

4646
/* Integer typedefs */

c/ffi.c

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -680,6 +680,9 @@ static void closure_callback(UNUSED ffi_cif *cif, void *ret, void **args, void *
680680
memcpy(&PBREGS(tc, 0), caller_saved, sizeof(caller_saved));
681681

682682
if (!ret_is_arg) {
683+
/* in case GC moved `types` (but `vec` is immobile) */
684+
types = Svector_ref(vec, 1);
685+
683686
/* move result to "arena" */
684687
type = Svector_ref(types, RET_TYPE_INDEX);
685688

csug/foreign.stex

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -220,7 +220,8 @@ Windows: \scheme{__stdcall}, \scheme{__cdecl}, and \scheme{__com} (32-bit only).
220220
Since \scheme{__cdecl} is the default, specifying \scheme{__cdecl} is
221221
equivalent to specifying \scheme{#f} or no convention.
222222
Additionally, \var{conv} can be \scheme{__collect_safe} to indicate that garbage
223-
collection is allowed concurrently with a call of the foreign procedure, or it
223+
collection is allowed concurrently with a call of the foreign procedure, it can be
224+
\scheme{__atomic} to indicate that the foreign function never calls back to Scheme, or it
224225
can be \scheme{__varargs} or \scheme{(__varargs_after \var{n})} to indicate
225226
that the procedure uses a convention that works with a variable number of arguments
226227
after the first \var{n} (which may differ from the convention used for the
@@ -285,6 +286,11 @@ If a foreign procedure that is called with \scheme{__collect_safe} can
285286
invoke callables, then each callable should also be declared with
286287
\scheme{__collect_safe} so that the callable reactivates the thread.
287288

289+
Use \scheme{__atomic} to declare that the foreign procedure never
290+
calls back to Scheme, such as through a foreign callable, which may
291+
allow a more efficient implementation of calls to the foreign
292+
function. The \scheme{__atomic} and \scheme{__collect_safe}
293+
declarations cannot be used at the same time.
288294

289295
Complete type checking and conversion is performed on the parameters
290296
to a foreign procedure.
@@ -1171,7 +1177,7 @@ correct.
11711177

11721178
Each \var{conv} adjusts the calling convention to be used.
11731179
\scheme{foreign-callable} supports the same conventions as
1174-
\scheme{foreign-procedure} with the exception of \scheme{__com}.
1180+
\scheme{foreign-procedure} except for \scheme{__com} and \scheme{__atomic}.
11751181
The \scheme{__collect_safe} convention for a callable activates a
11761182
calling thread if the thread is not already activated, and
11771183
the thread's activation state is reverted when the callable

mats/fl.ms

Lines changed: 51 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1147,23 +1147,24 @@
11471147
(begin
11481148
(define-syntax check-loop-allocation
11491149
(syntax-rules ()
1150-
[(_ proc) ; proc should allocate only its result flonum
1150+
[(_ proc val-example) ; proc should allocate only its result flonum
11511151
(or (eq? (current-eval) interpret)
11521152
(#%$suppress-primitive-inlining)
11531153
(let ([before (+ (bytes-allocated) (bytes-deallocated))]
11541154
[N 100000])
11551155
(and
11561156
(box?
1157-
(let loop ([i N] [bx (box 0.0)])
1157+
(let loop ([i N] [bx (box val-example)])
11581158
(if (zero? i)
11591159
bx
11601160
(loop (sub1 i) (let ([v (unbox bx)])
11611161
(box (proc v)))))))
11621162
(let ([allocated (- (+ (bytes-allocated) (bytes-deallocated)) before)]
1163-
[expected (* N (+ (compute-size 1.0)
1163+
[expected (* N (+ (compute-size val-example)
11641164
(compute-size (box #f))))])
11651165
(printf "~s ~s\n" allocated expected)
1166-
(<= expected allocated (* 1.2 expected))))))]))
1166+
(<= expected allocated (* 1.2 expected))))))]
1167+
[(_ proc) (check-loop-allocation proc 1.0)]))
11671168
#t)
11681169

11691170
(check-loop-allocation (lambda (v) (fl+ v v)))
@@ -1239,6 +1240,13 @@
12391240
(fl+ v 1.0)
12401241
(fl- v 1.0))))
12411242

1243+
(or (and (not (enable-cp0))
1244+
(not (= 3 (optimize-level))))
1245+
(check-loop-allocation (lambda (v)
1246+
(fl-make-rectangular (fl- (cfl-real-part v) 1.0)
1247+
(fl+ 1.0 (cfl-imag-part v))))
1248+
1.0-3.0i))
1249+
12421250
(check-loop-allocation (lambda (v)
12431251
;; The two single-argument `fl+`s here should work as
12441252
;; a hint for unboxing in the loop
@@ -1275,6 +1283,45 @@
12751283
(set-pseudo-random-generator-x12! s v)
12761284
(pseudo-random-generator-x20 s)))))))
12771285

1286+
(let ()
1287+
(define-ftype V2d
1288+
(struct (x double) (y double)))
1289+
(define v2d (make-ftype-pointer V2d (foreign-alloc (ftype-sizeof V2d))))
1290+
(check-loop-allocation (lambda (v)
1291+
(fl+ (ftype-ref V2d (x) v2d)
1292+
(ftype-ref V2d (y) v2d)))))
1293+
1294+
(let ()
1295+
(define-ftype V2d
1296+
(struct (x double) (y double)))
1297+
(define v2d (make-ftype-pointer V2d (foreign-alloc (ftype-sizeof V2d))))
1298+
(check-loop-allocation (lambda (v)
1299+
(ftype-set! V2d (x) v2d v)
1300+
(ftype-set! V2d (y) v2d v)
1301+
(fl- v))))
1302+
1303+
(let ()
1304+
(define-ftype V2f
1305+
(struct (x float) (y float)))
1306+
(define v2f (make-ftype-pointer V2f (foreign-alloc (ftype-sizeof V2f))))
1307+
(check-loop-allocation (lambda (v)
1308+
(ftype-set! V2f (x) v2f v)
1309+
(ftype-set! V2f (y) v2f v)
1310+
(fl- v))))
1311+
1312+
(let ()
1313+
(define-ftype V2f
1314+
(struct (x float) (y float)))
1315+
(define v2f (make-ftype-pointer V2f (foreign-alloc (ftype-sizeof V2f))))
1316+
(check-loop-allocation (lambda (v)
1317+
(fl+ (ftype-ref V2f (x) v2f)
1318+
(ftype-ref V2f (y) v2f)))))
1319+
1320+
(or (not (enable-cp0))
1321+
(let ([my-flsin (foreign-procedure __atomic "(cs)sin" (double) double)])
1322+
(check-loop-allocation (lambda (v)
1323+
(my-flsin (my-flsin v))))))
1324+
12781325
(begin
12791326
(define many-compare
12801327
(lambda (a b c d e f g h i j k)

mats/foreign.ms

Lines changed: 78 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -179,6 +179,8 @@
179179
ls)])
180180
(syntax (= (let ([x (foreign-procedure name (p ...) double-float)])
181181
(x v ...))
182+
(let ([x (foreign-procedure __atomic name (p ...) double-float)])
183+
(x v ...))
182184
(+ v ...)))))))))
183185

184186
(define foreign1.so (let ([p (find-source "foreign1.so")])
@@ -322,12 +324,23 @@
322324
(error? (begin (foreign-procedure 'foo () scheme-object) 'q))
323325
(error? (if (foreign-procedure 'foo () scheme-object) 'q 'q))
324326

327+
(error? (lambda () (foreign-procedure __bad_conv "foo" () scheme-object)))
328+
(error? (lambda () (foreign-procedure __atomic __collect_safe "foo" () scheme-object)))
329+
325330
(eq? 'foo ((foreign-procedure "idiptr" (scheme-object) scheme-object) 'foo))
326331

332+
(procedure? (foreign-procedure __atomic "idiptr" (scheme-object) scheme-object))
333+
(error? (foreign-procedure __atomic "i do not exist" (scheme-object) scheme-object))
334+
(eq? 'foo ((foreign-procedure __atomic "idiptr" (scheme-object) scheme-object) 'foo))
335+
327336
(parameterize ([current-eval interpret])
328337
(eq? 'foo ((foreign-procedure "idiptr" (scheme-object) scheme-object) 'foo)))
329338

330339
(not (eq? 'foo ((foreign-procedure "idiptr" (scheme-object) void) 'foo)))
340+
(not (eq? 'foo ((foreign-procedure __atomic "idiptr" (scheme-object) void) 'foo)))
341+
342+
(parameterize ([current-eval interpret])
343+
(eq? 'foo ((foreign-procedure __atomic "idiptr" (scheme-object) scheme-object) 'foo)))
331344

332345
(begin (define idint32 (foreign-procedure "id" (integer-32) integer-32))
333346
(procedure? idint32))
@@ -348,6 +361,25 @@
348361
(error? (idint32 #f))
349362
(error? (idint32 "hi"))
350363

364+
(begin (define idint32-a (foreign-procedure __atomic "id" (integer-32) integer-32))
365+
(procedure? idint32-a))
366+
(eqv? (idint32-a 0) 0)
367+
(eqv? (idint32-a #x7fffffff) #x7fffffff)
368+
(eqv? (idint32-a -1) -1)
369+
(eqv? (idint32-a #x-7fffffff) #x-7fffffff)
370+
(eqv? (idint32-a #x-80000000) #x-80000000)
371+
(eqv? (idint32-a #x80000000) (+ #x-100000000 #x80000000))
372+
(eqv? (idint32-a #x80000001) (+ #x-100000000 #x80000001))
373+
(eqv? (idint32-a #xffffffff) (+ #x-100000000 #xffffffff))
374+
(error? (idint32-a #x100000000))
375+
(error? (idint32-a #x100000001))
376+
(error? (idint32-a #xfffffffffffffffffffffffffffff))
377+
(error? (idint32-a #x8000000080000000))
378+
(error? (idint32-a #x-80000001))
379+
(error? (idint32-a #x-8000000080000000))
380+
(error? (idint32-a #f))
381+
(error? (idint32-a "hi"))
382+
351383
(begin (define iduns32 (foreign-procedure "id" (unsigned-32) unsigned-32))
352384
(procedure? iduns32))
353385
(eqv? (iduns32 0) 0)
@@ -372,6 +404,9 @@
372404
(eqv? #xffffffff ((foreign-procedure "id" (integer-32) unsigned-32) -1))
373405
(eqv? -1 ((foreign-procedure "id" (unsigned-32) integer-32) #xffffffff))
374406

407+
(eqv? #xffffffff ((foreign-procedure __atomic "id" (integer-32) unsigned-32) -1))
408+
(eqv? -1 ((foreign-procedure __atomic "id" (unsigned-32) integer-32) #xffffffff))
409+
375410
(begin (define idfix (foreign-procedure "idiptr" (fixnum) fixnum))
376411
(procedure? idfix))
377412
(eqv? 0 (idfix 0))
@@ -395,6 +430,11 @@
395430
(error? (foreign-procedure "id" () chare))
396431
(error? (foreign-procedure "id" (void) char))
397432
433+
(error? (foreign-procedure __atomic "id" (booleen) char))
434+
(error? (foreign-procedure __atomic "id" (integer-32 integer-34) char))
435+
(error? (foreign-procedure __atomic "id" () chare))
436+
(error? (foreign-procedure __atomic "id" (void) char))
437+
398438
((foreign-procedure "id" (boolean) boolean) #t)
399439
(not ((foreign-procedure "id" (boolean) boolean) #f))
400440
((foreign-procedure "id" (boolean) boolean) 0)
@@ -404,6 +444,12 @@
404444
(not ((foreign-procedure "id" (integer-32) boolean) 0))
405445
((foreign-procedure "id" (integer-32) boolean) 1)
406446
447+
(= 1 ((foreign-procedure __atomic "id" (boolean) integer-32) #t))
448+
(= 1 ((foreign-procedure __atomic "id" (boolean) integer-32) 0))
449+
(= 0 ((foreign-procedure __atomic "id" (boolean) integer-32) #f))
450+
(not ((foreign-procedure __atomic "id" (integer-32) boolean) 0))
451+
((foreign-procedure __atomic "id" (integer-32) boolean) 1)
452+
407453
(char=? #\a ((foreign-procedure "id" (char) char) #\a))
408454
(= 0 ((foreign-procedure "id" (char) integer-32) #\nul))
409455
(char=? #\nul ((foreign-procedure "id" (integer-32) char) 0))
@@ -436,6 +482,12 @@
436482
(= 0 ((foreign-procedure (if (windows?) "windows_strcmp" "strcmp") (u8* string) integer-32) bv s)))))
437483
(error? ((foreign-procedure "id" (string) void) 'foo))
438484

485+
(let ([id1 (foreign-procedure __atomic "idiptr" (string) string)]
486+
[id2 (foreign-procedure __atomic "idiptr" (string) iptr)]
487+
[id3 (foreign-procedure __atomic "idiptr" (iptr) string)])
488+
(and (eq? (id1 #f) #f) (eq? (id2 #f) 0) (eq? (id3 0) #f)))
489+
(error? ((foreign-procedure __atomic "id" (string) void) 'foo))
490+
439491
(= ((foreign-procedure "idid" (integer-32) integer-32) #xc7c7c7) #xc7c7c7)
440492
(= ((foreign-procedure "ididid" (integer-32) integer-32) #x7c7c7c7c)
441493
#x7c7c7c7c)
@@ -444,6 +496,10 @@
444496
#x80000000)
445497
(= ((foreign-procedure "id" (unsigned-32) integer-32) #x80000000)
446498
#x-80000000)
499+
(= ((foreign-procedure __atomic "id" (unsigned-32) unsigned-32) #x80000000)
500+
#x80000000)
501+
(= ((foreign-procedure __atomic "id" (unsigned-32) integer-32) #x80000000)
502+
#x-80000000)
447503
448504
(error? (foreign-procedure 'abcde (integer-32) integer-32))
449505
(let ([template
@@ -460,6 +516,10 @@
460516
(= 1.1 ((foreign-procedure "float_id" (double-float) double-float) 1.1))
461517
(error? ((foreign-procedure "float_id" (double-float) void) 0))
462518

519+
(= 0.0 ((foreign-procedure __atomic "float_id" (double-float) double-float) 0.0))
520+
(= 1.1 ((foreign-procedure __atomic "float_id" (double-float) double-float) 1.1))
521+
(error? ((foreign-procedure __atomic "float_id" (double-float) void) 0))
522+
463523
(let ([fid (foreign-procedure "float_id" (double-float) double-float)])
464524
(let f ((n 10000))
465525
(or (= n 0)
@@ -475,13 +535,28 @@
475535
integer-32)
476536
29 31 37 41 43 47 49 53 59 61))
477537

538+
(= (+ (* 1 29) (* 2 31) (* 3 37) (* 5 41) (* 7 43)
539+
(* 11 47) (* 13 49) (* 17 53) (* 19 59) (* 23 61))
540+
((foreign-procedure __atomic "testten"
541+
(integer-32 integer-32 integer-32 integer-32 integer-32
542+
integer-32 integer-32 integer-32 integer-32 integer-32)
543+
integer-32)
544+
29 31 37 41 43 47 49 53 59 61))
545+
478546
(= (+ 1.1 2.2 3.3 4.4 5.5 6.6 7.7 8.8)
479547
((foreign-procedure "flsum8"
480548
(double-float double-float double-float double-float
481549
double-float double-float double-float double-float)
482550
double-float)
483551
1.1 2.2 3.3 4.4 5.5 6.6 7.7 8.8))
484552

553+
(= (+ 1.1 2.2 3.3 4.4 5.5 6.6 7.7 8.8)
554+
((foreign-procedure __atomic "flsum8"
555+
(double-float double-float double-float double-float
556+
double-float double-float double-float double-float)
557+
double-float)
558+
1.1 2.2 3.3 4.4 5.5 6.6 7.7 8.8))
559+
485560
(= (+ 1 2 3 4 5 6.75 7 8.5)
486561
((foreign-procedure "sparcfltest"
487562
(integer-32 integer-32 integer-32 integer-32
@@ -2720,7 +2795,7 @@
27202795
x)
27212796
43)
27222797
(equal?
2723-
(let ()
2798+
(let ([keeps '()])
27242799
; foreign_callable example adapted from foreign.stex
27252800
(define cb-init
27262801
(foreign-procedure "cb_init" () void))
@@ -2732,6 +2807,7 @@
27322807
(define callback
27332808
(lambda (p)
27342809
(let ([code (foreign-callable p (char) void)])
2810+
(set! keeps (cons code keeps))
27352811
(foreign-callable-entry-point code))))
27362812
(let ()
27372813
(define ouch
@@ -2750,6 +2826,7 @@
27502826

27512827
(parameterize ([current-output-port (open-output-string)])
27522828
(event-loop "abcde")
2829+
(keep-live keeps)
27532830
(get-output-string (current-output-port)))))
27542831
(format "Ouch! Hit by 'a'~%Rats! Received 'c'~%Ouch! Hit by 'e'~%"))
27552832
; make sure foreign-procedure's code-object is properly locked when

0 commit comments

Comments
 (0)