
Implement register-based closure ctx #1568

Closed
cpunion wants to merge 5 commits into goplus:main from cpunion:closure-ctxreg

Conversation


cpunion (Collaborator) commented Jan 16, 2026

Summary

This PR updates LLGo’s closure ABI and calling convention: use a reserved register for ctx when available; otherwise pass ctx as an implicit first parameter with a conditional call; represent closures as pointers to funcval with inline env.

ABI / Representation

  • Closures are represented as type funcval struct { fn *func; hasCtx uintptr; env ... } and type closure = *funcval; the ABI only sees the pointer.
  • hasCtx keeps the header size fixed and supports conditional calls on no‑reg targets.
  • env is inline after the header; plain functions have no env (object size = 2 pointers).
  • C function pointers are first‑class: a funcval can point directly at a C symbol, so the call site can use the real address without wrapper stubs.

Calling Convention

  • With a ctx register: write_ctx(env_ptr) then call fn(args...); env_ptr = closure_ptr + 2*ptrSize.
  • Without a ctx register: conditionally call fn(ctx, args...) vs fn(args...) based on hasCtx.

getClosurePtr

  • getClosurePtr returns &env[0] (pointer to the first env slot).
  • On ctx‑register targets it reads the ctx register to get the env base.
  • On no‑reg targets it uses the explicit ctx parameter as the env base.

Context Register Mapping

GOARCH    Register  Notes
amd64     mm0       use -msse2 to free MMX
386       mm0       use -msse2 to free MMX
arm64     x26       reserved via clang target-feature
riscv64   x27       reserved via clang target-feature
wasm      -         conditional ctx param
arm       -         conditional ctx param

Native builds reserve the ctx reg via clang target-feature +reserve-<reg> (arm64/riscv64).
For caller-saved x86, inline asm uses a memory clobber; callee-saved targets do not.

Example IR (closure + C func)

Example Go code:

func cfunc(i int64)

func main() {
  var fn func(i int64)
  fn = cfunc
  fn(0)

  var i int64 = 0
  fn = func(v int64) { i = v }
  fn(0)
}

With ctx register (arm64/riscv64/x86*)

Caller (main) writes ctx register and calls the real function symbol:

; fn = cfunc
store ptr @__llgo_closure_const$cfunc, ptr %fn_slot
; fn(0)
%fv = load ptr, ptr %fn_slot
%fnptr = load ptr, ptr %fv
%env_base = getelementptr i8, ptr %fv, i64 16  ; &env[0]
%env_i64 = ptrtoint ptr %env_base to i64
call void @llvm.write_register.i64(metadata !"x26", i64 %env_i64)
call void %fnptr(i64 0)

; fn = closure
%fv2 = call ptr @AllocU(i64 24)          ; {fn,hasCtx,env0}
store ptr @main$1, ptr %fv2
%hasctx_p = getelementptr i8, ptr %fv2, i64 8
store i64 1, ptr %hasctx_p                     ; hasCtx = 1
%env0_p = getelementptr i8, ptr %fv2, i64 16
store ptr %i, ptr %env0_p                      ; env[0] = &i
store ptr %fv2, ptr %fn_slot
; fn(0) ... same pattern ...

Closure body (main$1) reads ctx register at entry:

define void @main$1(i64 %v) {
entry:
  %env_base = call i64 @llvm.read_register.i64(metadata !"x26")
  %env0p = inttoptr i64 %env_base to ptr
  %i_ptr = load ptr, ptr %env0p
  store i64 %v, ptr %i_ptr
  ret void
}

C function remains a normal symbol:

declare void @cfunc(i64)

Without ctx register (wasm/arm)

Caller (main) branches on hasCtx to pick the correct signature:

%hasctx_p = getelementptr i8, ptr %fv, i64 8
%has = load i64, ptr %hasctx_p
%has_i1 = icmp ne i64 %has, 0
br i1 %has_i1, label %with, label %plain

with:
  %fnptr = load ptr, ptr %fv
  %env_base = getelementptr i8, ptr %fv, i64 16
  call void %fnptr(ptr %env_base, i64 0)   ; fn(ctx, args...)
  br label %done

plain:
  %fnptr2 = load ptr, ptr %fv
  call void %fnptr2(i64 0)                 ; fn(args...)
  br label %done

Closure body (main$1) takes an explicit ctx parameter:

define void @main$1(ptr %env_base, i64 %v) {
entry:
  %i_ptr = load ptr, ptr %env_base
  store i64 %v, ptr %i_ptr
  ret void
}

__llgo_closure_const$...

These are constant closure objects in read‑only data. They:

  1. carry env without heap allocation, and
  2. provide stable, deduplicated closure identities for type metadata (map hash/eq helpers, etc.).

Discussion: Alternative Layout (difference only)

Alternative split layout:

type closure struct {
  fn ptr
  data *struct { hasCtx bool; env ... }
}

Differences vs *funcval (no value judgment):

  • The closure value is always two words, and the data object contains hasCtx + env.
  • Call sites need one extra indirection (data) to reach env.
  • Constant closures can be represented as two constants (closure + data), instead of a single funcval constant.
  • The data object may be heap allocated or embedded in other objects depending on escape/placement.

Covered Scenarios

Plain funcs, captured closures, method values/expressions, interface method values, varargs, go/defer, C callbacks.

@gemini-code-assist

Note

The number of changes in this pull request is too large for Gemini Code Assist to generate a summary.


xgopilot bot commented Jan 16, 2026

Code Review Summary

This PR implements a well-designed register-based closure context mechanism. The architecture is sound with appropriate register selection (callee-saved registers), proper save/restore semantics, and comprehensive test coverage.

Strengths:

  • Comprehensive test coverage across all closure patterns
  • Register pollution tests for both C interop and nested closures
  • Proper thread safety through per-goroutine register isolation
  • Efficient wrapper generation with tail call optimization

Minor Issues to Address:

  • Unused noop() function in regpollute/in.go
  • Missing direct method call tests in go/in.go vs defer/in.go
  • Documentation could clarify wrapper naming variations

Overall, this is a solid implementation ready for merge after addressing minor issues.

cpunion force-pushed the closure-ctxreg branch 3 times, most recently from 26e9b99 to b866d12 on January 16, 2026 at 00:23

codecov bot commented Jan 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.01%. Comparing base (5899edf) to head (cfdce00).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1568      +/-   ##
==========================================
- Coverage   91.01%   90.01%   -1.01%     
==========================================
  Files          45       47       +2     
  Lines       11971    12328     +357     
==========================================
+ Hits        10896    11097     +201     
- Misses        899     1036     +137     
- Partials      176      195      +19     


cpunion force-pushed the closure-ctxreg branch 3 times, most recently from 314b808 to 9c9df88 on January 17, 2026 at 11:06
cpunion force-pushed the closure-ctxreg branch 3 times, most recently from eccba46 to 1107d49 on January 22, 2026 at 01:12
if info.Name == "" {
return nil
}
return []string{"-mllvm", "--reserve-regs-for-regalloc=" + info.Name}

@zhouguangyuan0718 zhouguangyuan0718 Jan 30, 2026


This option is AArch64-only; it will not work for other archs (reference here). The LLVM option to reserve a register is not the same across archs. For AArch64 and RISC-V 64, use the command option "-mattr=+reserve-x26" (references here and here), or set the "target-features" attribute on the function, like:

define void @reserve_x26() "target-features"="+neon,+reserve-x26"

Unfortunately, as far as I know, a similar feature is not supported for x86.

Collaborator Author


Thanks for the note — I rechecked this across targets with both clang and llc.

  • -mllvm --reserve-regs-for-regalloc= works across all tested platforms. Ubuntu amd64 CI already passes with this.
  • -mattr=+reserve- is llc-only. clang does not accept -mattr (it always reports “unknown argument”).
  • -ffixed-* and +reserve-* are not portable: some targets reject them outright, others ignore the feature.

So the most portable path is still -mllvm --reserve-regs-for-regalloc=.... -mattr only makes sense when driving llc
directly, and -ffixed/+reserve are target‑specific.

Evidence (x86_64/i386):

x86_64: -ffixed (clang)

clang -target x86_64-unknown-linux-gnu -x ir -c x86_readwrite.ll -o /tmp/a.o -ffixed-r12

clang: error: unknown argument '-ffixed-r12'; did you mean '-ffixed-r19'?

x86_64: +reserve (clang)

clang -target x86_64-unknown-linux-gnu -x ir -c x86_readwrite.ll -o /tmp/a.o
-Xclang -target-feature -Xclang +reserve-r12

'+reserve-r12' is not a recognized feature for this target (ignoring feature)
fatal error: error in backend: Invalid register name global variable

i386: -ffixed (clang)

clang -target i386-unknown-linux-gnu -x ir -c i386_readwrite.ll -o /tmp/a.o -ffixed-esi

clang: error: unknown argument: '-ffixed-esi'

i386: +reserve (clang)

clang -target i386-unknown-linux-gnu -x ir -c i386_readwrite.ll -o /tmp/a.o
-Xclang -target-feature -Xclang +reserve-esi

'+reserve-esi' is not a recognized feature for this target (ignoring feature)

x86_64: --reserve-regs-for-regalloc (clang, simple IR)

clang -target x86_64-unknown-linux-gnu -x ir -c simple.ll -o /tmp/a.o
-mllvm --reserve-regs-for-regalloc=r12

(no error)

Extra note on -mattr:
-mattr=+reserve-* is accepted by llc (e.g. AArch64/RISC‑V), but clang does not accept -mattr at all:

clang: error: unknown argument: '-mattr=+reserve-x26'

}{
"amd64": {writeFmt: "mov \\$0, %%%s", readFmt: "mov %%%s, \\$0"},
"386": {writeFmt: "mov \\$0, %%%s", readFmt: "mov %%%s, \\$0"},
"arm64": {writeFmt: "mov %s, \\$0", readFmt: "mov \\$0, %s"},

@zhouguangyuan0718 zhouguangyuan0718 Jan 30, 2026


See the previous comment. If we set aside the support issue on amd64 and use "+reserve-x26" to reserve the register, then using an LLVM intrinsic to access the register is better; see llvm.read_register and llvm.write_register. It avoids the redundant move instruction and the side effect.

Collaborator Author


  1. llvm.read_register / llvm.write_register support
  • x86_64: not supported (fails with Invalid register name global variable)
  • i386: works (tested with esi)
  • AArch64: works (tested with x26)
  • RISC‑V 32/64: works (tested with x27)
  • Other targets (ARMv7/MIPS/PPC/S390x/WASM/AVR/Xtensa, etc.) are untested; LangRef warns allocatable GPR support is limited.
  2. IR becomes longer + needs a memory barrier
  • Inline asm has ~{memory} baked in, which is a compiler‑level barrier.
  • If we switch to intrinsics, we must add explicit memory clobber to keep the same ordering.
  • Minimal equivalent is 2 IR instructions per read/write:
  call void asm sideeffect "", "~{memory}"()
  %v = call i64 @llvm.read_volatile_register.i64(metadata !0)

  call void asm sideeffect "", "~{memory}"()
  call void @llvm.write_register.i64(metadata !0, i64 %val)
  (So read+write becomes 4 IR instructions instead of 2.)

Because x86_64 does not support the intrinsics, and the intrinsic path adds extra IR (requires memory barriers), it is safer
and simpler to keep the current inline asm approach rather than switching.


@zhouguangyuan0718 zhouguangyuan0718 Feb 1, 2026


  1. Support on x86 should be a separate issue; for now only these registers are supported. For AArch64/RISC-V, if target-feature is used to reserve the register, llvm.read_register / llvm.write_register should work.
  2. Why do we need the memory barrier? IMO the final assembly instruction order may be scheduled differently from the IR order, but that should not break the semantics. Or is there some other reason?
    Also, the IR instruction count is not the same as the assembly instruction count. With inline asm, the asm code in the inline-asm expression is kept in the final assembly; with the intrinsic, no extra instruction is generated in the assembly, and the reserved register is used directly.
    Which means:

InlineASM:

  %1 = call ptr asm sideeffect "mov $0, x26", "=r,~{memory}"()
  %2 = load { ptr }, ptr %1, align 8

==>

mov x0, x26 // x0 is selected by llvm, maybe other, but this mov instruction will not be eliminated
ldr x1, [x0]

intrinsic:

  %1 = call i64 @llvm.read_register.i64(metadata !7)
  %2 = load { ptr }, ptr %1, align 8

  !7 = !{!"x26"}

==>

ldr x1, [x26]   // x1 is selected by llvm, maybe other, but there is no mov instruction, the reserved register can be accessed directly.

Collaborator Author


Updated. For caller-saved x86, inline asm uses a memory clobber; callee-saved targets do not.

dialect: llvm.InlineAsmDialectATT,
},
"arm64": {
write: "mov %s, $0",

@zhouguangyuan0718 zhouguangyuan0718 Jan 30, 2026


Same as #1568 (comment).

@cpunion cpunion marked this pull request as draft February 2, 2026 16:14
@cpunion cpunion closed this Feb 3, 2026


3 participants