Description
Background
In PR #662, the CI matrix fails only on macos amd64 with llgo 0.12.2.
Other matrix targets (linux arm64/amd64, macos arm64) do not show this failure.
Failing test:
--- FAIL: TestEnd2End/sqlite3/3.49.1 (19.45s)
Failure Symptom
From the failed artifact (macos-15-intel-log.zip):
panic: close /var/folders/.../compose_1723566520.h: bad file descriptor
[0x02352E6D github.com/goplus/llcppg/cmd/llcppg.main+0x28, SP = 0x70d]
[0x02355A67 main+0x4, SP = 0x27]
[0x0F9AC530 start+0x5, SP = 0xbf0]
This aligns with the TestEnd2End/sqlite3/3.49.1 failure window.
Environment
- OS: macos-15-intel (darwin/amd64)
- llgo: v0.12.2-0.20260210235731-9c8b6b3df1ac (darwin/amd64)
- Go (control): go1.23.6 darwin/amd64
Minimal Reproduction
This reproducer does not depend on llcppg logic. It only does concurrent CreateTemp + Close + Remove.
package main

import (
	"fmt"
	"os"
	"sync"
)

const (
	goroutines = 2
	iterations = 1000
)

// worker creates, closes, and removes n temp files, reporting the first error.
func worker(n int, errs chan<- error) {
	for i := 0; i < n; i++ {
		f, err := os.CreateTemp("", "compose_*.h")
		if err != nil {
			errs <- err
			return
		}
		name := f.Name()
		if err := f.Close(); err != nil {
			errs <- fmt.Errorf("close %s: %w", name, err)
			return
		}
		if err := os.Remove(name); err != nil {
			errs <- err
			return
		}
	}
}

func main() {
	// Buffered so each worker can report one error without blocking.
	errs := make(chan error, goroutines)
	var wg sync.WaitGroup
	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			worker(iterations, errs)
		}()
	}
	wg.Wait()
	close(errs)
	for err := range errs {
		if err != nil {
			panic(err)
		}
	}
	fmt.Printf("ok goroutines=%d iterations=%d\n", goroutines, iterations)
}

Run:
# Fails under llgo (concurrent case)
llgo run repro.go
# panic: close ... bad file descriptor
# Passes under Go toolchain
go run repro.go
# ok goroutines=2 iterations=1000
Additional Repro (single-thread, deterministic)
The issue is reproducible even without concurrency:
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

func main() {
	p := filepath.Join(os.TempDir(), fmt.Sprintf("open-fail-%d.tmp", os.Getpid()))
	// Create the file first so that the O_EXCL open below must fail with EEXIST.
	_ = os.WriteFile(p, []byte("x"), 0600)
	defer os.Remove(p)
	fd, err := syscall.Open(p, syscall.O_CREAT|syscall.O_EXCL|syscall.O_RDWR, 0600)
	fmt.Printf("fd=%d hex=%#x err=%v\n", fd, uint64(fd), err)
}

Observed output:
# go run
fd=4294967295 hex=0xffffffff err=file exists
# llgo run
fd=4294967295 hex=0xffffffff err=<nil>
This means the syscall failed (invalid fd 0xffffffff), but err was lost under llgo.
Root Cause Analysis (likely)
- os.CreateTemp internally relies on OpenFile(..., O_CREATE|O_EXCL, ...).
- Under contention, open may return EEXIST (normal behavior).
- With llgo on darwin/amd64, this failure can be interpreted as err=nil while returning an invalid fd (0xffffffff).
- The invalid fd then propagates into *os.File, and later operations (Write/Stat/Close) fail with bad file descriptor.
- Concurrency amplifies filename-collision opportunities, so the bug appears mostly in concurrent test runs.
Likely implementation mismatch in llgo syscall lowering:
- errno is derived only when r1 == ^uintptr(0).
- On this path, failing C calls may produce a 32-bit -1 shape (0x00000000ffffffff) rather than the full-width ^uintptr(0) (0xffffffffffffffff) on amd64.
- errno is then not captured, causing err=nil on failure.
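The suspected mismatch can be illustrated in isolation. The following is a minimal sketch, not llgo's actual lowering code: the helper names failedFullWidth and failed32 are hypothetical, and the value 0xffffffff is taken from the observed output above. It only shows why a full-width comparison would miss a C int -1 that arrives zero-extended in a 64-bit register.

```go
package main

import "fmt"

// failedFullWidth mirrors the check described above: the call is treated
// as failed only when every bit of r1 is set. (Hypothetical helper name.)
func failedFullWidth(r1 uintptr) bool {
	return r1 == ^uintptr(0)
}

// failed32 treats r1 as a C int: only the low 32 bits are compared with -1.
// (Hypothetical helper name; one possible way to detect the failure.)
func failed32(r1 uintptr) bool {
	return int32(r1) == -1
}

func main() {
	// A C function returning int -1 whose result is zero-extended
	// into a 64-bit register, as in the observed fd value.
	r1 := uintptr(0xffffffff)
	fmt.Printf("r1=%#x fullWidth=%v low32=%v\n", r1, failedFullWidth(r1), failed32(r1))
}
```

On a 64-bit target the full-width check reports no failure for this value while the 32-bit check does, which matches the err=<nil> symptom. Whether llgo should compare the low 32 bits or sign-extend the return value is a design decision for the runtime and is not settled by this sketch.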
Expected Behavior
No panic; temp file descriptors should remain valid and close cleanly under concurrency.
Actual Behavior
On llgo v0.12.2 + darwin/amd64, concurrent temp-file usage can panic with bad file descriptor.
Impact on llcppg
End-to-end concurrent tests (notably sqlite3) can fail on this specific toolchain/platform combination.
Temporary Workarounds
- CI: reduce concurrency or skip macos amd64 + llgo 0.12.2.
- Code: avoid holding temp-file descriptors longer than necessary (helpful, but does not fully eliminate a runtime-level issue).
Artifacts
- /Users/heulucklu/Downloads/logs_58617825193.zip
- /Users/heulucklu/Downloads/macos-15-intel-log.zip