fix(build): ensure linker always consumes .a archives #1526

Merged
xushiwei merged 14 commits into goplus:main from cpunion:fix/cache-archives-v2 on Jan 3, 2026

Conversation

@cpunion
Collaborator

@cpunion cpunion commented Jan 2, 2026

Problem

  1. Inconsistent linker inputs: First build links .o files, subsequent builds link .a archives from cache. This violates build reproducibility.
  2. Wasm cache failure: Wasm builds fail on cache hit because system ar cannot create valid wasm archive indexes that wasm-ld can read.
  3. Unclear data flow: The LLFiles field could contain .o, .ll, or .a files depending on context, making code hard to understand.
  4. -gen-llfiles broken: When enabled, .ll files were packed into .a archives which the linker cannot process.

Fixes #1520

Solution

  • Refactor aPackage with clearer fields:
    • ObjFiles - object files from compiler (.o)
    • ArchiveFile - archive file path for linking (.a)
  • Single-direction data flow:
    .go → .ll → .o → .a → link
    
  • For wasm targets, prefer llvm-ar from PATH (system ar cannot create valid wasm archives)
  • Route archive creation through ctx.archiver() which checks toolchain directory first for cross-compilation
  • -gen-llfiles now copies .ll files for debugging instead of using them for linking
  • Renamed objFiles to linkInputs for clarity (contains both .a archives and .o from main module)
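
The single-direction flow above can be sketched as follows. This is a minimal illustration, not the PR's code: `pkgArtifacts` and `collectLinkInputs` are hypothetical names, and only the field names `ObjFiles` and `ArchiveFile` and the variable name `linkInputs` come from the description.

```go
package main

// pkgArtifacts is a hypothetical stand-in for the two aPackage fields
// named in the description.
type pkgArtifacts struct {
	PkgPath     string
	ObjFiles    []string // .o files produced by the compiler
	ArchiveFile string   // .a path handed to the linker; empty if the package has no code
}

// collectLinkInputs returns one archive per package, followed by the
// main module's object files, which are linked directly.
func collectLinkInputs(pkgs []pkgArtifacts, mainObjs []string) []string {
	linkInputs := make([]string, 0, len(pkgs)+len(mainObjs))
	for _, p := range pkgs {
		if p.ArchiveFile == "" {
			continue // no compiled code, nothing to link
		}
		linkInputs = append(linkInputs, p.ArchiveFile)
	}
	return append(linkInputs, mainObjs...)
}
```

Because the linker only ever sees `ArchiveFile` for dependencies, a first build and a cache-hit build should present the same kind of inputs.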

Cache Verification Tests

Added comprehensive build cache verification with snapshot-based testing:

Test Infrastructure

  • Location: test/buildcache/
  • Test Script: test/buildcache/test.sh with 6 test scenarios × 3 modes = 18 tests

Test Modes

| Mode | Build Command | Verification |
| --- | --- | --- |
| Native | llgo build -o buildcache.out . | Execute binary |
| WASM | GOOS=wasip1 GOARCH=wasm llgo build -tags=nogc . | Run with iwasm |
| ESP32-C3 | llgo build -target=esp32c3 . | Check ELF output |

Test Scenarios (each mode)

  1. First build - All packages show CACHE MISS
  2. Second build - Dependencies show CACHE HIT, main shows MISS
  3. Force rebuild (-a) - All packages show CACHE MISS
  4. Dependency change - Modified dep1 invalidates dep2/dep3 (fingerprint cascade)
  5. Partial cache clear (dep2) - Only cleared package rebuilds
  6. Partial cache clear (dep2+dep3) - Multiple cleared packages rebuild
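
To illustrate the fingerprint cascade in scenario 4, here is a minimal sketch of one way such a scheme can work. It assumes a package's fingerprint hashes its own source together with the fingerprints of its direct dependencies. The names `pkgNode` and `fingerprint` are hypothetical, and llgo's actual fingerprint inputs may differ.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"sort"
)

// pkgNode is a hypothetical package description.
type pkgNode struct {
	Source string   // stand-in for the package's source content
	Deps   []string // import paths of direct dependencies
}

// fingerprint hashes a package's source together with the fingerprints
// of its dependencies, so a change anywhere below a package changes its
// own fingerprint. The import graph is assumed to be acyclic.
func fingerprint(pkgs map[string]pkgNode, path string, memo map[string]string) string {
	if fp, ok := memo[path]; ok {
		return fp
	}
	node := pkgs[path]
	deps := append([]string(nil), node.Deps...)
	sort.Strings(deps) // make the result independent of import order
	h := sha256.New()
	h.Write([]byte(node.Source))
	for _, d := range deps {
		h.Write([]byte{0}) // separator between fields
		h.Write([]byte(fingerprint(pkgs, d, memo)))
	}
	fp := hex.EncodeToString(h.Sum(nil))
	memo[path] = fp
	return fp
}
```

With dep2 importing dep1 and dep3 importing dep2, editing dep1 changes all three fingerprints, while an unrelated package keeps its cache entry.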

Features

  • ✅ Snapshot-based verification (not just hit count)
  • ✅ Explicit error checking for all build and run commands
  • ✅ Auto-builds iwasm with CI-compatible options if missing (dev/build_iwasm.sh)
  • ✅ All build outputs go to temp directory (keeps source clean)
  • ✅ Cross-platform support (macOS, Linux)
  • ✅ Integrated into CI (build-cache.yml) and local CI (dev/local_ci.sh)

Verbose Output

Added CACHE HIT/CACHE MISS messages to llgo build -v output:

$ llgo build -v .
CACHE MISS: github.com/goplus/llgo/test/buildcache
CACHE HIT: github.com/goplus/llgo/test/buildcache/dep1
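
A snapshot check over this output could look like the sketch below. It assumes only the `CACHE HIT: <pkg>` / `CACHE MISS: <pkg>` line format shown above. `parseCacheStatus` and `matchesSnapshot` are hypothetical helpers, not part of test/buildcache/test.sh.

```go
package main

import (
	"bufio"
	"strings"
)

// parseCacheStatus maps each package path to "HIT" or "MISS" based on
// the verbose build output; all other lines are ignored.
func parseCacheStatus(output string) map[string]string {
	status := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		for _, kind := range []string{"HIT", "MISS"} {
			prefix := "CACHE " + kind + ": "
			if strings.HasPrefix(line, prefix) {
				status[strings.TrimPrefix(line, prefix)] = kind
			}
		}
	}
	return status
}

// matchesSnapshot reports whether got has exactly the packages and
// statuses recorded in want.
func matchesSnapshot(want, got map[string]string) bool {
	if len(want) != len(got) {
		return false
	}
	for pkg, kind := range want {
		if got[pkg] != kind {
			return false
		}
	}
	return true
}
```

Comparing per-package status, rather than counting hits, is what distinguishes a snapshot check from a simple hit-count check.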

Testing

go test ./internal/build -run SaveToCache
go install -tags=dev ./cmd/llgo

# Test native build (cache path shown is the macOS location)
rm -rf ~/Library/Caches/llgo/build
(cd _demo/c/helloc && llgo build -o helloc -v .)
(cd _demo/c/helloc && llgo build -o helloc -v .)  # second run with cache

# Test wasm build
rm -rf ~/Library/Caches/llgo/build/wasm-wasip1
(cd _demo/c/helloc && GOOS=wasip1 GOARCH=wasm llgo build -o hello -tags=nogc -v .)
(cd _demo/c/helloc && GOOS=wasip1 GOARCH=wasm llgo build -o hello -tags=nogc -v .)  # second run with cache

# Run comprehensive cache verification tests (18 tests), from the repository root
bash test/buildcache/test.sh

@gemini-code-assist

Summary of Changes

Hello @cpunion, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refines the build system's handling of compiled artifacts, specifically focusing on the consistent use and caching of .a archives. By introducing a normalization step for initial builds and enhancing the cache saving logic, it ensures that all builds, whether fresh or cached, operate with a standardized archive format. This change improves build reliability, especially for cached builds, while also accommodating cross-compilation scenarios and maintaining compatibility with WASM targets.

Highlights

  • Consistent Archive Usage: Introduced a new normalizeToArchive function to ensure that initial builds consistently produce .a archives from object files, aligning the output format with what cached builds reuse and simplifying the linker's input.
  • WASM Compatibility: The archive normalization process is explicitly skipped for WASM (WebAssembly) targets, maintaining compatibility and avoiding unnecessary processing for this specific architecture.
  • Enhanced Toolchain Integration: Archive creation now routes through the target toolchain via a new archiver() method, which dynamically selects the appropriate archiving tool (e.g., llvm-ar, ar), significantly improving support for cross-compilation scenarios.
  • Refined Cache Saving Logic: The cache saving mechanism has been updated to intelligently handle artifacts: it now prioritizes caching pre-existing .a archives or, if not present, creates one from .o or .ll files, ensuring the cache always stores normalized archives.
  • Atomic File Operations: A new copyFileAtomic utility function was added to ensure robust and safe file copying, particularly for cache operations, by using a temporary file and atomic rename to prevent data corruption.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a refactoring to consistently use archives for linking, both for cached and non-cached builds. The new normalizeToArchive function and changes to saveToCache are central to this. The logic to use llvm-ar for cross-compilation is a good enhancement. Overall, the changes are well-structured. However, I've identified a critical bug in the implementation of normalizeToArchive that could lead to missing files during the link stage. My review includes a suggested fix for this issue.

@xgopilot
Contributor

xgopilot bot commented Jan 2, 2026

Code Review Summary

This PR successfully implements archive normalization to ensure consistent linking behavior between cached and non-cached builds. The implementation uses proper atomic operations and error handling. However, there are several noteworthy issues that should be addressed:

Security Concerns:

  • Command injection risk: Object file paths should be validated before passing to the archiver to prevent argument injection attacks
  • Temporary files created in system temp directory without restrictive permissions

Performance:

  • Double archive creation: Packages undergo archiving twice (once in normalizeToArchive, again in saveToCache)
  • Temporary archives not cleaned up, accumulating disk usage

Documentation:

  • Missing explanation for WASM exclusion logic
  • Undocumented LLGO_AR environment variable
  • Inconsistent verbose parameter (variadic bool)

Code Quality:

  • Inconsistent WASM detection (string comparison vs substring search)
  • Missing test coverage for core normalization functions

See inline comments for specific recommendations.

@codecov

codecov bot commented Jan 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.04%. Comparing base (ef4877e) to head (5671a39).
⚠️ Report is 22 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1526      +/-   ##
==========================================
- Coverage   91.07%   91.04%   -0.04%     
==========================================
  Files          45       45              
  Lines       11996    11999       +3     
==========================================
- Hits        10925    10924       -1     
- Misses        895      899       +4     
  Partials      176      176              


@cpunion cpunion changed the title from "fix: reuse normalized archives without breaking wasm" to "fix(build): ensure linker always consumes .a archives" on Jan 2, 2026
cpunion and others added 2 commits January 2, 2026 22:56
- Translate Chinese comment to English for consistency
- Remove redundant nil check in archiver() method

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

wasm-ld requires archives created with llvm-ar because system ar
cannot create valid wasm archive symbol indexes. This change:

- Prefers llvm-ar from PATH when available
- Removes wasm skip in normalizeToArchive (no longer needed)
- Shows actual archiver command in verbose output

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@cpunion cpunion force-pushed the fix/cache-archives-v2 branch from 2b4adb3 to ab4888d Compare January 2, 2026 15:45
cpunion and others added 10 commits January 3, 2026 00:08
Refactor aPackage to have clearer data flow:
- Rename LLFiles to ObjFiles (stores .o files from compiler)
- Add ArchiveFile field (stores .a file path for linking)

Data flow is now single-direction:
  buildPkg: generate .o → ObjFiles
  normalizeToArchive: ObjFiles → ArchiveFile
  link: use ArchiveFile

This makes the code easier to understand and removes the ambiguity
where LLFiles could contain .o, .ll, or .a files depending on context.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

When -gen-llfiles is enabled, copy the .ll file for debugging instead
of using it directly for linking. This ensures the linker always
receives .o files, avoiding issues where .ll files would be incorrectly
packed into .a archives.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Replace duplicate copyFile with existing copyFileAtomic
- Remove unused io import
- Add build-cache.yml CI workflow to test:
  - Native and WASM builds
  - With and without -gen-llfiles
  - With and without -a (force rebuild)
  - Each configuration runs twice to verify cache behavior

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

When -gen-llfiles is enabled, clFile and compileExtraFiles were
outputting .ll files that got packed into archives. This caused
wasm-ld to fail because it can't handle .ll files in archives.

Fix by following the same pattern as exportObject:
- Always compile to .o for linking
- When GenLL=true, also emit .ll for debugging separately

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Add CACHE HIT/MISS verbose output to build.go
- Create test/buildcache with 6 snapshot-based test scenarios:
  1. First build (all CACHE MISS)
  2. Second build (deps CACHE HIT)
  3. Force rebuild with -a (all CACHE MISS)
  4. Dependency change invalidates cache
  5. Partial cache clear (dep2 only)
  6. Partial cache clear (dep2 and dep3)
- Tests run for both native and WASM builds (12 tests total)
- Add dev/build_iwasm.sh to build iwasm with CI-compatible options
- test.sh auto-builds iwasm if not found in llgo cache
- All build outputs go to temp directory to keep source clean
- Add explicit error checking for all build and run commands
- Update CI workflow and local_ci.sh to use test.sh

Closes goplus#1526

Prevents Go compiler from trying to build llgo-only test code
that uses C interop via github.com/goplus/lib/c

- Renamed objFiles to linkInputs to clarify it contains both .a archives and .o objects
- Removed unreachable else branches that would append ObjFiles (all packages now use ArchiveFile)
- Changed panic to proper error handling using closure variable in packages.Visit
- Added error check to ensure all packages have ArchiveFile before linking
- Main module (.o) and extra files are directly added to linkInputs

This makes the linking logic clearer: all packages use .a archives,
only the generated main module uses .o files directly.

…sent

- Removed else branch that would append ObjFiles
- Packages without ArchiveFile (e.g., runtime when not needed) are simply skipped
- This is correct behavior: if a package has no compiled code, it has nothing to link

The simple if-check is sufficient because:
1. Packages with code will have ArchiveFile (from cache or normalizeToArchive)
2. Packages without code (e.g., unused runtime) legitimately have no ArchiveFile
3. No need for error checking - absence of ArchiveFile is valid

No longer needed since we don't validate missing ArchiveFile

- Added ESP32-C3 tests using same run_test_suite (6 test scenarios)
- Updated run_test_suite to support native/wasm/esp32c3 modes
- Added esptool.py installation for Linux CI
- Total tests: 18 (6 native + 6 wasm + 6 esp32c3)
@xushiwei xushiwei merged commit bba9538 into goplus:main Jan 3, 2026
44 of 45 checks passed
cpunion added a commit to cpunion/llgo that referenced this pull request Jan 22, 2026
Development

Successfully merging this pull request may close these issues.

Link step should always consume .a archives

2 participants