feat: enable target-triple build cache #1425

Merged
xushiwei merged 6 commits into goplus:main from cpunion:feature/build-cache8
Nov 30, 2025

Conversation

@cpunion (Collaborator) commented Nov 29, 2025

Fixes #1361

Summary

  • add a manifest builder that deterministically serializes env/common/pkg/dependency inputs (including overlays and rewrite vars) into a single fingerprint hash
  • fingerprint source inputs via file metadata (size + mtime) and overlay hashes so repeat builds avoid re-reading entire files while still invalidating cache entries when content changes
  • introduce a cache manager that stores .a archives plus manifests under $LLGO_CACHE/build/<target-triple>/<pkg>/<fingerprint>.{a,manifest} and wire it into buildPkg, linkMainPkg, and archive creation so cached packages skip LLVM lowering entirely
  • gate the cache behind LLGO_BUILD_CACHE with graceful fallbacks, make runtime linking conditional on actual NeedRt/NeedPyInit requirements, and ensure manifests capture metadata like linker args for cache hits
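
For readers new to the design, here is a minimal sketch of the idea behind the first and third bullets: hash inputs in a stable order, then derive the cache paths from the result. The helper names `fingerprint` and `cachePaths` are hypothetical, the flat key/value encoding is a simplification of the YAML manifest, and both the use of SHA-256 and the mapping of a package path to a directory are assumptions rather than what this PR necessarily does.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path/filepath"
	"sort"
	"strings"
)

// fingerprint hashes key/value inputs in sorted key order, so the result does
// not depend on map iteration order. Hypothetical helper: the PR's manifest
// builder serializes structured sections rather than flat pairs.
func fingerprint(inputs map[string]string) string {
	keys := make([]string, 0, len(inputs))
	for k := range inputs {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	for _, k := range keys {
		// %q quotes keys and values so separators inside them cannot collide.
		fmt.Fprintf(&b, "%q: %q\n", k, inputs[k])
	}
	sum := sha256.Sum256([]byte(b.String()))
	return hex.EncodeToString(sum[:])
}

// cachePaths returns the archive and manifest paths for one cache entry,
// following <root>/build/<target-triple>/<pkg>/<fingerprint>.{a,manifest}.
// Turning the package path into nested directories is an assumption.
func cachePaths(root, triple, pkg, fp string) (archive, manifest string) {
	base := filepath.Join(root, "build", triple, filepath.FromSlash(pkg), fp)
	return base + ".a", base + ".manifest"
}

func main() {
	fp := fingerprint(map[string]string{"GOOS": "darwin", "GOARCH": "arm64"})
	archive, manifest := cachePaths("/tmp/llgo-cache", "arm64-apple-macosx", "example.com/demo", fp)
	fmt.Println(archive)
	fmt.Println(manifest)
}
```

Sorting the keys before hashing is what makes the fingerprint reproducible across runs; any input that changes produces a different file name, so stale entries are simply never looked up.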

Manifest Format

The cache manifest moved from the legacy INI layout to a YAML document so sections can embed structured data (lists/maps) deterministically. Files now record size/mtime (and overlay_hash for in-memory overlays). Example output:

env:
    GOOS: darwin
    GOARCH: arm64
    GO_VERSION: go1.24.9
    LLGO_VERSION: (devel)
    LLVM_VERSION: Homebrew clang version 19.1.7
common:
    ABI_MODE: "2"
    CC: clang++
    CCFLAGS:
        - -Qunused-arguments
        - -Wno-unused-command-line-argument
        - --sysroot=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX26.1.sdk
    LDFLAGS:
        - -target
        - arm64-apple-macosx
        - -Qunused-arguments
        - -Wno-unused-command-line-argument
        - -Wl,--error-limit=0
        - -fuse-ld=lld
        - --sysroot=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX26.1.sdk
        - -Xlinker
        - -dead_strip
package:
    pkg_path: github.com/goplus/llgo/test
    pkg_id: github.com/goplus/llgo/test [github.com/goplus/llgo/test.test]
    go_files:
        - path: /Users/lijie/source/goplus/llgo-build-cache6/test/c_test.go
          size: 1138
          mtime: 1732741704000000000
        - path: /Users/lijie/source/goplus/llgo-build-cache6/test/defer_test.go
          size: 2480
          mtime: 1732741704000000000
metadata:
    need_rt: true
deps:
    - id: github.com/goplus/lib/c
      version: v0.3.1
    - id: reflect
      fingerprint: 58f3a7f23c125c55512234d4ddef25285263f9960279c98f9c8973e02e9b8fd2
    - id: strconv
      fingerprint: cc240aadbf22bbfc2be889ebcb28d779d2b7774ca55b1feb220836e915ba50cd
    - id: testing
      fingerprint: 1e487d8ae59998c8007596a214358d99b47f9375a179223f7476e242065eb6b5
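
A rough sketch of how a `go_files` entry like the ones above could be produced: overlay content is hashed in memory, while on-disk files contribute only `os.Stat` metadata. The `fileEntry` type and `describeFile` function are hypothetical names, and recording a size for overlay entries and using SHA-256 for `overlay_hash` are assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
)

// fileEntry mirrors the fields shown under go_files in the example manifest.
type fileEntry struct {
	Path        string
	Size        int64
	MTime       int64  // nanoseconds since the Unix epoch
	OverlayHash string // set only for in-memory overlay content
}

// describeFile builds an entry from overlay content when present, otherwise
// from file metadata, so on-disk files are never read in full.
func describeFile(path string, overlay map[string][]byte) (fileEntry, error) {
	if content, ok := overlay[path]; ok {
		sum := sha256.Sum256(content)
		return fileEntry{
			Path:        path,
			Size:        int64(len(content)),
			OverlayHash: hex.EncodeToString(sum[:]),
		}, nil
	}
	info, err := os.Stat(path)
	if err != nil {
		return fileEntry{}, fmt.Errorf("stat %q: %w", path, err)
	}
	return fileEntry{Path: path, Size: info.Size(), MTime: info.ModTime().UnixNano()}, nil
}

func main() {
	overlay := map[string][]byte{"mem.go": []byte("package demo\n")}
	// os.Args[0] is used only as a file that is likely to exist on disk.
	for _, p := range []string{"mem.go", os.Args[0]} {
		e, err := describeFile(p, overlay)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%+v\n", e)
	}
}
```

The trade-off of metadata-based entries is speed over strictness: an edit that preserves both size and mtime would not be detected, which is the usual caveat for this kind of check.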

Testing

  • go test ./internal/build/...

@gemini-code-assist

Summary of Changes

Hello @cpunion, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a comprehensive build caching mechanism to 'llgo', significantly improving build performance by reusing previously compiled packages. It achieves this through deterministic fingerprinting of all build inputs, a robust cache management system for storing compiled artifacts, and intelligent conditional linking of runtime components. The manifest format has been upgraded to YAML for better structure and extensibility, and the entire system is designed for reliability with atomic file operations and graceful fallbacks.

Highlights

  • Build Cache Implementation: A new build cache is introduced, enabling reuse of compiled packages across different builds and targets, significantly improving build performance.
  • Deterministic Fingerprinting: A manifest builder is added to deterministically serialize all build inputs (environment, common configurations, package files, dependencies, rewrite variables) into a unique fingerprint hash.
  • Cache Manager: A cache manager stores compiled '.a' archives and their corresponding manifests under '$LLGO_CACHE/build/<target-triple>/<pkg>/<fingerprint>.{a,manifest}'.
  • Conditional LLVM Lowering: Cached packages can now skip the LLVM lowering phase entirely, leading to faster compilation times for unchanged dependencies.
  • Runtime Linking Optimization: Runtime and Python initialization linking is now conditional on actual 'NeedRt'/'NeedPyInit' requirements, avoiding unnecessary linking of runtime components.
  • YAML Manifest Format: The cache manifest format has been updated from INI to YAML to support structured data, improving readability and extensibility.
  • Atomic Archive Creation: The process of creating static archives has been made atomic to prevent race conditions during concurrent builds.
  • Cache Gating: The cache functionality is controlled by the 'LLGO_BUILD_CACHE' environment variable, with graceful fallbacks if disabled.
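
The "Atomic Archive Creation" highlight refers to a common pattern that can be sketched as follows: write to a temporary file in the destination directory, then rename it into place. The names `writeFileAtomic` and `roundTrip` are hypothetical and this is not the PR's actual code, only an illustration of the technique.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic writes data to a temporary file next to dest and renames it
// into place, so concurrent readers never observe a partially written file.
func writeFileAtomic(dest string, data []byte) error {
	dir := filepath.Dir(dest)
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return err
	}
	tmp, err := os.CreateTemp(dir, filepath.Base(dest)+".tmp-*")
	if err != nil {
		return err
	}
	// Cleans up on failure; after a successful rename the temp name is gone
	// and the returned error is ignored.
	defer os.Remove(tmp.Name())

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), dest)
}

// roundTrip writes twice to the same path in a fresh temp dir and reads it back.
func roundTrip() (string, error) {
	dir, err := os.MkdirTemp("", "cache-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	dest := filepath.Join(dir, "pkg", "abc.a")
	for _, payload := range []string{"first", "second"} {
		if err := writeFileAtomic(dest, []byte(payload)); err != nil {
			return "", err
		}
	}
	data, err := os.ReadFile(dest)
	return string(data), err
}

func main() {
	got, err := roundTrip()
	fmt.Println(got, err)
}
```

Creating the temporary file in the same directory as the destination keeps the rename on one filesystem, which is what allows it to replace the target in a single step.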

@gemini-code-assist bot left a comment

Code Review

This pull request introduces a comprehensive build caching mechanism to improve compilation speed. It works by fingerprinting package inputs (source files, dependencies, environment, build flags) and storing compiled archives (.a files) along with a manifest. The implementation is robust, using atomic file operations to prevent race conditions and a deterministic YAML-based manifest format. The build logic has been significantly refactored to integrate caching, including deferring the build of runtime packages until they are confirmed to be needed. My review found the overall design to be solid. I've provided a few comments on potential issues in fallback paths that could lead to cache inconsistencies and suggestions for minor code simplification.

Comment on lines +225 to +235
	temp := &aPackage{Package: dep}
	if c.pkgByID == nil {
		c.pkgByID = make(map[string]Package)
	}
	c.pkgByID[dep.ID] = temp
	if err := c.collectFingerprint(temp); err != nil {
		return entry, fmt.Errorf("collect fingerprint for %s: %w", dep.ID, err)
	}
	entry.Fingerprint = temp.Fingerprint
	return entry, nil
}

critical

This fallback logic for a dependency not found in c.pkgByID creates a minimal temporary aPackage instance just for fingerprinting. However, this temporary instance is also stored in c.pkgByID. If this package is later retrieved during the linking phase, it will be missing crucial information like compiled object files (LLFiles) and linker arguments (LinkArgs), as it never goes through the full buildPkg process. This will likely cause an incomplete or failed link. All packages should be fully discovered and initialized during the buildSSAPkgs phase to avoid this scenario.
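
One possible mitigation, sketched with stand-in types: fingerprint the unknown dependency through a throwaway value and leave the shared map untouched. The `pkg` and `buildCtx` types and the `depFingerprint` method are hypothetical simplifications of `aPackage` and the build context, and hashing the ID stands in for real input collection. This is not the fix the review recommends (full discovery during buildSSAPkgs), only an illustration of keeping placeholders out of `pkgByID`.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// pkg is a hypothetical stand-in for the PR's aPackage.
type pkg struct {
	ID          string
	Fingerprint string
}

// buildCtx is a hypothetical stand-in for the build context.
type buildCtx struct {
	pkgByID map[string]*pkg
}

// depFingerprint returns the fingerprint for a dependency. For an unknown ID
// it fingerprints a throwaway value and does NOT register it, so later phases
// cannot mistake the placeholder for a fully built package.
func (c *buildCtx) depFingerprint(id string) (string, error) {
	if id == "" {
		return "", errors.New("empty dependency id")
	}
	if p, ok := c.pkgByID[id]; ok {
		return p.Fingerprint, nil
	}
	temp := &pkg{ID: id}
	sum := sha256.Sum256([]byte(temp.ID)) // placeholder for real input collection
	temp.Fingerprint = hex.EncodeToString(sum[:])
	return temp.Fingerprint, nil
}

func main() {
	c := &buildCtx{pkgByID: map[string]*pkg{}}
	fp, err := c.depFingerprint("strconv")
	fmt.Println(fp, err)
	fmt.Println("registered packages:", len(c.pkgByID))
}
```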

Comment on lines +462 to +471
if manifestContent == "" {
	// Fallback: rebuild if missing (should not happen in normal flow).
	m := newManifestBuilder()
	c.collectEnvInputs(m)
	c.collectCommonInputs(m)
	if err := c.collectPackageInputs(m, pkg); err != nil {
		return err
	}
	manifestContent = m.Build()
}

high

The fallback logic to rebuild the manifest if pkg.Manifest is empty is incomplete. It does not include dependency information, which is part of the original manifest generation in collectFingerprint. If this fallback is ever triggered, it will write a manifest to the cache that does not match the package's fingerprint, as the fingerprint was calculated based on the full manifest including dependencies. This could lead to cache corruption. The manifest should be rebuilt completely, including dependencies, or this fallback should be removed if it's truly unreachable.
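
To make the mismatch concrete, here is a small sketch of a guard that refuses to store a manifest whose hash differs from the fingerprint used as the cache key. It assumes the fingerprint is the hex-encoded SHA-256 of the manifest content, which the PR description implies but does not state; `hashManifest` and `checkManifest` are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashManifest returns the hex-encoded SHA-256 of the manifest text.
func hashManifest(content string) string {
	sum := sha256.Sum256([]byte(content))
	return hex.EncodeToString(sum[:])
}

// checkManifest rejects a manifest whose hash differs from the fingerprint
// that names the cache entry.
func checkManifest(content, fingerprint string) error {
	if got := hashManifest(content); got != fingerprint {
		return fmt.Errorf("manifest hash %s does not match fingerprint %s", got, fingerprint)
	}
	return nil
}

func main() {
	full := "package:\n    pkg_path: demo\ndeps:\n    - id: strconv\n"
	partial := "package:\n    pkg_path: demo\n" // rebuilt without the deps section

	fp := hashManifest(full)
	fmt.Println("full manifest:", checkManifest(full, fp))
	fmt.Println("partial manifest:", checkManifest(partial, fp))
}
```

A check of this kind would turn the silent inconsistency described above into an explicit error at write time.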

Comment on lines +259 to +281
func digestFiles(paths []string) (string, []fileDigest, error) {
	if len(paths) == 0 {
		return "", nil, nil
	}

	digests := make([]fileDigest, 0, len(paths))
	for _, path := range paths {
		hash, err := digestFile(path)
		if err != nil {
			return "", nil, fmt.Errorf("digest file %q: %w", path, err)
		}
		digests = append(digests, fileDigest{Path: path, SHA256: hash})
	}

	sort.Slice(digests, func(i, j int) bool { return digests[i].Path < digests[j].Path })

	var parts []string
	for _, d := range digests {
		parts = append(parts, fmt.Sprintf("%s]sha256:%s", d.Path, d.SHA256))
	}

	return strings.Join(parts, ","), digests, nil
}

medium

The digestFiles function returns a serialized string digest and a slice of fileDigest. However, all call sites in internal/build/collect.go discard the string return value. This legacy string format seems to be unused and could be removed to simplify the function.

func digestFiles(paths []string) ([]fileDigest, error) {
	if len(paths) == 0 {
		return nil, nil
	}

	digests := make([]fileDigest, 0, len(paths))
	for _, path := range paths {
		hash, err := digestFile(path)
		if err != nil {
			return nil, fmt.Errorf("digest file %q: %w", path, err)
		}
		digests = append(digests, fileDigest{Path: path, SHA256: hash})
	}

	sort.Slice(digests, func(i, j int) bool { return digests[i].Path < digests[j].Path })

	return digests, nil
}

Comment on lines +283 to +312
// digestFilesWithOverlay calculates digests for files, using overlay content when available.
func digestFilesWithOverlay(paths []string, overlay map[string][]byte) (string, []fileDigest, error) {
	if len(paths) == 0 {
		return "", nil, nil
	}

	digests := make([]fileDigest, 0, len(paths))
	for _, path := range paths {
		var hash string
		if content, ok := overlay[path]; ok {
			hash = digestBytes(content)
		} else {
			var err error
			hash, err = digestFile(path)
			if err != nil {
				return "", nil, fmt.Errorf("digest file %q: %w", path, err)
			}
		}
		digests = append(digests, fileDigest{Path: path, SHA256: hash})
	}

	sort.Slice(digests, func(i, j int) bool { return digests[i].Path < digests[j].Path })

	var parts []string
	for _, d := range digests {
		parts = append(parts, fmt.Sprintf("%s]sha256:%s", d.Path, d.SHA256))
	}

	return strings.Join(parts, ","), digests, nil
}

medium

Similar to digestFiles, the digestFilesWithOverlay function returns a serialized string that is not used by its callers. This can be removed to simplify the code.

func digestFilesWithOverlay(paths []string, overlay map[string][]byte) ([]fileDigest, error) {
	if len(paths) == 0 {
		return nil, nil
	}

	digests := make([]fileDigest, 0, len(paths))
	for _, path := range paths {
		var hash string
		if content, ok := overlay[path]; ok {
			hash = digestBytes(content)
		} else {
			var err error
			hash, err = digestFile(path)
			if err != nil {
				return nil, fmt.Errorf("digest file %q: %w", path, err)
			}
		}
		digests = append(digests, fileDigest{Path: path, SHA256: hash})
	}

	sort.Slice(digests, func(i, j int) bool { return digests[i].Path < digests[j].Path })

	return digests, nil
}
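
The snippets above call `digestFile` and `digestBytes` without showing them. A minimal sketch of what such helpers might look like, assuming hex-encoded SHA-256 (suggested by the `SHA256` field name but not confirmed by the quoted diff); `digestTemp` is a hypothetical helper added only to exercise them.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

// digestBytes returns the hex-encoded SHA-256 of in-memory content.
func digestBytes(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// digestFile streams a file through SHA-256 without loading it fully into memory.
func digestFile(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// digestTemp writes content to a temporary file and digests it from disk.
func digestTemp(content []byte) (string, error) {
	f, err := os.CreateTemp("", "digest-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())

	if _, err := f.Write(content); err != nil {
		f.Close()
		return "", err
	}
	if err := f.Close(); err != nil {
		return "", err
	}
	return digestFile(f.Name())
}

func main() {
	content := []byte("package demo\n")
	fromDisk, err := digestTemp(content)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// An overlay entry and an on-disk file with equal content share a digest.
	fmt.Println(fromDisk == digestBytes(content))
}
```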

@codecov bot commented Nov 29, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.63%. Comparing base (425ac3a) to head (028925d).
⚠️ Report is 7 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1425      +/-   ##
==========================================
+ Coverage   90.59%   90.63%   +0.03%     
==========================================
  Files          43       43              
  Lines       11400    11400              
==========================================
+ Hits        10328    10332       +4     
+ Misses        911      907       -4     
  Partials      161      161              

@cpunion (Collaborator, Author) commented Nov 29, 2025

/review

@xushiwei xushiwei merged commit 883492c into goplus:main Nov 30, 2025
42 checks passed
Development

Successfully merging this pull request may close these issues.

proposal: add packge build cache

2 participants