Added non-conflicting hash for install files #1454

Merged
shs96c merged 7 commits into bazel-contrib:master from MarconZet:master on Jan 21, 2026

Conversation

@MarconZet
Contributor

@MarconZet MarconZet commented Sep 29, 2025

Summary

This commit introduces lock file version 3 with per-artifact hashing instead of a single global hash.

This per-artifact hashing approach can reduce the number of merge conflicts when multiple people update the canonical version in a large monorepo.

The code still supports reading v2 lock files - it checks for v3 first, then falls back to v2, then v1. Users with older lock files will see a message to repin.

Key Changes

  1. Lock File Format Change (v2 → v3)
  • Before (v2): __INPUT_ARTIFACTS_HASH and __RESOLVED_ARTIFACTS_HASH were single integer values
  • After (v3): Both are now dictionaries mapping each artifact coordinate to its individual hash

Example in maven_install.json:

// Old format
"__INPUT_ARTIFACTS_HASH": 1994476565,
"__RESOLVED_ARTIFACTS_HASH": -274973469,

// New format
"__INPUT_ARTIFACTS_HASH": {
  "com.google.guava:guava": 733518530,
  "junit:junit": -652553691,
  "..."
},
"__RESOLVED_ARTIFACTS_HASH": {
  "com.google.guava:guava": -1587873388,
  "..."
}

  2. Hash Computation Changes (private/rules/v3_lock_file.bzl:53-108)

The new _compute_lock_file_hash_v3 function computes individual hashes per artifact that include:

  • The artifact's own info (coordinates, SHA sums)
  • The repository it came from
  • Hashes of all transitive dependencies (dependency-aware hashing)

  3. Input Hash Changes (private/rules/coursier.bzl:334-386)

compute_dependency_inputs_signature now returns a dictionary of per-artifact hashes plus backward-compatible v1/v2 signatures.
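
The dependency-aware hashing described above can be illustrated with a small Python sketch. This is not the code from v3_lock_file.bzl: the `stable_hash` helper, the `per_artifact_hashes` function, and the input layout are all hypothetical, and the sketch assumes an acyclic dependency graph.

```python
import hashlib
import json


def stable_hash(value):
    """Deterministic signed 32-bit hash of a JSON-serialisable value.

    Hypothetical stand-in for whatever hash function the real rules use.
    """
    data = json.dumps(value, sort_keys=True).encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big", signed=True)


def per_artifact_hashes(artifacts, dependencies):
    """Return {coordinate: hash}, where each hash also covers transitive deps.

    artifacts:    {coordinate: {"shasums": {...}, "repository": str}}  (assumed layout)
    dependencies: {coordinate: [direct dependency coordinates]}
    Assumes the dependency graph has no cycles.
    """
    cache = {}

    def visit(coord):
        if coord not in cache:
            # Hash the dependencies first so their hashes feed into this one.
            dep_hashes = [visit(dep) for dep in sorted(dependencies.get(coord, []))]
            cache[coord] = stable_hash([coord, artifacts[coord], dep_hashes])
        return cache[coord]

    return {coord: visit(coord) for coord in sorted(artifacts)}
```

With this shape, changing the SHA of a leaf artifact changes its own hash and the hash of everything that depends on it, while unrelated artifacts keep their entries. That is the property that keeps unrelated updates from touching the same lines of the lock file.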

Comment thread tests/custom_maven_install/regression_testing_gradle_install.json
@shs96c
Collaborator

shs96c commented Oct 7, 2025

This is looking really good. I like the idea of only having conflicts if the transitive deps have changed.

Comment thread private/rules/v3_lock_file.bzl Outdated
for (String key : keys) {
toHash.put(key, rendered.get(key));
@SuppressWarnings("unchecked")
private static Map<String, Integer> calculateArtifactHash(Map<String, Object> rendered) {

Contributor

@shs96c question for a potential breaking change in the next major update.

It seems like this code and the code in v3_lock_file.bzl are similar. IIRC, the Starlark implementation exists for the case where the user doesn't have a lock file.
If that is the case, is there a possibility to consolidate around the Java code (which is easiest to test, tbh) by forcing lock file usage?

Collaborator

We'd likely want to consolidate on the Starlark version of the code, since that's the one that people use when they verify the signatures.

Contributor Author

@MarconZet MarconZet Jan 14, 2026

I agree that it would be the best solution, but it's not simple to implement.

Starlark code runs in the analysis phase, while this Java code runs in the execution phase, so it can't be done without a minor rewrite of the flow.

@shs96c
Collaborator

shs96c commented Nov 4, 2025

@MarconZet, I'm waiting until you move this out of draft before reviewing. Please LMK when you're ready!

@MarconZet MarconZet closed this Nov 19, 2025
@MarconZet MarconZet reopened this Nov 19, 2025
@MarconZet MarconZet marked this pull request as ready for review November 19, 2025 13:17
@MarconZet
Contributor Author

@shs96c any progress on the review?

@thomasbao12

Could we add a description to the PR like:

Summary

This commit introduces lock file version 3 with per-artifact hashing instead of a single global hash. The main purpose is to create "non-conflicting" hashes that allow more granular change
detection in the maven dependency lock files.

Key Changes

  1. Lock File Format Change (v2 → v3)
  • Before (v2): __INPUT_ARTIFACTS_HASH and __RESOLVED_ARTIFACTS_HASH were single integer values
  • After (v3): Both are now dictionaries mapping each artifact coordinate to its individual hash

Example in maven_install.json:
// Old format
"__INPUT_ARTIFACTS_HASH": 1994476565,
"__RESOLVED_ARTIFACTS_HASH": -274973469,

// New format
"__INPUT_ARTIFACTS_HASH": {
"com.google.guava:guava": 733518530,
"junit:junit": -652553691,
...
},
"__RESOLVED_ARTIFACTS_HASH": {
"com.google.guava:guava": -1587873388,
...
}

  2. File Renames
  • v2_lock_file.bzl → v3_lock_file.bzl
  • V2LockFile.java → V3LockFile.java
  • V2LockFileTest.java → V3LockFileTest.java

  3. Hash Computation Changes (private/rules/v3_lock_file.bzl:53-108)

The new _compute_lock_file_hash_v3 function computes individual hashes per artifact that include:

  • The artifact's own info (coordinates, SHA sums)
  • The repository it came from
  • Hashes of all transitive dependencies (dependency-aware hashing)

  4. Input Hash Changes (private/rules/coursier.bzl:334-386)

compute_dependency_inputs_signature now returns a dictionary of per-artifact hashes plus backward-compatible v1/v2 signatures.

  5. Command-line Interface Change (pin_dependencies.bzl)

Changed from --input_hash (single value) to --input-hash-path (path to a JSON file containing the hash dictionary).

  6. Backward Compatibility

The code still supports reading v2 lock files - it checks for v3 first, then falls back to v2, then v1. Users with older lock files will see a message to repin.
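
As a rough illustration of the "newest format first" fallback, here is a Python sketch. The function name and the predicates are assumptions made for this example: the v3/v2 checks rely only on the dictionary-versus-integer difference described above, and the v1 check (a `dependency_tree` key) is a guess about the oldest format, not something stated in this PR.

```python
def detect_lock_file_version(lock_file):
    """Return 3, 2 or 1 for a parsed lock file, or None if nothing matches.

    Checks run newest-first, mirroring the v3 -> v2 -> v1 order.
    The predicates are illustrative assumptions, not the project's checks.
    """
    checks = [
        # v3: hashes are dictionaries keyed by artifact coordinate.
        (3, lambda lf: isinstance(lf.get("__INPUT_ARTIFACTS_HASH"), dict)),
        # v2: hashes are single integer values.
        (2, lambda lf: isinstance(lf.get("__INPUT_ARTIFACTS_HASH"), int)),
        # v1: assumed here to be recognisable by a "dependency_tree" key.
        (1, lambda lf: "dependency_tree" in lf),
    ]
    for version, matches in checks:
        if matches(lock_file):
            return version
    return None


def needs_repin(lock_file):
    """Older (v1/v2) lock files are still readable but should be repinned."""
    return detect_lock_file_version(lock_file) in (1, 2)
```

Ordering the checks from newest to oldest means a file is always read with the most specific format that matches it.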

Purpose

This per-artifact hashing approach allows the system to detect exactly which artifacts changed, rather than just knowing "something changed." This is useful for incremental updates and
more precise cache invalidation.
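
To make "detect exactly which artifacts changed" concrete, the following Python sketch compares two per-artifact hash dictionaries. The `changed_artifacts` helper is hypothetical and only illustrates what the v3 format makes possible; it is not part of the PR.

```python
def changed_artifacts(old_hashes, new_hashes):
    """Compare two {coordinate: hash} dictionaries and report the differences."""
    old_keys, new_keys = set(old_hashes), set(new_hashes)
    return {
        "added": sorted(new_keys - old_keys),
        "removed": sorted(old_keys - new_keys),
        "modified": sorted(
            key for key in old_keys & new_keys if old_hashes[key] != new_hashes[key]
        ),
    }
```

With a single global hash, the same comparison can only answer "equal or not equal".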

@honnix
Contributor

honnix commented Dec 18, 2025

We tried this patch and so far it has been working well. There is one issue, though. When the signature does not match,

"%s_install.json contains an invalid signature (expected %s and got %s) and may be corrupted. " % (
prints a single huge line containing the artifact SHAs for every artifact. In our case it produces ~2GB of logs.
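
One way to keep such a message short is to report only the entries that differ and cap how many are shown. The Python sketch below is an illustration of that idea; `describe_mismatch` and its `limit` parameter are hypothetical, and this thread does not show how the message was actually changed.

```python
def describe_mismatch(expected, actual, limit=10):
    """Summarise differences between two {coordinate: hash} dictionaries.

    Only differing entries are listed, and at most `limit` of them.
    """
    differing = sorted(
        key for key in set(expected) | set(actual)
        if expected.get(key) != actual.get(key)
    )
    lines = ["%d artifact hash(es) differ:" % len(differing)]
    for key in differing[:limit]:
        lines.append(
            "  %s: expected %s, got %s" % (key, expected.get(key), actual.get(key))
        )
    if len(differing) > limit:
        lines.append("  ... and %d more" % (len(differing) - limit))
    return "\n".join(lines)
```

The output size then depends on the number of mismatches shown, not on the total number of artifacts in the lock file.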

@MarconZet
Contributor Author

@honnix I changed the code. It should print errors better now.

@honnix
Contributor

honnix commented Jan 8, 2026

> @honnix I changed the code. It should print errors better now.

Nice! Thank you. We will take the new patch and try it out.

@vinnybod
Contributor

FWIW, we've been using this at Confluent for a month now and it has been working well.

Comment thread private/rules/v3_lock_file.bzl Outdated

def _add_to_hash_dictionary(dictionary, artifact, salt):
    artifact_dict = json.decode(artifact)
    key = artifact_dict["group"] + ":" + artifact_dict["artifact"]

Collaborator

You should use the same logic that's in Coordinates.asKey() to get a stable key that includes things like the classifier. That's already in coordinates.bzl as to_key

Contributor Author

@MarconZet MarconZet Jan 14, 2026

Hmm, this key maps directly to what appears in the lock file.

My idea was that I don't want things like classifier and packaging in the key, for two reasons:

  1. It looked ugly in the lock file: when I tried it in my repo, __INPUT_ARTIFACTS_HASH doubled in size because of the :sources entries. They did not provide any information and were just noise.
  2. I think that at this level, we don't want to allow a non-conflicting merge.

if boms and len(boms):
    for bom in sorted(boms):
        artifact_inputs.append(_stable_artifact(bom))
        _add_to_hash_dictionary(all_hashes, bom, "bom")

Collaborator

Why is the salt needed for artifacts and boms? They should have unique coordinates no matter what.

Contributor Author

I did this with artifact and excluded_artifact in mind.

In the v2 hash, excluding an artifact would change the hash because the hash order changed. In the group:artifact: HASH notation, excluding an artifact would not change the hash, so we need a salt.

The rest is about code design: adding salt everywhere is easier than adding salt only to excluded_artifact.
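
The role of the salt can be shown with a short Python sketch. It mirrors the shape of `_add_to_hash_dictionary` quoted above, but the hashing itself (`salted_hash`) and the salt strings `"artifact"` and `"excluded_artifact"` are assumptions for illustration; only `"bom"` appears in the reviewed code.

```python
import hashlib
import json


def salted_hash(payload, salt):
    """Signed 32-bit hash of a payload combined with a salt (hypothetical helper)."""
    data = json.dumps([salt, payload], sort_keys=True).encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big", signed=True)


def add_to_hash_dictionary(dictionary, artifact_json, salt):
    """Store the salted hash of an artifact under its group:artifact key."""
    artifact = json.loads(artifact_json)
    key = artifact["group"] + ":" + artifact["artifact"]
    dictionary[key] = salted_hash(artifact, salt)


# The same coordinate, once as a regular artifact and once as an exclusion.
coordinate = json.dumps({"group": "com.example", "artifact": "lib"})
included, excluded = {}, {}
add_to_hash_dictionary(included, coordinate, "artifact")
add_to_hash_dictionary(excluded, coordinate, "excluded_artifact")
```

Both dictionaries end up with the key `com.example:lib`, but the stored values differ, so moving a coordinate from the artifact list to the exclusion list still changes the recorded hash.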

@shs96c shs96c mentioned this pull request Jan 16, 2026
4 tasks
@shs96c
Collaborator

shs96c commented Jan 19, 2026

Let me handle the rebase, and I'll merge this when I've done so.

@shs96c
Collaborator

shs96c commented Jan 20, 2026

Ah! I can't do the rebase. Could you please handle that?

@shs96c
Collaborator

shs96c commented Jan 21, 2026

The test failures look related to this change. The V3LockFileTest is failing.

@MarconZet
Contributor Author

@shs96c I forgot to add some files. It should be OK now.

@shs96c shs96c enabled auto-merge (squash) January 21, 2026 21:04
Collaborator

@shs96c shs96c left a comment

Let's go! LGTM

@shs96c shs96c merged commit 50b2011 into bazel-contrib:master Jan 21, 2026
6 checks passed
shs96c added a commit to JonathanPerry651/rules_jvm_external that referenced this pull request Feb 3, 2026
* master: (25 commits)
  fix: use forward slash separator in Maven purl format (bazel-contrib#1530)
  Load rules from specific bzl files and add sh_test imports (bazel-contrib#1529)
  Added non-conflicting hash for install files (bazel-contrib#1454)
  Update the maven and coursier resolver tests to create a class index file. (bazel-contrib#1519)
  [ci] Drop Bazel 6 and ensure we run on Bazel 7 and 8 (bazel-contrib#1525)
  Only allow modules specified in known_contributing_modules to contribute artifacts or boms to the root module (bazel-contrib#1523)
  [gradle] Fix false resolution failures when BOM upgrades dependency version (bazel-contrib#1520)
  [gradle] Fix Gradle resolver to respect force_version and include runtime dependencies (bazel-contrib#1516)
  Correctly merge BOMs from non-root modules (bazel-contrib#1518)
  Update more lock files
  Filter test_only artifacts out of artifacts merged into root repos and print a warning when a root artifact version is overridden by a non_root bazel_dep (bazel-contrib#1511)
  Fix SHA mismatch for conflicting dependency versions (bazel-contrib#1513)
  [gradle] Plumb through the force_version attribute (bazel-contrib#1515)
  [gradle] Add dep exclusions to only that dep (bazel-contrib#1514)
  [gradle] Handle aggregating dependencies and relocation version conflicts (bazel-contrib#1512)
  BOM Fixes (bazel-contrib#1506)
  Allow an optional index of dep -> class to be created (bazel-contrib#1492)
  Put files in `ResolutionResult` (bazel-contrib#1484)
  Optimize dependency graph building with O(1) lookups (bazel-contrib#1483)
  Provide a mechanism to list all resolved direct deps for a workspace (bazel-contrib#1510)
  ...
shs96c added a commit to shs96c/rules_jvm_external that referenced this pull request Feb 3, 2026
* master:
  Add presubmit check for prebuilt jars (bazel-contrib#1486)
  Upload artifacts in parallel (address artifactorys "Maven Snapshot Version Behaviour") (bazel-contrib#1524)
  feat: Support COURSIER_SHA256 environment variable (bazel-contrib#1527)
  fix: Do not add coursier opts when run other tools (bazel-contrib#1531)
  fix: add string attributes to `amend_artifact` for explicit unset state (bazel-contrib#1499)
  fix: use forward slash separator in Maven purl format (bazel-contrib#1530)
  Load rules from specific bzl files and add sh_test imports (bazel-contrib#1529)
  Added non-conflicting hash for install files (bazel-contrib#1454)
  Update the maven and coursier resolver tests to create a class index file. (bazel-contrib#1519)
  [ci] Drop Bazel 6 and ensure we run on Bazel 7 and 8 (bazel-contrib#1525)
  Only allow modules specified in known_contributing_modules to contribute artifacts or boms to the root module (bazel-contrib#1523)
  [gradle] Fix false resolution failures when BOM upgrades dependency version (bazel-contrib#1520)

5 participants