A concurrency issue with git detached processing in Repo Server #25101

@dudinea

Checklist:

  • I've searched in the docs and FAQ for my answer: https://bit.ly/argocd-faq.
  • I've included steps to reproduce the bug.
  • I've pasted the output of argocd version.

Describe the bug

On some of our heavily loaded ArgoCD instances, which run a large number of sync operations,
repo-server Git operations occasionally fail with errors like:

rpc error: code = Internal desc = Failed to checkout FETCH_HEAD: `git checkout --force FETCH_HEAD` failed exit status 128: fatal: update_ref failed for ref 'HEAD': cannot lock ref 'HEAD': Unable to create '<path to cached source>/.git/HEAD.lock': File exists.\n\nAnother git process seems to be running in this repository, e.g.\nan editor opened by 'git commit'. Please make sure all processes\nare terminated then try again. If it still fails, a git process\nmay have crashed in this repository earlier:\nremove the file manually to continue.\n

There are no Git process or Pod crashes, OOM kills, or any other
unexpected messages in the logs before the above error.

RCA

The investigation showed that the problem started when we upgraded
our base image to one based on Ubuntu 25.04 (Plucky Puffin).
The same base image is used by the ArgoCD 3.2 release candidates.

The new image ships Git v2.48.1, while ArgoCD 3.1.x ships Git v2.43.0.

We found that the new Git version runs automatic maintenance
(git maintenance run) in the background, with the --detach option, after
the main git command exits. The maintenance process therefore escapes
the ArgoCD repository locking mechanism and races with the next Git
operation on the same repository. This race occasionally causes the
above error in our case and may cause other issues.

This behaviour was introduced in Git 2.47.0,
I believe with git/git@98077d0.

In the default configuration, git maintenance run --auto runs only the
gc task. The task checks whether GC is needed and, if it is, runs a set
of operations on the repo. These can cause various issues if the next
Repo Server operation accesses the repo concurrently (see
git-prune(1), git-reflog(1), git-repack(1), git-rerere(1)). The error
message may therefore mention other lock files, not only
HEAD.lock.
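
Because the leftover lock is not necessarily HEAD.lock, it can help to list every *.lock file under a checkout's .git directory when the error appears. The following is a minimal sketch, assuming a regular checkout where .git is a directory; the function name find_lock_files is hypothetical and not part of ArgoCD. A lock file found this way may belong to a Git process that is still running, so it is a diagnostic hint, not proof of a stale lock.

```python
from pathlib import Path
from typing import List


def find_lock_files(repo_dir: str) -> List[str]:
    """List *.lock files under <repo_dir>/.git, relative to the .git directory.

    Hypothetical diagnostic helper: it only reports files and never deletes
    them, because a lock may be held by a Git process that is still running.
    """
    git_dir = Path(repo_dir) / ".git"
    if not git_dir.is_dir():
        # Not a regular checkout (or .git is a file, as in linked worktrees).
        return []
    return sorted(
        path.relative_to(git_dir).as_posix()
        for path in git_dir.rglob("*.lock")
        if path.is_file()
    )
```

Running it against a directory under /tmp/_argocd-repo right after a failed checkout would show which refs or files were locked at that moment.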

Strangely, according to the message of the above commit, the change
is meant to preserve the previous behaviour of running GC in the
background. Previously, however, the detach happened in git-gc (and
only if GC was needed), while git-maintenance always ran in the
foreground. Even so, we have not been able to reproduce the bug with
earlier Git versions in our environment.
This is still under investigation.

Ways to fix

This is an excerpt from Git documentation on Git configuration options
regarding this feature:

maintenance.autoDetach::
Many Git commands trigger automatic maintenance after they have
written data into the repository. This boolean config option
controls whether this automatic maintenance shall happen in the
foreground or whether the maintenance process shall detach and
continue to run in the background.

If unset, the value of gc.autoDetach is used as a fallback. Defaults
to true if both are unset, meaning that the maintenance process will
detach.
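
The fallback order described in this excerpt can be written down as a small function. This is only a model of the quoted paragraph, not Git's implementation; the function name and the use of None for "unset" are assumptions made for illustration.

```python
from typing import Optional


def effective_auto_detach(
    maintenance_auto_detach: Optional[bool],
    gc_auto_detach: Optional[bool],
) -> bool:
    """Return True if automatic maintenance would detach into the background.

    None stands for "unset". The order follows the quoted documentation:
    maintenance.autoDetach first, then gc.autoDetach, then the default (true).
    """
    if maintenance_auto_detach is not None:
        return maintenance_auto_detach
    if gc_auto_detach is not None:
        return gc_auto_detach
    return True  # both unset: the maintenance process detaches
```

With both options unset, as in a stock image, the function returns True, which matches the --detach seen in our traces. Setting either option to false is enough to change the result for automatic maintenance.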

To be on the safe side, we propose setting both of these values to
false in the Git system configuration (/etc/gitconfig) of the ArgoCD
image. This disables detached operation, whether by git-maintenance or
by git-gc:

[maintenance]
	autoDetach = false
[gc]
	autoDetach = false

Such a fix may have performance implications, which we're trying to estimate.

We've implemented the fix in our environments and it fixed the problem.

Possibly related issues

#21017: probably related, but it has no information on whether the errors were caused by Git or Pod crashes.
#17623: same message, but related to Git crashing.

To Reproduce

We do not yet have a script that reproduces the errors. We are working on it.

Expected behavior

Git commands run from argocd-repo-server must not leave any background processes running
after the main git command exits.

Version

argocd: v3.2.0+323f993
  BuildDate: 2025-10-27T11:24:03Z
  GitCommit: 323f99381686b62daf6e70705f56f94d2b30aa3f
  GitTreeState: clean
  GoVersion: go1.24.4
  Compiler: gc
  Platform: darwin/arm64

Logs

Here is the log of the ArgoCD Repo Server:

{"level":"debug","msg":"Checking out revision 5422a9cf00eabe22600b4f7ef3f3908134ce33e9","skipFetch":false,"time":"2025-10-21T14:23:30Z"}
{"dir":"/tmp/_argocd-repo/6b638c9e-9f39-4067-8627-28d3842070e6","execID":"a6183","level":"info","msg":"git fetch origin --tags --force --prune","time":"2025-10-21T14:23:30Z"}

Excerpt from the Git trace of "git fetch" when run from the repo server.
Such a trace can be produced by setting the GIT_TRACE environment variable for argocd-repo-server
to a file in /tmp/. Note the --detach option in the last lines.

14:23:30.959509 run-command.c:667       trace: run_command: GIT_DIR=.git git remote-https origin https://github.com/REDACTED/REDACTED.git
14:23:30.959567 run-command.c:759       trace: start_command: /usr/lib/git-core/git remote-https origin https://github.com/REDACTED/REDACTED.git
14:23:30.961488 git.c:769               trace: exec: git-remote-https origin https://github.com/REDACTED/REDACTED.git
14:23:30.961696 run-command.c:667       trace: run_command: git-remote-https origin https://github.com/REDACTED/REDACTED.git
14:23:30.961725 run-command.c:759       trace: start_command: /usr/lib/git-core/git-remote-https origin https://github.com/REDACTED/REDACTED.git
14:23:31.179897 run-command.c:667       trace: run_command: argocd 'Username for '\\''https://github.com'\\'': '
14:23:31.179932 run-command.c:759       trace: start_command: /usr/local/bin/argocd 'Username for '\\''https://github.com'\\'': '
14:23:31.226821 run-command.c:667       trace: run_command: argocd 'Password for '\\''https://[email protected]'\\'': '
14:23:31.226848 run-command.c:759       trace: start_command: /usr/local/bin/argocd 'Password for '\\''https://[email protected]'\\'': '
14:23:31.929427 run-command.c:667       trace: run_command: git index-pack --stdin --fix-thin '--keep=fetch-pack 94 on argocd-repo-server-68fb8f5dfc-bch24' --pack_header=2,1064
14:23:31.929589 run-command.c:759       trace: start_command: /usr/lib/git-core/git index-pack --stdin --fix-thin '--keep=fetch-pack 94 on argocd-repo-server-68fb8f5dfc-bch24' --pack_header=2,1064
14:23:31.932411 git.c:476               trace: built-in: git index-pack --stdin --fix-thin '--keep=fetch-pack 94 on argocd-repo-server-68fb8f5dfc-bch24' --pack_header=2,1064
14:23:31.970240 run-command.c:667       trace: run_command: git rev-list --objects --stdin --not --exclude-hidden=fetch --all --quiet --alternate-refs
14:23:31.970259 run-command.c:759       trace: start_command: /usr/lib/git-core/git rev-list --objects --stdin --not --exclude-hidden=fetch --all --quiet --alternate-refs
14:23:31.971128 git.c:476               trace: built-in: git rev-list --objects --stdin --not --exclude-hidden=fetch --all --quiet --alternate-refs
From https://github.com/REDACTED/REDACTED
 * [new branch]      dev        -> origin/dev
 * [new branch]      main       -> origin/main
14:23:31.976467 run-command.c:1534      run_processes_parallel: preparing to run up to 1 tasks
14:23:31.976480 run-command.c:1562      run_processes_parallel: done
14:23:31.976490 run-command.c:667       trace: run_command: git maintenance run --auto --no-quiet --detach
14:23:31.976508 run-command.c:759       trace: start_command: /usr/lib/git-core/git maintenance run --auto --no-quiet --detach
14:23:31.977245 git.c:476               trace: built-in: git maintenance run --auto --no-quiet --detach
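
To check whether a given image still spawns detached maintenance, a trace file like the one above can be scanned for the relevant lines. This is a minimal sketch that assumes the GIT_TRACE line format shown in this excerpt; the function names are hypothetical.

```python
from typing import Iterable, List


def is_detached_maintenance(line: str) -> bool:
    """True if a GIT_TRACE line shows 'git maintenance run' with --detach."""
    return "git maintenance run" in line and "--detach" in line.split()


def detached_maintenance_lines(lines: Iterable[str]) -> List[str]:
    """Return the trace lines that show a detached maintenance invocation."""
    return [line.rstrip("\n") for line in lines if is_detached_maintenance(line)]
```

Applied to the excerpt above, this selects the last three lines. An empty result after applying the configuration change would suggest that maintenance no longer detaches.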

Labels

bug: Something isn't working
component:monorepo: Issue related to the mono-repository pattern and performance
component:repo-server: Issue related to the Repository Server component