What happened:
We're running into issues with pods being stuck in the ContainerCreating state with the following FailedMount event:
corporate-ohayg 11m Warning FailedMount pod/web-6c5dd4d7b7-g2k2w MountVolume.MountDevice failed for volume "pv-corporate-ohayg-nfs-web" : kubernetes.io/csi: attacher.MountDevice failed to create newCsiDriverClient: driver name nfs.csi.k8s.io not found in the list of registered CSI drivers
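(For reference, the event above can be pulled with something like the following; filtering on reason=FailedMount is our assumption that it's the only relevant event type:)
kubectl get events -A --field-selector reason=FailedMount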
We checked the logs of the csi-nfs-node-xxxx pod running on the same node as the pod above is scheduled on:
thenuja.viknarajah@LA-1215 ~ (qa-london:kube-system) ❯ kubectl logs csi-nfs-node-pkhbh --all-containers
I0122 15:22:11.223866 1 main.go:137] "Calling CSI driver to discover driver name"
I0122 15:22:11.226265 1 main.go:145] "CSI driver name" driver="nfs.csi.k8s.io"
I0122 15:22:11.226309 1 main.go:174] "ServeMux listening" address="localhost:29653"
E0122 15:22:55.667666 1 main.go:68] "Failed to establish connection to CSI driver" err="context deadline exceeded"
E0122 15:23:25.667140 1 main.go:68] "Failed to establish connection to CSI driver" err="context deadline exceeded"
E0122 15:23:55.668176 1 main.go:68] "Failed to establish connection to CSI driver" err="context deadline exceeded"
E0122 15:24:25.667039 1 main.go:68] "Failed to establish connection to CSI driver" err="context deadline exceeded"
E0122 15:24:55.668143 1 main.go:68] "Failed to establish connection to CSI driver" err="context deadline exceeded"
I0122 15:22:07.346639 1 main.go:154] "Version" version="v2.15.0"
I0122 15:22:07.346713 1 main.go:155] "Running node-driver-registrar" mode=""
I0122 15:22:07.346717 1 main.go:176] "Attempting to open a gRPC connection" csiAddress="/csi/csi.sock"
I0122 15:22:11.205084 1 main.go:184] "Calling CSI driver to discover driver name"
I0122 15:22:11.207641 1 main.go:193] "CSI driver name" csiDriverName="nfs.csi.k8s.io"
I0122 15:22:11.207679 1 node_register.go:56] "Starting Registration Server" socketPath="/registration/nfs.csi.k8s.io-reg.sock"
I0122 15:22:11.207834 1 node_register.go:66] "Registration Server started" socketPath="/registration/nfs.csi.k8s.io-reg.sock"
I0122 15:22:11.207909 1 node_register.go:96] "Skipping HTTP server"
I0122 15:24:55.935771 1 nfs.go:90] Driver: nfs.csi.k8s.io version: v4.12.1
I0122 15:24:55.936331 1 nfs.go:147]
DRIVER INFORMATION:
-------------------
Build Date: "2025-10-13T14:06:17Z"
Compiler: gc
Driver Name: nfs.csi.k8s.io
Driver Version: v4.12.1
Git Commit: ""
Go Version: go1.24.6
Platform: linux/amd64
Streaming logs below:
I0122 15:24:55.940258 1 mount_linux.go:334] Detected umount with safe 'not mounted' behavior
I0122 15:24:55.940612 1 server.go:117] Listening for connections on address: &net.UnixAddr{Name:"//csi/csi.sock", Net:"unix"}
The following log lines, which the registrar normally emits once the kubelet registers the plugin, are missing:
I0122 10:45:05.376383 1 main.go:99] "Received GetInfo call" request=""
I0122 10:45:05.407652 1 main.go:111] "Received NotifyRegistrationStatus call" status="plugin_registered:true"
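Those two lines are the kubelet calling back into the registrar, so the kubelet's side can be checked by grepping its journal from the Bottlerocket admin container (a sketch; the grep patterns are guesses and the exact kubelet wording varies by version):
# From the Bottlerocket admin container, drop to a root shell on the host:
sudo sheltie
# Look for plugin-watcher / registration activity for the NFS driver:
journalctl -u kubelet | grep -i -e 'nfs.csi.k8s.io' -e 'plugin'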
On the Bottlerocket nodes in the affected cases, we're seeing that:
csi.sock exists in the plugin dir:
bash-5.1# ls -la /var/lib/kubelet/plugins/csi-nfsplugin/
total 0
drwxr-xr-x. 2 root root 22 Jan 22 10:45 .
drwxr-xr-x. 7 root root 108 Jan 22 10:13 ..
srwxr-xr-x. 1 root root 0 Jan 22 10:45 csi.sock
however, nfs.csi.k8s.io-reg.sock is missing from the plugin registry dir:
bash-5.1# ls -la /var/lib/kubelet/plugins_registry/
total 4
drwxr-x---. 2 root root 38 Jan 22 09:51 .
drwxr-xr-x. 11 root root 4096 Jan 22 10:10 ..
srwx------. 1 root root 0 Jan 22 09:51 ebs.csi.aws.com-reg.sock
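The registrar logs above claim the registration server started on /registration/nfs.csi.k8s.io-reg.sock, so it would be worth comparing the container's view of that directory with the host's. The sig-storage registrar image is distroless, so one option is an ephemeral debug container that peeks through the target's /proc (a sketch; the container name node-driver-registrar is an assumption based on the upstream manifests, and it requires ephemeral containers plus a root target):
kubectl -n kube-system debug csi-nfs-node-pkhbh -it --image=busybox \
  --target=node-driver-registrar -- ls -la /proc/1/root/registration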
It works fine after restarting the csi-nfs-node pod, but that restart should be automated somehow, and node_register shouldn't keep running as if nothing is wrong when the plugin was never successfully registered. We're unsure why this is happening in our QA EKS cluster; it doesn't happen at all in our prod clusters, and both run the same NFS CSI driver version, the same EKS version, and the same Helm configuration.
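Until the root cause is understood, one way to automate the restart is to let the kubelet do it: node-driver-registrar can expose a /healthz endpoint (the "Skipping HTTP server" line above suggests it isn't enabled here) that a livenessProbe can hit, so the container gets restarted when the registration socket stops answering. Below is a minimal sketch of the registrar container in the csi-nfs-node DaemonSet; the port is an arbitrary pick (29653 is already taken by the driver's own livenessprobe sidecar, per the logs), and any args your chart already sets should be kept. Note the probe checks the registrar's own registration socket, so it's worth confirming it actually fails in the broken state before relying on it.
# Hypothetical fragment of the csi-nfs-node DaemonSet pod spec, not a full manifest.
containers:
- name: node-driver-registrar
  args:
  - --csi-address=/csi/csi.sock
  - --kubelet-registration-path=/var/lib/kubelet/plugins/csi-nfsplugin/csi.sock
  - --http-endpoint=:29663      # assumption: any free port; enables the built-in /healthz
  livenessProbe:
    httpGet:
      path: /healthz
      port: 29663
    initialDelaySeconds: 30
    timeoutSeconds: 15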
Environment:
- CSI Driver version: v4.12.1
- Kubernetes version (use kubectl version):
Client Version: v1.34.1
Kustomize Version: v5.7.1
Server Version: v1.33.5-eks-3025e55
- OS (e.g. from /etc/os-release): bottlerocket (bottlerocket-aws-k8s-1.33-x86_64-v1.41.0-bc3ad241)