Checklist
- This is a bug report, not a question. Ask questions on discuss.ipfs.io.
- I have searched on the issue tracker for my bug.
- I am running the latest kubo version or have an issue updating.
Installation method
built from source
Version
Compiled from tag v0.21.0-rc1 with Go 1.20.5:
Kubo version: 0.21.0-rc1
Repo version: 14
System version: amd64/linux
Golang version: go1.20.5
Config
# Modified as such:
ipfs config profile apply server
ipfs config --bool 'Swarm.ResourceMgr.Enabled' false
ipfs config --json 'Swarm.ConnMgr' '{
"GracePeriod": "0s",
"HighWater": 100000,
"LowWater": 0,
"Type": "basic"
}'
ipfs config --bool 'Swarm.RelayService.Enabled' false
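For completeness, this is roughly how I verify that the overrides actually landed in the repo config (a sketch; it assumes jq is available on the host):
# Show the effective values of the modified sections
ipfs config show | jq '.Swarm.ConnMgr, .Swarm.ResourceMgr, .Swarm.RelayService'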
Description
I'm that guy running https://grafana.monitoring.ipfs.trudi.group
This is our setup.
In particular, we run two daemons in docker-compose, see here.
The images are built using this Dockerfile and configured using this script.
I recently moved from v0.18.1 to v0.21.0-rc1. I did not change the config modifications I had already been running. We have a plugin to export Bitswap messages and information from the Peerstore (this is called every few minutes by an external client). We also export information about the Peerstore to Prometheus, see here.
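For reference, the built-in metrics (and anything our plugin registers) come out of the daemon's own Prometheus endpoint; a rough sketch of how to peek at it manually, assuming the API listens on the default 127.0.0.1:5001 and that the relevant metric names contain "peerstore" (that filter is just a guess):
# Dump the daemon's Prometheus metrics and filter for Peerstore-related ones
curl -s http://127.0.0.1:5001/debug/metrics/prometheus | grep -i peerstore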
It's mostly running fine, although with fewer connections than before, but that's probably just a matter of time.
I noticed, however, that I'm approaching 1M goroutines per daemon, which is quite a bit more than before, see here.
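In case it helps with debugging, this is roughly how I check the goroutine numbers directly on a daemon (again assuming the default API address 127.0.0.1:5001):
# Current goroutine count from the built-in Go collector
curl -s http://127.0.0.1:5001/debug/metrics/prometheus | grep '^go_goroutines'
# Full goroutine dump, grouped by identical stacks
curl -s 'http://127.0.0.1:5001/debug/pprof/goroutine?debug=1' -o goroutines.txt
# Or collect the whole diagnostics bundle
ipfs diag profile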
I believe this might be connected to the number of inbound /ipfs/id/push/1.0.0 streams I have, see here.
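To get a rough count of those streams from the CLI, something like this should work (assuming `ipfs swarm peers --streams` lists the protocol ID of every open stream, which is my reading of the help text):
# Count open identify-push streams across all connected peers
ipfs swarm peers --streams | grep -c '/ipfs/id/push/1.0.0'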
Interestingly, the (linear with time) rise in inbound streams does not start immediately after we start the daemons, and it does not start at the same time for both daemons, although they were started within seconds of each other, see this graph. The second daemon follows a few hours later. Because the symptoms don't show up at the same time in both daemons, it doesn't feel like this is directly related to our regular data exports. It feels more like some concurrency bug in kubo that only shows up after a while. This is the graph in question, in case Grafana doesn't work:

The daemons did not restart in between (there's a panel for that somewhere).
Not too sure what's going on here. Let me know if I can help debug. I wonder if this is related to how we're exporting data from the Peerstore -- we're only using public functionality, but was there some API change I missed, or some cleanup step I should now be doing? I will try running without our client for a while.