Description
Following the information in #533, I set up the timeouts on the 4 sfsmaster nodes to create a priority system.
When one of them was elected leader, sfsmaster started and got stuck in a loop, logging this:
Aug 29 15:10:58 sfo-storage-server sfsmaster[33351]: [33351] info: connecting to Master
Aug 29 15:10:58 sfo-storage-server sfsmaster[33351]: [33351] info: connected to Master
Aug 29 15:10:58 sfo-storage-server sfsmaster[33351]: [33351] info: main master server module: got invalid message in shadow state (type:400)
Aug 29 15:10:58 sfo-storage-server sfsmaster[33351]: [33351] info: main master server module: got invalid message in shadow state (type:400)
Aug 29 15:10:58 sfo-storage-server sfsmaster[33351]: [33351] info: main master server module: got invalid message in shadow state (type:400)
Aug 29 15:10:58 sfo-storage-server sfsmaster[33351]: [33351] info: main master server module: got invalid message in shadow state (type:400)
Aug 29 15:10:58 sfo-storage-server sfsmaster[33351]: [33351] info: main master server module: got invalid message in shadow state (type:400)
Aug 29 15:10:58 sfo-storage-server sfsmaster[33351]: [33351] info: main master server module: got invalid message in shadow state (type:400)
Aug 29 15:10:59 sfo-storage-server sfsmaster[33351]: [33351] info: connecting to Master
Aug 29 15:10:59 sfo-storage-server sfsmaster[33351]: [33351] info: connected to Master
Aug 29 15:10:59 sfo-storage-server sfsmaster[33351]: [33351] info: main master server module: got invalid message in shadow state (type:400)
Aug 29 15:10:59 sfo-storage-server sfsmaster[33351]: [33351] info: main master server module: got invalid message in shadow state (type:400)
Aug 29 15:11:00 sfo-storage-server sfsmaster[33351]: [33351] info: connecting to Master
Aug 29 15:11:00 sfo-storage-server sfsmaster[33351]: [33351] info: connected to Master
Aug 29 15:11:00 sfo-storage-server sfsmaster[33351]: [33351] info: main master server module: got invalid message in shadow state (type:400)
Aug 29 15:11:00 sfo-storage-server sfsmaster[33351]: [33351] info: main master server module: got invalid message in shadow state (type:400)
Aug 29 15:11:00 sfo-storage-server sfsmaster[33351]: [33351] info: main master server module: got invalid message in shadow state (type:400)
Aug 29 15:11:00 sfo-storage-server sfsmaster[33351]: [33351] info: main master server module: got invalid message in shadow state (type:400)
Aug 29 15:11:00 sfo-storage-server sfsmaster[33351]: [33351] info: main master server module: got invalid message in shadow state (type:400)
Aug 29 15:11:00 sfo-storage-server sfsmaster[33351]: [33351] info: main master server module: got invalid message in shadow state (type:400)
Aug 29 15:11:01 sfo-storage-server sfsmaster[33351]: [33351] info: connecting to Master
Aug 29 15:11:01 sfo-storage-server sfsmaster[33351]: [33351] info: connected to Master
All mount points froze because the unique IP had been deleted but was never assigned to the new leader, since sfsmaster was stuck in the loop.
After I manually killed sfsmaster, it came back online correctly, the node with the second priority became the leader, and saunafs was back online on all clients.
However, the node that logged the error messages no longer showed up in the saunafs webui.
I had to restart uraft for it to finally show up in the webui again.
I've created a job that runs every minute on all 4 nodes to watch for this message; if it appears again, the job restarts uraft, which will hopefully keep the cluster from getting stuck like this.
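For anyone who wants to try the same workaround, the per-minute check could be sketched roughly like this. This is not the exact script I run, only a minimal illustration. The threshold of 5 matches and the restart command (including the service name `saunafs-uraft`) are assumptions and need to be adapted to how uraft is started on your nodes. Only the message text is taken from the log above.

```python
import re
import subprocess

# Message text copied from the sfsmaster log above.
PATTERN = re.compile(r"got invalid message in shadow state \(type:(\d+)\)")

# Assumption: how many matches in one check count as "stuck in the loop".
THRESHOLD = 5

# Assumption: hypothetical service name, replace it with whatever
# unit or init script actually runs uraft on your nodes.
RESTART_CMD = ["systemctl", "restart", "saunafs-uraft"]


def count_shadow_errors(lines):
    """Count log lines that contain the shadow-state error."""
    return sum(1 for line in lines if PATTERN.search(line))


def should_restart(lines, threshold=THRESHOLD):
    """Return True when the error appears at least `threshold` times."""
    return count_shadow_errors(lines) >= threshold


def check_and_restart(lines, runner=subprocess.run):
    """Restart uraft if the given log lines show the error loop.

    `lines` is expected to hold roughly the last minute of sfsmaster
    log output. `runner` is injectable so the logic can be tested
    without touching the real service. Returns True if a restart
    was triggered.
    """
    if should_restart(lines):
        runner(RESTART_CMD, check=False)
        return True
    return False
```

A scheduler such as cron would call `check_and_restart()` once a minute with the recent sfsmaster log lines. Requiring several matches rather than one is a design choice to avoid restarting uraft on a single stray message; whether a single occurrence is already harmful is not something I know.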
I'm just not sure what happened there.
I've attached the full sfsmaster log, covering everything since it was started.
