Set a more sane timeout for WS connections and log WS errors #1992

Merged
mcm001 merged 1 commit into PhotonVision:main from Gold856:fix-weird-ui-issues
Jul 9, 2025
@Gold856 Gold856 commented Jul 7, 2025

Description

As described in #1827 (comment), setting the idle timeout to 5 seconds, as done here, seems to fix the WebSocket connection dying "permanently." Javalin now throws a TimeoutException after a short time, after which the WebSocket connection becomes functional again.

The behavior is still a little strange, though. Disconnecting my Ethernet cable, reconnecting it to get the same IP address, and then opening the page will sometimes work and keep working across multiple refreshes, until one refresh briefly shows the broken UI before it starts working again. It may instead be time-based, with the refreshes having nothing to do with it; I'm unsure. Regardless, the logs record a new WebSocket connection, the TimeoutException being thrown, and then the WebSocket connection closing with a UUID.

The timeout does seem to help. During one testing session, the placeholder UI showed up after a refresh, stayed visible for multiple seconds, and then everything loaded in. The exception was thrown right around the time the UI came back, which gives me confidence that the timeout will allow PV to eventually recover without a reboot, even when the WebSocket connection is unstable. It appears that the connection was simply being kept open for too long, causing issues when a new one was made from the same IP address, and allowing it to close fixes that. I personally tested quite a few disconnect-reconnect loops: I reproduced the issue multiple times on main, and never hit a case on this branch where the UI didn't recover.
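
To make the idle-timeout idea concrete, here is a minimal, standalone sketch of what such a timeout does conceptually: a connection that has seen no activity for the configured window is considered dead and can be closed, which frees the slot for a fresh connection. This is a toy model, not the code from this PR and not Javalin's or Jetty's implementation; the class name `IdleTimeoutSketch` and its methods are hypothetical and exist only for illustration.

```java
import java.time.Duration;
import java.time.Instant;

/** Toy model of a per-connection idle timeout (hypothetical; not Javalin/Jetty code). */
final class IdleTimeoutSketch {
    private final Duration idleTimeout;
    private Instant lastActivity;

    IdleTimeoutSketch(Duration idleTimeout, Instant openedAt) {
        this.idleTimeout = idleTimeout;
        this.lastActivity = openedAt;
    }

    /** Any frame sent or received resets the idle clock. */
    void recordActivity(Instant now) {
        lastActivity = now;
    }

    /** True once the connection has been silent for at least the idle timeout. */
    boolean isExpired(Instant now) {
        return Duration.between(lastActivity, now).compareTo(idleTimeout) >= 0;
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2025-07-07T08:00:00Z");
        IdleTimeoutSketch conn = new IdleTimeoutSketch(Duration.ofSeconds(5), t0);

        // 3 s of silence: still inside the 5 s window.
        System.out.println(conn.isExpired(t0.plusSeconds(3)));

        // Activity at t0+3 s resets the clock, so t0+7 s is only 4 s idle.
        conn.recordActivity(t0.plusSeconds(3));
        System.out.println(conn.isExpired(t0.plusSeconds(7)));

        // 6 s since the last activity: the connection is now considered stale.
        System.out.println(conn.isExpired(t0.plusSeconds(9)));
    }
}
```

Passing the current time in explicitly keeps the sketch deterministic. A real server would track this per session and close the session itself when the window elapses.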

I also want to note that when the connection is severed and left alone, the TimeoutException is not thrown after 5 seconds, but instead seems to take upwards of a minute. I'm unsure as to why, but I assume it's something deeper in the networking stack. I've also seen ClosedChannelException get thrown instead of TimeoutException, which appears to be normal according to Javalin.

I've also added logging for when WebSocket errors are thrown. I believe DataChangeService throws the exceptions anyway, but they seem a bit different, and more detail is always better.
Closes #1827, closes #1320, closes #1182.
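
As a rough illustration of the error logging described above, the sketch below builds a log line for a WebSocket error and tags TimeoutException and ClosedChannelException as expected disconnect noise. It is not the logging code from this PR. The class `WsErrorLogSketch`, the `describe` method, and the message format are hypothetical, and it assumes the exceptions surface as `java.util.concurrent.TimeoutException` and `java.nio.channels.ClosedChannelException`, which the description does not confirm. In a Javalin app, something like this would presumably be called from the WebSocket error handler.

```java
import java.nio.channels.ClosedChannelException;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper that formats WebSocket errors for logging. */
final class WsErrorLogSketch {
    /** Builds a log line; idle timeouts and closed channels are tagged as expected. */
    static String describe(String remoteHost, Throwable error) {
        if (error == null) {
            return "WebSocket error from " + remoteHost + " (no throwable attached)";
        }
        // Assumption: these two types are the normal signs of a dropped client.
        boolean expected =
                error instanceof TimeoutException || error instanceof ClosedChannelException;
        String kind = expected ? "expected disconnect" : "unexpected error";
        String line = "WebSocket " + kind + " from " + remoteHost
                + ": " + error.getClass().getSimpleName();
        // ClosedChannelException usually carries no message, so append only if present.
        if (error.getMessage() != null) {
            line += " (" + error.getMessage() + ")";
        }
        return line;
    }

    public static void main(String[] args) {
        System.out.println(describe("10.0.0.2", new TimeoutException("idle timeout expired")));
        System.out.println(describe("10.0.0.2", new ClosedChannelException()));
        System.out.println(describe("10.0.0.2", new IllegalStateException("boom")));
    }
}
```

Separating expected disconnects from everything else keeps the extra logging useful without burying real failures under routine timeout noise.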

Meta

Merge checklist:

  • Pull Request title is short, imperative summary of proposed changes
  • The description documents the what and why
  • If this PR changes behavior or adds a feature, user documentation is updated
  • If this PR touches photon-serde, all messages have been regenerated and hashes have not changed unexpectedly
  • If this PR touches configuration, this is backwards compatible with settings back to v2024.3.1
  • If this PR touches pipeline settings or anything related to data exchange, the frontend typing is updated
  • If this PR addresses a bug, a regression test for it is added

@Gold856 Gold856 requested a review from a team as a code owner July 7, 2025 08:26
@mcm001 mcm001 merged commit 78f5760 into PhotonVision:main Jul 9, 2025
39 checks passed
@Gold856 Gold856 deleted the fix-weird-ui-issues branch July 10, 2025 03:03
@Gold856 Gold856 added the backend Things relating to photon-core and photon-server label Aug 4, 2025
2 participants