What would you like to see?
In an agent chat session, the websocket connection can go a very long time between messages while the server is working.
For example, the model may call a computationally heavy tool, or the model itself may need time for processing. The user then sees the initial response from the socket and waits a long time before the next packet comes in.
With some tunnels, like ngrok or zrok, you cannot control the TTL of the tunnel, and it can close after a period with no traffic (e.g. 60s).
If AnythingLLM sent a simple `alive` packet to the frontend, and the frontend did nothing with it, we could keep the tunnel open with no side effects.
This heartbeat must still respect the total socket TTL, or else the sockets will never close.
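To make the request concrete, here is a rough sketch of what I have in mind. Everything in it is an assumption for illustration: the names `decideHeartbeat`, `startHeartbeat`, `isHeartbeat`, the `{ type: "alive" }` payload, and the timing values are hypothetical and not taken from the AnythingLLM codebase. The point is that the total TTL check comes first, so heartbeats never extend the life of the socket.

```typescript
// Hypothetical sketch; none of these names exist in AnythingLLM.
type HeartbeatAction = "wait" | "send" | "close";

interface HeartbeatState {
  openedAt: number; // ms timestamp when the socket was opened
  lastTrafficAt: number; // ms timestamp of the last packet sent (real or heartbeat)
}

// Minimal shape of a socket; a real websocket object would satisfy this.
interface SocketLike {
  send(data: string): void;
  close(): void;
}

// Decide what to do at time `now`. The total TTL is checked first so that
// heartbeat traffic can never keep a socket open past its maximum lifetime.
function decideHeartbeat(
  state: HeartbeatState,
  now: number,
  idleMs: number,
  maxTtlMs: number
): HeartbeatAction {
  if (now - state.openedAt >= maxTtlMs) return "close";
  if (now - state.lastTrafficAt >= idleMs) return "send";
  return "wait";
}

// Server side: poll on a timer and send an `alive` packet when the socket
// has been idle too long. Returns a function that stops the timer.
// Code that sends real messages should also update state.lastTrafficAt.
function startHeartbeat(
  socket: SocketLike,
  state: HeartbeatState,
  idleMs: number = 30000, // assumed value, below a 60s tunnel idle timeout
  maxTtlMs: number = 300000, // assumed total socket TTL
  tickMs: number = 1000
): () => void {
  const timer = setInterval(() => {
    const now = Date.now();
    const action = decideHeartbeat(state, now, idleMs, maxTtlMs);
    if (action === "close") {
      clearInterval(timer);
      socket.close();
    } else if (action === "send") {
      socket.send(JSON.stringify({ type: "alive" }));
      state.lastTrafficAt = now;
    }
  }, tickMs);
  return () => clearInterval(timer);
}

// Frontend side: detect the heartbeat so the message handler can drop it.
function isHeartbeat(raw: string): boolean {
  try {
    const msg = JSON.parse(raw);
    return typeof msg === "object" && msg !== null && msg.type === "alive";
  } catch {
    return false; // not JSON, so not a heartbeat
  }
}
```

On the frontend, the message handler would return early when `isHeartbeat(event.data)` is true, so the packet has no visible effect on the chat.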
From: https://www.reddit.com/r/LocalLLM/comments/1kgbnr7/comment/omy5jb0/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button