Hi Arnaud,
I am currently leveraging the excellent TWebSocketAsyncServer in mORMot 2 for a high-throughput application involving massive broadcasting from server to clients.
While implementing the asynchronous transmission, I noticed that under heavy network congestion or with slow clients, the outbound data can accumulate significantly. To ensure server stability and avoid excessive memory consumption (or potential contention during WaitLock), I am looking for the "mORMot-way" of detecting the optimal moment to push more data into the websocket.
Specifically:
Is monitoring TPollAsyncConnection.fWr.Len the recommended approach to implement a backpressure mechanism, or is there a higher-level event/callback in the TWebSocketAsyncProcess that I should use to detect when the outgoing buffer has reached a safe threshold?
In the context of THttpAsyncServer, would you recommend manually dropping frames when the outbound buffer exceeds a certain limit, or is there a built-in gathering strategy (like SendDelay) that I should tune to handle thousands of concurrent massive sends more gracefully?
I want to avoid any "freezing" effects caused by RAM saturation or lock contention when the kernel's TCP buffers are full.
Thank you for your tireless work on this amazing framework.
Best regards
Dropping frames could be an option, but it depends on your actual use-case.
For instance, a compressed video stream may or may not tolerate frame drops, depending on the compression algorithm.
Or you could use a specific "flush" message when RAM is saturated.
But in this case, it could be handy to flush the data pending in the mORMot async server buffers, because it is likely to be outdated by then. I am not sure how to flush the kernel's TCP buffers anyway...
Of course, for a WebSockets server, flushing should perhaps be done at the WebSocket frame layer, not at the TCP layer, to avoid any broken frame content...
Hi Arnaud,
Thank you for the insights.
Actually, dropping frames or flushing the buffers is exactly what I would like to avoid. My goal is to ensure data integrity without losing any frames, even when the RAM reaches a critical point.
Instead of discarding data, I am looking for a way to implement proper backpressure within the mORMot 2 async server architecture. My intention is to have the server 'wait' or slow down the injection of new frames until the current buffers are successfully transmitted and acknowledged at the TCP/Websocket layer.
Is there a recommended way in mORMot 2 to monitor the state of the async output buffers or the underlying socket readiness, so I can pause the data flow instead of saturating the RAM or being forced to flush pending data?
But how do you pause the data flow, if you can't send any data?
Anyway, perhaps retrieving the number of pending bytes in the connection output buffer may help you:
https://github.com/synopse/mORMot2/commit/45162a1fc
Hi Arnaud,
Thank you very much for the prompt update! The PendingWrite property is exactly what I needed to implement a smarter flow control.
Regarding your question on how to pause the flow: my plan is to use a threshold-based approach.
Since I am streaming frames (like video or large DICOM datasets), I can now check PendingWrite before injecting the next chunk into the async server. If PendingWrite exceeds a predefined safety limit (e.g., 2MB), I will pause the frame producer (the source) and wait for the buffer to drain below a certain 'low-water mark' before resuming.
This way, I can leverage mORMot's high-performance async architecture without overwhelming the server's RAM when the network throughput is slower than the data production rate.
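For the record, the producer loop I have in mind is roughly the following (only a sketch: the water-mark values are arbitrary, and the PendingWrite/Push callbacks stand in for the actual connection and frame-sending code, which are not shown here):

uses
  mormot.core.base,
  mormot.core.os;

const
  HIGH_WATER = 2 shl 20;   // pause the producer above 2 MB pending
  LOW_WATER  = 256 shl 10; // resume once drained below 256 KB

type
  // callbacks keep the sketch independent from the exact connection class
  TGetPendingWrite = function: PtrInt of object;
  TPushChunk = procedure(const chunk: RawByteString) of object;

procedure FeedWithBackPressure(const chunks: array of RawByteString;
  const PendingWrite: TGetPendingWrite; const Push: TPushChunk);
var
  i: PtrInt;
begin
  for i := 0 to high(chunks) do
  begin
    // back-pressure: if too many bytes are pending in the async output
    // buffer, pause the producer until it drains below the low-water mark
    if PendingWrite() > HIGH_WATER then
      repeat
        SleepHiRes(10);
      until PendingWrite() < LOW_WATER;
    Push(chunks[i]); // wrap the chunk into a WebSocket frame and send it
  end;
end;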
I really appreciate the support and the commitment to mORMot 2!
Hi Arnaud,
Thank you again for the PendingWrite implementation in the async server; it works perfectly for managing backpressure on the server side.
Following that logic, I am now tuning the Client Side (high-throughput frame injection). I noticed that when a client sends a massive amount of data in a loop, it can still hit a "blocking" state or saturate the local kernel buffers if the producer is faster than the network, sometimes leading to local freezing if not handled with manual Sleep() calls.
Is there (or could there be) a similar way to monitor the outbound buffer state on the Client classes (e.g., TWebSocketClient or the underlying TRawSocket)?
Having a Client.PendingWrite (or a way to check if the outgoing TCP window is full) would allow us to implement the same "threshold-based pause" on the client, ensuring a graceful flow control without relying on arbitrary Sleep() intervals.
What do you think about exposing this metrics-gathering for client-side sockets as well?
Best regards,
ec
Hi Arnaud,
I've been stress-testing the PendingWrite implementation on the server side, and it is a game-changer for stability. It allows a very graceful flow control for our DICOM streaming.
However, I've hit a 'mirror' issue on the Client Side (Delphi Client injecting frames to Server). When the client producer is faster than the uplink, we lack the same visibility. Without a way to check the client's outbound buffer, the application eventually blocks or consumes RAM unpredictably, and using fixed Sleep() calls is suboptimal for performance.
Since TRawSocket and the async engine already handle these buffers, would it be possible to expose PendingWrite (or an equivalent 'low-level' bytes-pending metric) for the Client classes as well?
Having this symmetry between Client and Server would allow us to implement a unified backpressure strategy across the entire mORMot 2 ecosystem.
What are your thoughts on making this metric available for THttpClientWebSockets or its underlying socket layer?
I guess you are speaking about THttpClientWebSockets.
And that you send frames using TWebCrtSocketProcess.SendFrame(), right?
There is no buffer on the client side.
SendFrame() just calls SendBytes() and waits until the OS buffer has accepted all the data.
So I am afraid there is no way to know more about the client state, as we can with the async server.
What you could do is measure the time taken by SendFrame(), e.g. using QueryPerformanceMicroSeconds(), since GetTickCount64 has too low a resolution (it is updated only every 16 ms on Windows).
If it blocks (i.e. the method call took more than a few ms), then you may need to relax your sending.
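For instance, a minimal sketch of such a timing check could be (the 2 ms threshold and the 5 ms pause are arbitrary values to tune for your own use case):

uses
  mormot.core.base,
  mormot.core.os,
  mormot.net.ws.core,
  mormot.net.ws.client;

// send one frame and detect whether the call was slowed down by a full OS buffer
procedure SendFrameWithTiming(process: TWebCrtSocketProcess;
  var frame: TWebSocketFrame);
var
  start, stop: Int64;
begin
  QueryPerformanceMicroSeconds(start);
  process.SendFrame(frame);
  QueryPerformanceMicroSeconds(stop);
  if stop - start > 2000 then // more than ~2 ms: the socket probably blocked
    SleepHiRes(5);            // relax the producer before the next frame
end;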
Thank you for the clarification. You are right; I am using TWebCrtSocketProcess.SendFrame() on the client side, and I now understand that it blocks once the OS TCP buffer is full.
Measuring the execution time with QueryPerformanceMicroSeconds() is a good suggestion for a reactive approach. However, to avoid the 'blocking' state altogether and keep the UI or other tasks fluid, would it be feasible to check the socket's write readiness before calling SendFrame()?
On the client side, could we use something like fSocket.WaitFor(1, [neWrite]) to see if the kernel is ready to accept more data without blocking?
If fSocket (TCrtSocket) could expose a way to check if send() would block, we could implement the same threshold-based logic we now have on the server, but based on 'Socket Readiness' instead of 'Pending Bytes'.
What do you think about this 'proactive' check for the client-side TCrtSocket?
You can check the socket's write readiness, but it is likely to always appear ready, because the socket is blocking. fSocket.WaitFor(1, [neWrite]) will always return [].
There is no OS API to check for the current number of bytes in the output buffers.
There is only the SO_SNDBUF option, which returns the buffer size (its capacity, not its current occupancy).
Even if you knew the exact buffer occupancy, it still wouldn’t perfectly predict blocking:
- TCP flow control and congestion control change dynamically.
- The kernel may accept data into the buffer and later stall.
- “Writable” does not guarantee a large write will succeed.
So unless you switch to non-blocking mode, there is no easy way of knowing this information.
And in non-blocking mode, you would get almost the same information as you already have by measuring the actual call duration with QueryPerformanceMicroSeconds().
I could add a "wait counter" to SendFrame/TrySndBuf but it won't be perfect either.
Thanks for the detailed explanation. It makes perfect sense.
Since I am already working with an event-driven model, I will follow your advice and switch to non-blocking mode. This will allow me to properly handle flow control on the application side without stalling the execution thread when the OS buffers or TCP window are full.
Instead of trying to predict the buffer occupancy, I'll rely on the write readiness events and handle WSAEWOULDBLOCK (or equivalent) by queuing the remaining data in user-space.
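Roughly, I plan to keep the unsent bytes in a user-space queue and flush it whenever the socket becomes writable, along these lines (a simplified sketch: I assume the socket has already been switched to non-blocking mode, that TNetSocket.Send() reports the would-block condition as nrRetry and updates len with the bytes actually sent; fPending and the error handling are illustrative):

uses
  SysUtils,
  mormot.core.base,
  mormot.net.sock;

var
  fPending: RawByteString; // user-space queue of bytes not yet accepted by the kernel

// try to flush the pending queue on a non-blocking socket; returns true once
// everything has been handed over to the kernel, false if it would block again
function TrySendPending(sock: TNetSocket): boolean;
var
  len: integer;
begin
  result := false;
  while fPending <> '' do
  begin
    len := length(fPending);
    case sock.Send(pointer(fPending), len) of
      nrOK:
        delete(fPending, 1, len); // len = bytes actually accepted by the kernel
      nrRetry:
        exit; // kernel buffer full: wait for the next writable event
    else
      raise Exception.Create('TrySendPending: socket error');
    end;
  end;
  result := true;
end;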
Regarding the 'wait counter' in SendFrame/TrySndBuf, don't worry about it for now. Moving to a true non-blocking flow is a cleaner architectural solution for my current needs.
There is an async client class in mormot.net.async.
But it only makes sense when there are a lot of simultaneous clients in the same process, which may not be your case.
Anyway, writing a non-blocking client is not a simple task.
So I have added a new TCrtSocket.RetryCount property to check if nrRetry (i.e. cross-platform WSAEWOULDBLOCK) has been triggered.
See https://github.com/synopse/mORMot2/commit/7aff391b3
Hi Arnaud,
Thank you for the quick and insightful feedback.
That makes perfect sense. My use case doesn't involve thousands of simultaneous clients, so the mormot.net.async complexity would indeed be overkill.
The new TCrtSocket.RetryCount property is exactly what I needed to safely check for nrRetry / WSAEWOULDBLOCK without overhead. I have already updated my local mORMot 2 files to include your latest commit and I am adjusting my send logic to use it.
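Concretely, the adjusted client send logic will look something like this (only a sketch: I assume RetryCount simply increases each time nrRetry is seen by the low-level send, that the WebSockets property gives access to the TWebCrtSocketProcess, and that the 5 ms pause is an arbitrary value to tune):

uses
  mormot.core.os,
  mormot.net.ws.core,
  mormot.net.ws.client;

// send one frame, then check whether the OS buffer pushed back during the call
procedure SendFrameWithFlowControl(client: THttpClientWebSockets;
  var frame: TWebSocketFrame);
var
  retriesBefore: integer;
begin
  retriesBefore := client.RetryCount;  // RetryCount is inherited from TCrtSocket
  client.WebSockets.SendFrame(frame);  // assumed: WebSockets is the TWebCrtSocketProcess
  if client.RetryCount > retriesBefore then
    // at least one nrRetry occurred, i.e. the kernel buffer was full:
    // relax the producer before injecting the next frame
    SleepHiRes(5);
end;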
I appreciate you taking the time to add this property to the framework. It makes the flow control much cleaner for my implementation.