#1 2024-12-09 15:33:05

mormot.coremy_only_lonely
Member
Registered: 2024-12-09
Posts: 4

Continue discusstion Github issue #321 (Race condition in THttpAsyncS)

The Github issue was closed. But I still think there are some race conditions.

https://github.com/synopse/mORMot2/issues/321

Last edited by mormot.coremy_only_lonely (2024-12-09 15:34:36)

Offline

#2 2024-12-09 16:58:52

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,745
Website

Re: Continue discusstion Github issue #321 (Race condition in THttpAsyncS)

TL;WR: some potential race conditions have been identified in the async web server.

The most obvious have been identified and fixed.
But there still may be some potential dangling pointers access, so some GPF on production, in very specific cases (e.g. if the server is bloated and has no CPU power any more).

I would like to rewrite the whole connection object instances lifetime, and get rid of the "dual generation GC" trick we use.
The idea may be to use a single lock per connection (not only read or write), then let each thread acquire the connection instance for a while. Then release it with no dangling pointer.
A fully deterministic web server is better and safer.

My current issue is about performance: we store the connection pointer with the socket-tracking API, so a safer approach may be to use the "connection id sequence" instead - whereas it may be slower because we need to search for it in the connection list... we use a branchless O(log(n)) binary search for that, my guess is that a hash table may be slower. Using directly the connection pointer is O(1) and with no performance penalty... we have to ensure that no socket is still in the socket-tracking API before the connection (and its socket) are freed (and closed). And on some systems, the API can't allow that in a lockless manner.
We may need to monitor and measure if O(log(n)) is really too slow. And store the connection according to the API abilities: e.g. for epoll, which is lockless, store pointers; but use connection ID otherwise. It may be the safest and easiest path...
Any input is welcome!

Offline

#3 2024-12-11 02:47:30

mormot.coremy_only_lonely
Member
Registered: 2024-12-09
Posts: 4

Re: Continue discusstion Github issue #321 (Race condition in THttpAsyncS)

That's great!
My thought is always, correctness first, performance later.
This could also happen when only one or some cores are 100% usage, not all cores.

Offline

#4 2024-12-11 10:17:30

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,745
Website

Re: Continue discusstion Github issue #321 (Race condition in THttpAsyncS)

If some cores are at low usage rate, then the OS will use them to run our web server process, so we won't reach the GC timeout of 2 seconds.

Offline

#5 2024-12-11 16:02:44

mormot.coremy_only_lonely
Member
Registered: 2024-12-09
Posts: 4

Re: Continue discusstion Github issue #321 (Race condition in THttpAsyncS)

But switching cores might not be an instant behavior.

Offline

#6 2024-12-18 14:04:29

mormot.coremy_only_lonely
Member
Registered: 2024-12-09
Posts: 4

Re: Continue discusstion Github issue #321 (Race condition in THttpAsyncS)

Do we have any progress on this? I see some related commits, but the issue we discussed is not fixed.

Last edited by mormot.coremy_only_lonely (2024-12-31 09:44:24)

Offline

Board footer

Powered by FluxBB