#1 Re: mORMot 1 » Three Locks To Rule Them All » 2022-01-24 21:40:59

ab wrote:

You are right, I know those citations well.
In practice, we don't see our locks going to call the OS...
Spinning in user space is not bad in itself... libpthread does it.

Yes, one of the posts I referenced mentions that mutexes are implemented in a similar way:

If Not a Spinlock, Then What?
First, if you only use a spin lock because "it’s faster for small critical sections", just replace it with a mutex from std or parking_lot. They already do a small amount of spinning iterations before calling into the kernel, so they are as fast as a spinlock in the best case, and infinitely faster in the worst case.
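To make that concrete, here is a minimal C sketch of the "spin briefly, then sleep in the kernel" pattern the quote describes. This is not mORMot's code and not how std or parking_lot are actually implemented; it uses the Linux futex syscall directly, and SPIN_LIMIT is an arbitrary placeholder that real implementations tune:

/* Hybrid lock sketch (Linux-only): 0 = unlocked, 1 = locked,
 * 2 = locked with possible waiters. */
#include <stdatomic.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

#define SPIN_LIMIT 40  /* arbitrary; real implementations tune this */

typedef struct { atomic_int state; } hybrid_lock;

static void futex_wait(atomic_int *addr, int expected) {
    syscall(SYS_futex, addr, FUTEX_WAIT_PRIVATE, expected, NULL, NULL, 0);
}

static void futex_wake_one(atomic_int *addr) {
    syscall(SYS_futex, addr, FUTEX_WAKE_PRIVATE, 1, NULL, NULL, 0);
}

void hybrid_lock_acquire(hybrid_lock *l) {
    int expected = 0;
    /* Fast path: a single compare-and-swap when uncontended. */
    if (atomic_compare_exchange_strong(&l->state, &expected, 1))
        return;
    /* Bounded spinning: cheap if the owner leaves quickly. */
    for (int i = 0; i < SPIN_LIMIT; i++) {
        expected = 0;
        if (atomic_compare_exchange_weak(&l->state, &expected, 1))
            return;
    }
    /* Slow path: mark the lock contended and sleep in the kernel. */
    while (atomic_exchange(&l->state, 2) != 0)
        futex_wait(&l->state, 2);
}

void hybrid_lock_release(hybrid_lock *l) {
    if (atomic_exchange(&l->state, 0) == 2)
        futex_wake_one(&l->state);  /* somebody may be sleeping */
}

The point of the quote is exactly this structure: in the best (uncontended) case the kernel is never entered, and in the worst case the waiter sleeps instead of burning its quantum.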

#2 Re: mORMot 1 » Three Locks To Rule Them All » 2022-01-24 21:33:08

ab wrote:

Our locks only spin for a small number of iterations... They are meant to be almost always without contention, and to protect critical sections of only a few cycles

I am not saying you have a problem, since I don't know anything about your implementation, and I am not trying to be argumentative. I am just putting this out there; it may or may not apply:

https://matklad.github.io//2020/01/02/s … rmful.html

...
Spinning Just For a Little Bit, What Can Go Wrong?
Because spin locks are so simple and fast, it seems to be a good idea to use them for short-lived critical sections. For example, if you only need to increment a couple of integers, should you really bother with complicated syscalls? In the worst case, the other thread will spin just for a couple of iterations…

Unfortunately, this logic is flawed! A thread can be preempted at any time, including during a short critical section. If it is preempted, that means that all other threads will need to spin until the original thread gets its share of CPU again. And, because a spinning thread looks like a good, busy thread to the OS, the other threads will spin until they exhaust their quants, preventing the unlucky thread from getting back on the processor!
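For reference, the kind of lock the post is criticizing is as simple as this (a sketch for illustration, not anyone's production code):

/* Naive test-and-set spinlock: if the holder is preempted inside the
 * critical section, every waiter burns its entire scheduler quantum
 * here, because busy-waiting looks like useful work to the OS. */
#include <stdatomic.h>

static atomic_flag lock_flag = ATOMIC_FLAG_INIT;

void naive_spin_lock(void) {
    while (atomic_flag_test_and_set_explicit(&lock_flag, memory_order_acquire))
        ;  /* busy-wait */
}

void naive_spin_unlock(void) {
    atomic_flag_clear_explicit(&lock_flag, memory_order_release);
}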

#3 Re: mORMot 1 » Three Locks To Rule Them All » 2022-01-24 18:35:05

ab wrote:

in mORMot 2, we introduced several native locks in addition to those OS locks

Be aware that there are many warnings about using spin-locks in user space on Linux.
References:

https://www.realworldtech.com/forum/?th … tid=189723

Linus wrote:

Because you should never ever think that you're clever enough to write your own locking routines.. Because the likelihood is that you aren't (and by that "you" I very much include myself - we've tweaked all the in-kernel locking over decades, and gone through the simple test-and-set to ticket locks to cacheline-efficient queuing locks, and even people who know what they are doing tend to get it wrong several times).
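As an aside, the "ticket locks" step in the progression Linus mentions looks roughly like this (a sketch for illustration, not the kernel's implementation). It is FIFO-fair, unlike plain test-and-set, but it still spins:

#include <stdatomic.h>

typedef struct {
    atomic_uint next;     /* ticket dispenser */
    atomic_uint serving;  /* ticket currently being served */
} ticket_lock;

void ticket_lock_acquire(ticket_lock *l) {
    unsigned ticket = atomic_fetch_add(&l->next, 1);  /* take a number */
    while (atomic_load_explicit(&l->serving, memory_order_acquire) != ticket)
        ;  /* spin until our number comes up */
}

void ticket_lock_release(ticket_lock *l) {
    atomic_fetch_add_explicit(&l->serving, 1, memory_order_release);
}

The cacheline-efficient queuing locks he mentions go a step further: each waiter spins on its own cache line instead of all waiters hammering the same one.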

https://man7.org/linux/man-pages/man3/p … html#NOTES

User-space spin locks are not applicable as a general locking
solution.  They are, by definition, prone to priority inversion
and unbounded spin times.  A programmer using spin locks must be
exceptionally careful not only in the code, but also in terms of
system configuration, thread placement, and priority assignment.
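Note that if the goal is just "spin briefly before sleeping", glibc already exposes that as a mutex type, so nothing has to be hand-rolled. A sketch, assuming glibc on Linux (PTHREAD_MUTEX_ADAPTIVE_NP is a non-portable extension):

#define _GNU_SOURCE
#include <pthread.h>

static pthread_mutex_t m;

void init_adaptive_mutex(void) {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    /* Adaptive mutex: spins a bounded number of iterations in user
     * space, then falls back to sleeping in the kernel. */
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ADAPTIVE_NP);
    pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr);
}

This matches the "libpthread does it" point from the other thread: the spinning is there, but with a safe fallback.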

https://news.ycombinator.com/item?id=21970050

You can implement spinlocks in userspace under specific circumstances. You will need to mark your threads as realtime threads[0] and have a fallback to futex if the fast path (spinning) doesn't work out. And even then you need to benchmark on multiple machines and with realistic workloads (not microbenchmarks) to see whether it's actually worth it for your application. It's complex...
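The "mark your threads as realtime threads" precondition means something like the following sketch. It needs CAP_SYS_NICE or a suitable RLIMIT_RTPRIO, and the priority value 50 is an arbitrary choice of mine:

#include <pthread.h>
#include <sched.h>

int make_realtime(pthread_t t) {
    /* SCHED_FIFO threads are not preempted by ordinary (SCHED_OTHER)
     * threads, which removes the main hazard of user-space spinning:
     * a lock holder descheduled mid-critical-section. */
    struct sched_param sp = { .sched_priority = 50 };
    return pthread_setschedparam(t, SCHED_FIFO, &sp);
}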

https://mjtsai.com/blog/2020/01/06/bewa … ser-space/

How do you even measure whether a spinlock is better than a mutex, and what makes a good spinlock?

https://www.realworldtech.com/forum/?th … tid=189747

Before my blog post I had a very hard time finding guidance that actually backed up the claim "you shouldn't use spinlocks" with benchmarks. Part of my goal was to provide that.

Much of this discussion started from this post (see also the comments at the bottom):
https://probablydance.com/2019/12/30/me … really-is/

from:  https://github.com/synopse/mORMot2/blob … 3048-L3049

/// could be set to TRUE to force SleepHiRes(0) to call the sched_yield API
// - in practice, it has been reported as buggy under POSIX systems

Maybe "buggy" is not the right term. The API's behavior is simply not well suited to this kind of usage: sched_yield is too naive here, because the scheduler has no idea which thread needs to run next in order to release the lock.
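In other words, the pattern behind that option boils down to something like this sketch (not mORMot's SleepHiRes(0) implementation, just the general shape of yield-in-a-loop):

#include <stdatomic.h>
#include <sched.h>

static atomic_flag yield_lock = ATOMIC_FLAG_INIT;

void spin_lock_yielding(void) {
    while (atomic_flag_test_and_set_explicit(&yield_lock, memory_order_acquire))
        sched_yield();  /* only a rescheduling hint: the kernel cannot
                           know which thread must run to release the lock */
}

The yield gives the CPU back, but to whichever thread the scheduler picks, not necessarily the lock owner, so the waiter may just be rescheduled immediately and spin on.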
