High-Performance Frameworks

ab · 2024-05-29 14:17:32

The ByteScanIndex() is using Agner Fog's assembly code.
It reads aligned 16-byte blocks, and ignores the out-of-range bytes.
This is a perfectly valid scheme in practice, but I understand it does not please Valgrind.
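
A minimal sketch of the aligned-read idea, to show why Valgrind complains even though the result is correct. It is not the actual ByteScanIndex() assembly: ScanByteAligned is a hypothetical helper, it reads 8-byte words instead of 16-byte SSE2 blocks, and it assumes a little-endian CPU such as x86_64. Each aligned word holds at least one valid byte, so the read never crosses into another memory page; bytes outside the requested range are loaded but ignored.

```pascal
program AlignedScanSketch;
{$mode objfpc}{$H+}
{$Q-}{$R-} // raw pointer arithmetic below, no overflow/range checks wanted

// Hypothetical helper, not mORMot's ByteScanIndex(): returns the index of
// the first byte equal to Value in P[0..Count-1], or -1 if there is none.
function ScanByteAligned(P: PByte; Count: PtrInt; Value: Byte): PtrInt;
var
  base: PQWord;
  skip, i, k, idx: PtrInt;
  w: QWord;
begin
  result := -1;
  if (P = nil) or (Count <= 0) then
    exit;
  skip := PtrUInt(P) and 7;                   // bytes located before P in its word
  base := PQWord(PtrUInt(P) - PtrUInt(skip)); // round down to an 8-byte boundary
  i := -skip;                                 // index of base^ relative to P
  while i < Count do
  begin
    w := base^; // aligned read: may load bytes before P or after P+Count-1
    for k := 0 to 7 do
    begin
      idx := i + k;
      // out-of-range bytes were read, but they are never taken into account
      if (idx >= 0) and (idx < Count) and
         (((w shr (k * 8)) and $FF) = Value) then
        exit(idx);
    end;
    inc(base);
    inc(i, 8);
  end;
end;

var
  buf: array[0..31] of Byte;
  j: integer;
begin
  for j := 0 to high(buf) do
    buf[j] := j;
  writeln(ScanByteAligned(@buf[3], 10, 9));  // value 9 is inside buf[3..12]
  writeln(ScanByteAligned(@buf[3], 10, 2));  // buf[2] is before the range
  writeln(ScanByteAligned(@buf[3], 10, 13)); // buf[13] is after the range
end.
```

A memory checker flags the word loads because they touch bytes that were never allocated or initialised, even though those bytes cannot change the result.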

Please try with
https://github.com/synopse/mORMot2/commit/ba84f2d0

But I don't know about TDYNARRAYHASHER_
Could you try without the hsoHeadersInterning option in the THttpAsyncServer?
This is the only place I could see use of TDynArrayHasher.
Or please try with https://github.com/synopse/mORMot2/commit/d6b92b1f

During my tests, I also discovered that shutdown may block if several servers are listening (noport).
I will investigate.

mpv · 2024-05-30 09:06:26

On the May 29 commit:
- with hsoHeadersInterning, the warnings remain the same
- without hsoHeadersInterning, only one remains

==1532807== Conditional jump or move depends on uninitialised value(s)
==1532807==    at 0x48881F: MORMOT.CORE.BASE_$$_BYTESCANINDEX$PBYTEARRAY$INT64$BYTE$$INT64 (mormot.core.base.asmx64.inc:2091)
==1532807==    by 0x6BA141: MORMOT.NET.HTTP$_$THTTPREQUESTCONTEXT_$__$$_PROCESSREAD$TPROCESSPARSELINE$BOOLEAN$$BOOLEAN (mormot.net.http.pas:3693)

About "shutdown may block" - in my experience it does not block, but it sometimes waits too long

Last edited by mpv (2024-05-30 09:07:19)

mpv · 2024-05-30 11:49:10

Made a PR 9072:
- migrate to Ubuntu24.04;
- mORMot@2.2.7565;
- return to glibc MM;
- plaintext is run after cached-queries

ab · 2024-05-30 12:10:37

It is weird, because _BYTESCANINDEX should be fixed by allocating the memory buffer size + APPEND_OVERLOAD = 24 in TRawByteStringBuffer.RawAppend.
And our ByteScanIndex() has in fact the same logic as in FPC RTL IndexByte(). Just a bit more optimized, e.g. with proper opcode fusing. But it is the very same idea about aligned 16-bytes reads.

About hsoHeadersInterning, I will try to investigate a bit more but I did not see anything wrong with how TDynArrayHasher works.
Edit: I have added some hardcoded checks, and was NOT able to find anything wrong about indexes, as reported by Valgrind:

function TDynArrayHasher.HashTableIndexToIndex(aHashTableIndex: PtrInt): PtrInt;
begin
  if ((hash16bit in fState) and
      (PtrUInt(aHashTableIndex) >= PtrUInt(length(fHashTableStore)) * 2)) or
     ((not (hash16bit in fState) and
      (PtrUInt(aHashTableIndex) >= PtrUInt(length(fHashTableStore))))) then
      RaiseFatalCollision('HashTableIndexToIndex', aHashTableIndex);
  result := PtrUInt(fHashTableStore);
  ...

mpv · 2024-06-01 13:34:49

I also do not see any problem there - maybe this is a false-positive warning. And I can't reproduce the crash in my environment.
On the last run (with mimalloc) the failure message is

mimalloc: assertion failed: at "./src/alloc.c":231, mi_page_usable_size_of
assertion: "ok"

assertion source - https://github.com/microsoft/mimalloc/b … loc.c#L231

P.S. Real assertion error is here - https://github.com/microsoft/mimalloc/b … loc.c#L223

P.P.S.
A test case where we fail (maybe it helps to reproduce the crash on your side):

wrk -H 'Host: 10.0.1.1' -H 'Accept: text/plain,text/html;q=0.9,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7' -H 'Connection: keep-alive' --latency -d 15 -c 256 --timeout 8 -t 56 http://10.0.1.1:8080/plaintext -s pipeline.lua -- 16

The pipeline.lua script is here - https://github.com/TechEmpower/Framewor … peline.lua

Last edited by mpv (2024-06-01 13:47:03)

ab · 2024-06-01 19:43:54

Sounds like a canary overwrite. That is, some buffer overflow.
Pretty weird, because the FPC heaptrc unit does not trigger it.

Edit 1:
I was not able to reproduce the problem here either.
Perhaps we may disable the msize() function in mormot.core.fpclibcmm, and let _MemSize() always return 0.
I have added the new FPC_LIBCMM_NOMSIZE conditional, to be used together with FPC_LIBCMM.
https://github.com/synopse/mORMot2/commit/360d7835

Edit 2:
We use -O4 to compile the executable and I doubt it is worth it.
I always use and validate only the -O3 level, and optimize the pascal code to have the best generated asm at -O3.
From -O4 may come some instability, with potentially no performance benefit.

Edit 3:
But I may have a hint about what happens. My current guess is that the issue could be about interning the "Host" and "Accept" header values.
I will investigate.
Edit 3-followup: Found nothing after investigation. There is no specific memory allocation due to those additional headers.

mpv · 2024-06-03 14:56:33

Updated PR 9072:
- migrate to Ubuntu24.04;
- mORMot@2.2.7579;
- return to glibc MM;
- plaintext is run after cached-queries
- define FPC_LIBCMM_NOMSIZE to disable the malloc_usable_size() call on Linux
- use O3 compiler optimization level

Hope that FPC_LIBCMM_NOMSIZE or -O3 will help

Last edited by mpv (2024-06-03 14:57:08)

ab · 2024-06-04 09:45:11

With the current round, we got very good results, but... /plaintext failed
https://www.techempower.com/benchmarks/ … =plaintext

We will see with your PR.

mpv · 2024-06-08 10:54:10

In the previous run (2nd run with mimalloc) there are new warnings and errors

mormot: mimalloc: warning: mi_free: pointer might not point to a valid heap region: 0x7c5bc8bd9040
mormot: (this may still be a valid very large allocation (over 64MiB))
mormot: mimalloc: warning: mi_free: pointer might not point to a valid heap region: 0x7c5bc8be1040
mormot: (this may still be a valid very large allocation (over 64MiB))
mormot: mimalloc: warning: mimalloc: error: mimalloc: warning: mi_free: pointer might not point to a valid heap region: 0x7c5bb689a040
mormot: (this may still be a valid very large allocation (over 64MiB))
mormot: mimalloc: error: mi_free: pointer does not point to a valid heap space: 0x7c5bc8be1040
mormot: mi_free: pointer might not point to a valid heap region: 0x7c5bc8bd1040
mormot: (this may still be a valid very large allocation (over 64MiB))
mormot: mi_free: pointer does not point to a valid heap space: 0x7c5bc8bd9040
mormot: mimalloc: error: mi_free: pointer does not point to a valid heap space: 0x7c5bc8bd1040
mormot: mimalloc: warning: mi_free: pointer might not point to a valid heap region: 0x7c5bb68b2040
mormot: (this may still be a valid very large allocation (over 64MiB))
mormot: mimalloc: assertion failed: at "./include/mimalloc-internal.h":467, _mi_segment_page_of
mormot:   assertion: "idx < segment->slice_entries"
mormot: mimalloc: assertion failed: at "./include/mimalloc-internal.h":467, _mi_segment_page_of
mormot:   assertion: "idx < segment->slice_entries"
mormot: mimalloc: warning: mi_free: pointer might not point to a valid heap region: 0x7c5bb68a2040
mormot: (this may still be a valid very large allocation (over 64MiB))
mormot: mimalloc: warning: mi_free: pointer might not point to a valid heap region: 0x7c5bb68aa040
mormot: (this may still be a valid very large allocation (over 64MiB))
mormot: mimalloc: assertion failed: at "./include/mimalloc-internal.h":467, _mi_segment_page_of
mormot:   assertion: "idx < segment->slice_entries"

ab · 2024-06-17 16:35:30

@mpv

I guess I found the cause of our problem.
I was able to reproduce it with a Windows VM hosting a mORMot web server, and a Linux client running wrk requests.
Randomly, some connection instances may be freed twice, so a GPF occurred after a successful run.
https://github.com/synopse/mORMot2/commit/a0dda418
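
A minimal sketch of one common way to avoid such a double free, not taken from the commit above: an atomic flag lets only the first release request win, and a single owner frees the instance later. TSketchConnection and TryRelease are hypothetical names used for illustration only.

```pascal
program ReleaseOnceSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical connection class, for illustration only.
  TSketchConnection = class
  private
    fReleased: LongInt; // 0 = alive, 1 = release already requested
  public
    // true only for the first caller, which becomes responsible for cleanup
    function TryRelease: boolean;
  end;

function TSketchConnection.TryRelease: boolean;
begin
  // atomically set the flag and look at its previous value
  result := InterlockedExchange(fReleased, 1) = 0;
end;

var
  conn: TSketchConnection;
  accepted: integer;
begin
  conn := TSketchConnection.Create;
  accepted := 0;
  // two code paths (say, a socket error and a timeout) both ask for release
  if conn.TryRelease then
    inc(accepted);
  if conn.TryRelease then
    inc(accepted);
  writeln('release requests accepted: ', accepted);
  // a single owner frees the instance later, once nobody uses it any more
  conn.Free;
end.
```

The flag lives inside the instance, so this only works if the memory stays allocated until every code path has finished with it, for example with a deferred free list owned by a single thread.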

mpv · 2024-06-17 16:41:13

WOW!!! That's just great!
I will immediately do PR for TFB.

mpv · 2024-06-17 17:20:11

PR 9121 is ready. If everything goes well, I'll bring '-O4' back in next PR - it makes tiny improvements

TPrami · 2024-06-18 08:14:21

mpv wrote:

PR 9121 is ready. If everything goes well, I'll bring '-O4' back in next PR - it makes tiny improvements

Please post link to results when available.

This is more interesting to follow than most movies on Netflix

sakura · 2024-06-18 09:55:31

ab wrote:

Randomly, some connection instances may be freed twice, so a GPF occurred after a successful run.

I have seen that sometimes, I believe. I will check it out and watch it. I could never reproduce it in any sensible way. I will let you know if it seems better.

Thanks!

sakura · 2024-06-19 04:09:53

Nope, I still sometimes get the following error:

0000000000B012CF  " EXC   EInvalidPointer {Message:"Invalid pointer operation"} [TRestHttpSrv 443psyprax THttp] at 7c0784 System.pas @DynArrayClear (37156)

- usually when the browser tries to prefetch a page while the server is just starting. I'll keep trying to localize the problem for more information, but for now I'm at a loss as to exactly why/when that happens.

ab · 2024-06-19 07:38:51

DynArrayClear() is called by the system e.g. when a class containing a dynamic array is released, or when a dynamic array on stack is released.
If it happens at startup, it may be due to some initialization code, which is not thread-safe.

You may enable the stack frames in your project option, to have a larger stack trace.
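
A minimal sketch of the thread-safety point above, assuming a shared dynamic array that is filled on first use: a critical section ensures that only one thread runs the initialization. The names gLock, gTable, gReady and EnsureTable are hypothetical.

```pascal
program SafeInitSketch;
{$mode objfpc}{$H+}

uses
  {$ifdef unix}cthreads,{$endif}
  sysutils;

var
  gLock: TRTLCriticalSection;
  gTable: array of integer; // shared dynamic array, filled on first use
  gReady: boolean = false;

// may be called from any thread: only the first caller fills the table
procedure EnsureTable;
var
  i: integer;
begin
  EnterCriticalSection(gLock);
  try
    if not gReady then
    begin
      SetLength(gTable, 16);
      for i := 0 to high(gTable) do
        gTable[i] := i * i;
      gReady := true;
    end;
  finally
    LeaveCriticalSection(gLock);
  end;
end;

begin
  InitCriticalSection(gLock);
  EnsureTable;
  EnsureTable; // second call finds gReady set and leaves the table untouched
  writeln(Length(gTable));
  DoneCriticalSection(gLock);
end.
```

Without the lock, two threads could both call SetLength() on the same array during startup, and one of them could release memory the other is still using.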

mpv · 2024-07-02 13:56:13

The round including PR 9121 started on 2024-07-01, so hopefully ~2024-07-04 we will complete plaintext without GPF

ab · 2024-07-04 15:00:01

I don't understand...
This round did not let /plaintext complete either...
We will see if we have additional information in the logs at the end of the round.

ab · 2024-07-13 14:59:32

New round, everything fine but pipelined /plaintext which still fails.

mpv · 2024-07-17 07:00:32

FYI: exception is

free(): invalid size

ab · 2024-07-17 07:23:31

With no stack trace it is very difficult to investigate...

TPrami · 2024-09-16 07:24:58

ab wrote:

With no stack trace it is very difficult to investigate...

Any luck in finding the bug?

-Tee-

ab · 2024-09-16 07:29:48

Since it was only in pipelined mode, which is never used on production, I honestly did not investigate much.

But I am more worried about the fact that Pavel has not sent any news from Ukraine for a long time.

TPrami · 2024-09-16 09:58:08

ab wrote:

But I am more worried about the fact that Pavel has not sent any news from Ukraine for a long time.

That is worrying indeed, hope he is OK.

-Tee-

dcoun · 2024-09-16 14:05:18

ab wrote:

But I am more worried about the fact that Pavel has not sent any news from Ukraine for a long time.

I believe he is ok. Probably busy with UnityBase

mpv · 2024-09-18 14:51:01

Guys, thanks, I'm fine. It's just that time flies very strangely at war...
I will try to find the cause of the GPF by switching TFB to the mORMot MM - PR 9283
If this does not help, we will disable HTTP pipelining in the next round

Last edited by mpv (2024-09-18 14:51:41)

mpv · 2024-09-27 11:49:24

We have the result for TFB with FPC_X64MM - it still fails for /plaintext, it also fails for /cachedqueries (ORM), and gives very low results for /fortunes. We'll have the log in a few days - maybe we'll see something interesting there...

mpv · 2024-09-30 10:17:44

Got a log: with FPC_X64MM the /plaintext endpoint crashes with malloc(): unaligned tcache chunk detected - so, IMHO, the memory allocator's internal metadata has been corrupted or overwritten - maybe some code in mORMot with direct memory access writes outside a buffer.

ab · 2024-09-30 10:38:12

I don't understand because malloc(): unaligned tcache chunk detected is a malloc error, not a mormot.core.fpcx64mm error.
It sounds as if malloc is still used somewhere by the program.

The only place where I could see malloc still being used is in the PostgreSQL client library.
But it should not trash the /plaintext test, where PostgreSQL is not involved.

mpv · 2024-10-23 09:25:02

New PR 9350 to TFB with mORMot2@2.3.stable + CMem. I hope the commit fixing the TPollAsyncConnection locks structure solves the memory problems

mpv · 2024-11-18 14:25:59

Because the memory problems are not solved, I tried *without* the `hsoEnablePipelining` option, and got a strange result: when a client queries /plaintext in pipelining mode (see https://synopse.info/forum/viewtopic.ph … 769#p38769 for how to call it), the server answers only the first request in each thread. @ab - can you please look at this problem? The correct server behavior is "to respond to pipelined requests correctly, with non-pipelined but valid responses even if server does not support HTTP pipelining."
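
A minimal sketch of what answering pipelined requests implies on the receiving side: one read may deliver several complete requests back to back, and each one needs its own response. CountPipelinedRequests is a hypothetical helper, not part of the mORMot server, and it assumes header-only requests (no body), which is the case for a GET on /plaintext.

```pascal
program PipelineSplitSketch;
{$mode objfpc}{$H+}

// Hypothetical helper: counts the complete header-only requests found in a
// receive buffer, each one ending with an empty line (CR LF CR LF).
function CountPipelinedRequests(const buf: AnsiString): integer;
var
  i: SizeInt;
begin
  result := 0;
  i := 1;
  while i + 3 <= Length(buf) do
    if (buf[i] = #13) and (buf[i + 1] = #10) and
       (buf[i + 2] = #13) and (buf[i + 3] = #10) then
    begin
      inc(result); // one full request: the server owes one response
      inc(i, 4);
    end
    else
      inc(i);
end;

const
  REQ = 'GET /plaintext HTTP/1.1'#13#10'Host: 10.0.1.1'#13#10#13#10;
var
  buf: AnsiString;
  n: integer;
begin
  buf := '';
  for n := 1 to 16 do
    buf := buf + REQ; // 16 requests sent in one go, as a pipelining client may do
  writeln(CountPipelinedRequests(buf));
  // a trailing partial request is not counted until its headers are complete
  writeln(CountPipelinedRequests(buf + 'GET /plaintext HTT'));
end.
```

A server that stops after the first terminator in the buffer would answer only one of the sixteen requests, which matches the symptom described above.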

ab · 2024-11-18 15:10:02

I suspect this behavior has always been the case with the async server: pipelined requests were never taken into account, except with an explicit hsoEnablePipelining.

And it seems that the mem problem occurred also outside of the pipeline /plaintext request, right?

danielkuettner · 2024-11-18 21:12:30

@ab Yes mem issue is outside of pipeline.
Today we had such an issue in our program (of course without pipeline):

SOneSrv2: malloc.c:4302: _int_malloc: Assertion `(unsigned long) (size) >= (unsigned long) (nb)' failed

We had no error in Postgres log at all. But we are using cmem.

Last edited by danielkuettner (2024-11-19 07:27:34)

ab · 2024-12-11 10:21:21

I have made some fixes last week, with a potential buffer overflow in CommandUri.
See https://github.com/synopse/mORMot2/comm … bea6b543b7

This was indeed a buggy optimization, just to avoid one string allocation, which may lead to a buffer overflow, especially when an HTTP error is triggered and CommandUri is reused to compute the error message.
TL;DR: never assume how a string variable will be used; you may sub-allocate strings (i.e. reduce their length), but never over-allocate them.
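
A minimal sketch of the "reduce, never over-allocate" rule, assuming a hypothetical JoinUri helper that is unrelated to the real CommandUri code: the string is sized for the worst case first, filled strictly within Length(), then shrunk to what was written.

```pascal
program StringSizingSketch;
{$mode objfpc}{$H+}

// Hypothetical helper: joins path and query with '?', skipping the
// separator when the query is empty.
function JoinUri(const path, query: AnsiString): AnsiString;
var
  used: SizeInt;
begin
  result := '';
  // reserve the worst-case length up front...
  SetLength(result, Length(path) + 1 + Length(query));
  used := 0;
  if path <> '' then
  begin
    Move(path[1], result[1], Length(path));
    used := Length(path);
  end;
  if query <> '' then
  begin
    result[used + 1] := '?';
    Move(query[1], result[used + 2], Length(query));
    inc(used, 1 + Length(query));
  end;
  // ...then shrink to what was really written: every write stayed in bounds
  SetLength(result, used);
end;

begin
  writeln(JoinUri('/plaintext', ''));
  writeln(JoinUri('/asyncqueries', 'queries=20'));
end.
```

The unsafe variant would be to write past Length() of an existing string on the assumption that its buffer has spare room; any later reuse of that variable with a longer content then overruns the heap block.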

@mpv @daniel
Perhaps we could check whether it helps with stability.

mpv · 2024-12-11 17:32:20

I made a TFB PR 9457 with the latest changes (including TMultiLightLock)

ab · 2024-12-11 17:45:31

Oops... there seems to be a regression about async PostgreSQL:

Verification failed: Command '['siege', '-c', '512', '-r', '1', 'http://tfb-server:8080/asyncqueries?queries=20', '-R', '/FrameworkBenchmarks/toolset/databases/.siegerc']' timed out after 20 seconds
   FAIL for http://tfb-server:8080/asyncqueries?queries=2

I suspect it could come from the refactoring done for https://github.com/synopse/mORMot2/issu … 2489112468

mpv · 2024-12-11 18:18:29

Yes, there is a regression. I will fix the PR after you find the reason.

ab · 2024-12-12 10:20:43

Please try with
https://github.com/synopse/mORMot2/commit/ec4c095c

On my PC, I got again the expected results for /asyncqueries

mpv · 2024-12-12 13:10:37

Now it's OK - PR is updated

ab · 2024-12-20 07:35:02

The run did fail.

We will see the reported reason, but I guess the memory problem is still there.
I was convinced the "CommandUri" problem was it. But there is something else.
It seems to be about /plaintext only. All other tests did pass this time.

EDIT:
I have made a lot of changes, trying to clean up the code.
For instance, removing https://github.com/synopse/mORMot2/commit/7e3ce866
But on second thoughts, I am not sure what the root cause is...

EDIT 2:
Previous fixes were not enough. I was able to reproduce a memory corruption by running

$ wrk -c 16384 -t 8 http://localhost:8080/plaintext -d 8 -s pipeline.lua -- 16

EDIT 3:
It sounds like a bug with fpcx64mm and medium blocks, because I can't reproduce it with the libc MM or the FPC heaptrc unit.

mpv · 2024-12-23 12:11:04

Are you planning any more improvements in the next few days? Or should I make an MR with the current sources?

ab · 2024-12-23 13:34:15

I did not find anything new in the last days.

Maybe you can make a PR with today's source, which has no trouble on my side, with the libc MM at least.

mpv · 2024-12-25 12:52:38

PR 9480 with mORMot@2.3.9262 2024-12-23
Merry Christmas!

ab · 2024-12-25 19:29:08

Merry Christmas to you all!
May we all recognize the King of peace.

ab · 2025-01-01 09:23:11

The mORMot /plaintext just failed again, but the PR was not included.
We will see next time.

pvn0 · 2025-01-02 08:50:11

Did it fail with libc mm or fpcx64mm?

ttomas · 2025-01-02 09:04:30

pvn0 wrote:

Did it fail with libc mm or fpcx64mm?

-dFPC_LIBCMM -dFPC_LIBCMM_NOMSIZE

pvn0 · 2025-01-02 19:33:20

I would suggest disabling optimizations and inlining for your next test, there are some wonky compiler bugs currently even in stable.

inlined function does not produce same result as normal function
Bit shift error, depending on optimization level

zen010101 · 2025-01-04 03:57:02

pvn0 wrote:

Bit shift error, depending on optimization level

I'm in a similar situation.

Using the latest version of FPC Trunk 3.3.1-17174-g72a3729ca0 [2024/12/31] for x86_64 to compile mormot2tests, TTestCoreBase.FastStringCompare fails to run. Specifically, the shr code of strspn is incorrectly optimized. Only adding {$OPTIMIZATION OFF} works, but this makes FastStringCompare test twice as fast.

ab · 2025-01-04 09:11:57

The TFB program is compiled with FPC 3.2.3 so is not affected by those FPC trunk (temporary) issues - at least on my machine.
