Because the memory problems are not solved, I tried *without* the `hsoEnablePipelining` option and got a strange result: when the client queries /plaintext in pipelining mode (see https://synopse.info/forum/viewtopic.ph … 769#p38769 for how to call it), the server answers only the first request in each thread. @ab - can you please look at this problem? The correct server behavior is "to respond to pipelined requests correctly, with non-pipelined but valid responses even if server does not support HTTP pipelining."
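For reference, pipelining just means sending several requests back-to-back on one connection before reading any response; it can be reproduced by hand (host/port as in the wrk test case quoted below):

printf 'GET /plaintext HTTP/1.1\r\nHost: 10.0.1.1\r\n\r\nGET /plaintext HTTP/1.1\r\nHost: 10.0.1.1\r\nConnection: close\r\n\r\n' | nc 10.0.1.1 8080

A correct server must return two complete responses here, even if it handles them strictly one by one.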
Perhaps a good way is to put third-party samples in their own repositories and link to them from ex/README.md (many well-known projects do this, calling it a curated list, etc.). In this case, the owner of a third-party repository can change their samples at their own discretion, and the only PR needed for the mORMot repository is a change to ex/README.md
New PR 9350 to TFB with Mormot2@2.3.stable + CMem. Hopefully the fixed TPollAsyncConnection locks structure commit solves the memory problems.
So the problem is not in HTTP pipelining.
Hopefully Daniel gives us a way to reproduce this issue.
Got a log: with FPC_X64MM the /plaintext endpoint crashes with malloc(): unaligned tcache chunk detected - so, IMHO, we have corrupted or overwritten the memory allocator's internal metadata - maybe some code in mORMot with direct memory access writes outside a buffer..
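As a minimal illustration (plain C, not mORMot code) of how a single out-of-bounds write corrupts allocator metadata and produces exactly this kind of abort:

#include <stdlib.h>
#include <string.h>

int main(void)
{
    char *a = malloc(24);
    char *b = malloc(24);
    /* Write past the end of `a`: this smashes the chunk header that
       glibc malloc stores just in front of `b`. */
    memset(a, 0xff, 40);
    free(a);   /* depending on the glibc version, one of these free() calls */
    free(b);   /* aborts with malloc(): invalid next size, a tcache check, etc. */
    return 0;
}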
We have the result for TFB with FPC_X64MM - it still fails for /plaintext; it also fails for /cachedqueries (ORM), and there are very low results for /fortunes. We'll have the log in a few days - maybe we'll see something interesting there...
FYI: the exception is
free(): invalid size
The easy way to use a proxy with curl is to set the http_proxy and https_proxy environment variables.
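For example (the proxy address is just a placeholder):

export http_proxy=http://127.0.0.1:3128
export https_proxy=http://127.0.0.1:3128
curl https://synopse.info/

curl picks both variables up automatically, so no -x option is needed.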
The round including PR 9121 started on 2024-07-01, so hopefully by ~2024-07-04 we will complete plaintext without a GPF.
WOW!!! That's just great!
I will immediately do a PR for TFB.
In the previous run (2nd run with mimalloc) there are new warnings and errors:
mormot: mimalloc: warning: mi_free: pointer might not point to a valid heap region: 0x7c5bc8bd9040
mormot: (this may still be a valid very large allocation (over 64MiB))
mormot: mimalloc: warning: mi_free: pointer might not point to a valid heap region: 0x7c5bc8be1040
mormot: (this may still be a valid very large allocation (over 64MiB))
mormot: mimalloc: warning: mimalloc: error: mimalloc: warning: mi_free: pointer might not point to a valid heap region: 0x7c5bb689a040
mormot: (this may still be a valid very large allocation (over 64MiB))
mormot: mimalloc: error: mi_free: pointer does not point to a valid heap space: 0x7c5bc8be1040
mormot: mi_free: pointer might not point to a valid heap region: 0x7c5bc8bd1040
mormot: (this may still be a valid very large allocation (over 64MiB))
mormot: mi_free: pointer does not point to a valid heap space: 0x7c5bc8bd9040
mormot: mimalloc: error: mi_free: pointer does not point to a valid heap space: 0x7c5bc8bd1040
mormot: mimalloc: warning: mi_free: pointer might not point to a valid heap region: 0x7c5bb68b2040
mormot: (this may still be a valid very large allocation (over 64MiB))
mormot: mimalloc: assertion failed: at "./include/mimalloc-internal.h":467, _mi_segment_page_of
mormot: assertion: "idx < segment->slice_entries"
mormot: mimalloc: assertion failed: at "./include/mimalloc-internal.h":467, _mi_segment_page_of
mormot: assertion: "idx < segment->slice_entries"
mormot: mimalloc: warning: mi_free: pointer might not point to a valid heap region: 0x7c5bb68a2040
mormot: (this may still be a valid very large allocation (over 64MiB))
mormot: mimalloc: warning: mi_free: pointer might not point to a valid heap region: 0x7c5bb68aa040
mormot: (this may still be a valid very large allocation (over 64MiB))
mormot: mimalloc: assertion failed: at "./include/mimalloc-internal.h":467, _mi_segment_page_of
mormot: assertion: "idx < segment->slice_entries"
I also do not see any problem there - maybe this is a false-positive warning. And I can't reproduce the crash in my environment.
On the last run (with mimalloc) the failure message is
mimalloc: assertion failed: at "./src/alloc.c":231, mi_page_usable_size_of
assertion: "ok"
assertion source - https://github.com/microsoft/mimalloc/b … loc.c#L231
P.S. The real assertion error is here - https://github.com/microsoft/mimalloc/b … loc.c#L223
P.P.S.
A test case where we fail (maybe it helps to reproduce the crash on your side):
wrk -H 'Host: 10.0.1.1' -H 'Accept: text/plain,text/html;q=0.9,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7' -H 'Connection: keep-alive' --latency -d 15 -c 256 --timeout 8 -t 56 http://10.0.1.1:8080/plaintext -s pipeline.lua -- 16
The pipeline.lua is here - https://github.com/TechEmpower/Framewor … peline.lua
On the May 29 commit:
- with hsoHeadersInterning the warnings remain the same
- without hsoHeadersInterning only one remains
==1532807== Conditional jump or move depends on uninitialised value(s)
==1532807== at 0x48881F: MORMOT.CORE.BASE_$$_BYTESCANINDEX$PBYTEARRAY$INT64$BYTE$$INT64 (mormot.core.base.asmx64.inc:2091)
==1532807== by 0x6BA141: MORMOT.NET.HTTP$_$THTTPREQUESTCONTEXT_$__$$_PROCESSREAD$TPROCESSPARSELINE$BOOLEAN$$BOOLEAN (mormot.net.http.pas:3693)
About "shutdown may block" - from my experience it not block, but sometimes wait too long
I suspect the "Conditional jump or move depends on uninitialised value(s)" in mormot.net.http.pas:3693 could be our problem - this is the ProcessParseLine call.
These are the valgrind warnings I got on /plaintext with pipelining:
==1515797== Thread 14 R1:8080:
==1515797== Use of uninitialised value of size 8
==1515797== at 0x527D43: MORMOT.CORE.DATA$_$TDYNARRAYHASHER_$__$$_FINDORNEWCOMP$LONGWORD$POINTER$TDYNARRAYSORTCOMPARE$$INT64 (mormot.core.data.pas:9774)
==1515797== by 0x9CC166DE: ???
==1515797== by 0x6D437AF: ???
==1515797== by 0x492B7F: ??? (in /home/pavelmash/dev/mORMot2/ex/techempower-bench/exe/raw)
==1515797==
==1515797== Use of uninitialised value of size 8
==1515797== at 0x527BCF: MORMOT.CORE.DATA$_$TDYNARRAYHASHER_$__$$_FINDORNEW$LONGWORD$POINTER$PPTRINT$$INT64 (mormot.core.data.pas:9718)
==1515797== by 0xDE: ???
==1515797== by 0x9CC166DE: ???
==1515797==
==1515797== Conditional jump or move depends on uninitialised value(s)
==1515797== at 0x5281D5: MORMOT.CORE.DATA$_$TDYNARRAYHASHER_$__$$_FINDBEFOREADD$POINTER$BOOLEAN$LONGWORD$$INT64 (mormot.core.data.pas:9891)
==1515797== by 0xFFFFFFFFFFFFFF1F: ???
==1515797==
==1515797== Use of uninitialised value of size 8
==1515797== at 0x527E88: MORMOT.CORE.DATA$_$TDYNARRAYHASHER_$__$$_HASHADD$LONGWORD$INT64 (mormot.core.data.pas:9815)
==1515797== by 0xAF5034F: ???
==1515797==
==1515797== Conditional jump or move depends on uninitialised value(s)
==1515797== at 0x48881F: MORMOT.CORE.BASE_$$_BYTESCANINDEX$PBYTEARRAY$INT64$BYTE$$INT64 (mormot.core.base.asmx64.inc:2091)
==1515797== by 0x6BA1E1: MORMOT.NET.HTTP$_$THTTPREQUESTCONTEXT_$__$$_PROCESSREAD$TPROCESSPARSELINE$BOOLEAN$$BOOLEAN (mormot.net.http.pas:3693)
==1515797==
==1515797== Thread 104 R1:8080:
==1515797== Use of uninitialised value of size 8
==1515797== at 0x527E88: MORMOT.CORE.DATA$_$TDYNARRAYHASHER_$__$$_HASHADD$LONGWORD$INT64 (mormot.core.data.pas:9815)
==1515797== by 0xAF5198F: ???
.....
Please wait: Shutdown 12 servers and 96 threads
==1515797== Thread 64 R1:8080:
==1515797== Conditional jump or move depends on uninitialised value(s)
==1515797== at 0x48881F: MORMOT.CORE.BASE_$$_BYTESCANINDEX$PBYTEARRAY$INT64$BYTE$$INT64 (mormot.core.base.asmx64.inc:2091)
==1515797== by 0x6BA11C: MORMOT.NET.HTTP$_$THTTPREQUESTCONTEXT_$__$$_PROCESSREAD$TPROCESSPARSELINE$BOOLEAN$$BOOLEAN (mormot.net.http.pas:3672)
==1515797== by 0xAF52E8F: ???
==1515797== by 0x6DEA18: MORMOT.NET.ASYNC$_$THTTPASYNCSERVERCONNECTION_$__$$_FLUSHPIPELINEDWRITE$$TPOLLASYNCSOCKETONREADWRITE (mormot.net.async.pas:4283)
==1515797== by 0x1FFEFFF06F: ???
==1515797== by 0x49247CF: ??? (pthread_create.c:321)
==1515797== by 0x76B61FF: ???
==1515797== by 0xA2D9900: ???
==1515797== by 0x76B61EF: ???
==1515797== by 0x6DED35: MORMOT.NET.ASYNC$_$THTTPASYNCSERVERCONNECTION_$__$$_ONREAD$$TPOLLASYNCSOCKETONREADWRITE (mormot.net.async.pas:4334)
==1515797==
The result with mimalloc is ready. The server crashes again on /plaintext (but runs a little longer).
The previous round crashed with double free or corruption (out); for the round with mimalloc the log is not ready yet - maybe its output gives us some ideas.
But in general, mimalloc is slower than glibc, so I removed it and moved /plaintext to be the last test (currently it runs just before /cached-queries).
I think the current run fails again because of /plaintext (therefore cached_query is not started at all).
BTW, the previous run crashed on /plaintext with the message munmap_chunk(): invalid pointer (the run before that with malloc(): invalid next size (unsorted) )
The PR with mimalloc was just merged; let's see how we run with it in the next round.
I expect that LD_PRELOAD works, because with an invalid path to the library I was getting a warning.
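For the record, the preload itself is a one-liner (the library path depends on the distro):

LD_PRELOAD=/usr/lib/libmimalloc.so ./raw

With a wrong path the dynamic loader prints something like "ERROR: ld.so: object ... cannot be preloaded ... ignored" - that is the warning I mean.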
I still can't run ./raw on my machine, only in docker. I hope to repair it and check valgrind again.
Ahh.. years ago I also tried to catch this bug, without success. I even tried valgrind in memcheck mode, but there are so many warnings (mostly in mORMot crypto) that the root of the evil stays hidden.
For a while I made MR 9036, which replaces the glibc MM with mimalloc.
I suggest we start looking into the valgrind warnings (see https://valgrind.org/docs/manual/mc-manual.html). Maybe if we remove them one by one, we will find the problem. By the way, the heisenbug also exists in mORMot1 - very rarely, once a week under an average load of 1000 RPS, I get an AV in my mORMot1 application in production.
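For a start, something like this (the binary name is from ex/techempower-bench) should reproduce the reports:

valgrind --tool=memcheck --track-origins=yes ./raw

--track-origins=yes makes each "uninitialised value" report also point at where that value was created, at the cost of extra slowdown.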
Yes, the crash is in plaintext for 56 pinned servers with 8 threads each; this is why the ORM cached_queries test is not executed at all.
The error is
malloc(): invalid next size (unsorted)
This time we crashed on plaintext (not on cached queries) at 256 concurrency. See https://tfb-status.techempower.com/unzi … xt/raw.txt
Let's wait for the next round to see whether this is a heisenbug or not.
I also think this is our heisenbug - so I keep everything "as is" for the next round.
Unfortunately, with MR 8949 it seems that the server crashed on /cached-queries (ORM), so plaintext was not executed and we do not appear in the composite score this time. We will see what happened in the text logs after the round ends.
Increasing the number of threads for the asynchronous server from cpuCount * 2 to cpuCount * 4 does not increase the speed of async(nopin) (it becomes slower).
But the /cached_query implementation for the raw server is now #1.
Sorry for the delay. Now it crashes in TRawAsyncServer.rawqueries because an uninitialized PLecuyer is passed to GetRawRandomWorlds.
@ab - maybe we should go back to https://github.com/synopse/mORMot2/commit/d312c00d ? Because these Lecuyer instances, which are now everywhere, have made the code unreadable (and not necessarily faster).
Now it compiles and all tests except update pass, but for /updates (and /rawupdates) the TFB validation does not pass.
FAIL for http://tfb-server:8080/updates?queries=501
Only 1 items were updated in the database out of roughly 500 expected.
See https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Project-Information-Framework-Tests-Overview#specific-test-requirements
PASS for http://tfb-server:8080/updates?queries=20
Executed queries: 10752/10752
See https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Project-Information-Framework-Tests-Overview#specific-test-requirements
PASS for http://tfb-server:8080/updates?queries=20
Rows read: 10635/10240
See https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Project-Information-Framework-Tests-Overview#specific-test-requirements
FAIL for http://tfb-server:8080/updates?queries=20
Only 506 rows updated in the database out of roughly 10240 expected
@ab - can you imagine what could have happened? The methods themselves have not changed... (Sorry, I'm not able to debug at the moment, only to run tests in docker.)
@ttomas - I'll increase the threads 2 -> 4 for [async,nopin]
@ab - please fix raw.pas, because it does not compile (TRawAsyncServer.ComputeRandomWorld is not accessible from inside TAsyncWorld.DoUpdates / Queries) - I'll follow your fix and make an MR.
I'll do an MR with the latest changes..
As for `q=`, I tested this case a year ago and it gave nothing. I'm afraid that even if we do this, we will get a ton of criticism, just as with Server-Name. I'll make it a separate MR (after the MR with the latest changes).
About `using a modulo of the connection ID` - but what if we have 1001 clients - we can't use one connection for both client 1 and client 1001 at the same time (see the snippet below). As far as I remember, I tested an implementation with a per-worker connection (call ThreadSafeConnection once and memorize it in the worker context) and performance is nearly the same as with ThreadSafeConnection.
BTW, currently our raw server creates 448 DB connections (num servers=56, threads per server=8, total threads=448, total CPUs=56, accessible CPUs=56, pinned=TRUE, db=PostgreSQL); maybe I'll increase `threads per server` to 10 to get 560 connections, so each concurrent client will have its own - that might work (after the MR with q=)
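To spell out the modulo collision (pool size and client IDs are hypothetical):

#include <stdio.h>

int main(void)
{
    int pool = 1000;                           /* hypothetical connection pool size */
    printf("%d %d\n", 1 % pool, 1001 % pool);  /* prints "1 1": both clients map to slot 1 */
    return 0;
}

so with more clients than connections, two concurrent clients end up serialized on the same DB connection.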
Verified with 2.2.7351 - now escaping is OK. TFB MR #8883 is ready.
I don't bother with mimalloc either - we'll see how it goes with the current improvements.
@ab - it seems escaping is broken in the latest Mustache implementation.
The valid fortune response should be (with <script> escaped):
<tr><td>11</td><td>&lt;script&gt;alert(&quot;This should not be displayed in a browser alert box.&quot;);&lt;/script&gt;</td></tr>
but the current implementation does not escape the {{ }} template, and the result is
<tr><td>11</td><td><script>alert("This should not be displayed in a browser alert box.");</script></td></tr>
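For context, the fortunes row is rendered by a regular double-stache tag, roughly:

<tr><td>{{id}}</td><td>{{message}}</td></tr>

and per the Mustache spec {{message}} must be HTML-escaped; only the triple {{{message}}} (or {{& message}}) form may emit raw HTML.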
Yes, disabling the sleep should help for the asynchronous server, because according to dstat (it's running now) the processor is idle ~66% (only 40% is used). By the way, for raw, idle is ~44%, which is also too high; for example, in h2o idle is ~20%.
In today's result based on mORMot2.2.stable, our score increased from 18,281 to 19,303; I hope we will be #5.
I'll test the fresh sources on my computer (my test server is in a region with power outages and is unavailable) and do an MR (hopefully today).
One more observation: may-minihttp, ntex, and xitca-web use mimalloc. I also plan to try it in one of the next rounds. I've already tried mimalloc on my server - performance is the same as with the libc allocator, but on the new TFB hardware it might change the numbers a bit.
P.S.
#6 is a good place IMHO
Currently, with 56 CPUs, mormot [async,nopin] (1 process, 112 threads, no pinning) is good for updates, but bad for the other scenarios (compared to async and direct with pinning).
Let's wait for the next run with the latest PR (the current run is based on the same sources as Round 22).
The bad thing is that we don't have a test server that matches the TFB configuration now, so we can't test different thread/server/pin combinations to find the best one.
Thanks a lot! BTW, new binaries can be released with SQLite3 3.45.1 (which introduces JSON and read performance improvements).
To solve the problem described in #211 ( `GLIBC_2.34' not found ), I compile the production version of my product using docker with an FPC image based on Ubuntu 20.04 (GLIBC 2.31).
I found the commit in the sqlite3 build that replaces all pthread->_pthread and dl->_dl symbols and causes issue #211.
To solve my problem I did the backward replacement, and now the linker is happy.
The Linux script that does the replacement is here
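If it helps: one way to do such a backward renaming is objcopy's --redefine-sym (only a few symbols shown; a real script would cover the whole list from the linker errors below):

objcopy --redefine-sym _pthread_mutex_lock=pthread_mutex_lock \
        --redefine-sym _pthread_mutex_unlock=pthread_mutex_unlock \
        --redefine-sym _dlopen=dlopen \
        sqlite3.o sqlite3.fixed.o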
I still do not understand why it compiles in mORMot2.
I'm a little confused, because the same sqlite3.o is used in mormot2.2_stable and the TFB example compiles well. Any help is welcome.
@ab - I'm trying to upgrade SQLite3 in my mORMot1 application from 3.31.0 to 3.44.2 (the latest static version uploaded to the mORMot repository), but I get various linking errors related to pthread and dlopen (sorry for the big fragment) under x86_64-linux (win64 is OK):
(9015) Linking bin/fpc-linux/ub
/usr/bin/ld: libs/Synopse/./static/x86_64-linux/sqlite3.o: in function `sqlite3_strlike':
sqlite3mc.c:(.text+0xc735): undefined reference to `_pthread_mutex_trylock'
/usr/bin/ld: libs/Synopse/./static/x86_64-linux/sqlite3.o: in function `sqlite3_free_filename':
sqlite3mc.c:(.text+0xe345): undefined reference to `_pthread_mutex_destroy'
/usr/bin/ld: libs/Synopse/./static/x86_64-linux/sqlite3.o: in function `sqlite3_create_filename':
sqlite3mc.c:(.text+0x20bbc): undefined reference to `_pthread_create'
/usr/bin/ld: libs/Synopse/./static/x86_64-linux/sqlite3.o: in function `sqlite3_uri_int64':
sqlite3mc.c:(.text+0x2bbc4): undefined reference to `_pthread_mutexattr_init'
/usr/bin/ld: sqlite3mc.c:(.text+0x2bbd1): undefined reference to `_pthread_mutexattr_settype'
/usr/bin/ld: sqlite3mc.c:(.text+0x2bbdc): undefined reference to `_pthread_mutex_init'
/usr/bin/ld: sqlite3mc.c:(.text+0x2bbe4): undefined reference to `_pthread_mutexattr_destroy'
/usr/bin/ld: sqlite3mc.c:(.text+0x2bc30): undefined reference to `_pthread_mutex_init'
/usr/bin/ld: sqlite3mc.c:(.text+0x2c7ce): undefined reference to `_pthread_join'
/usr/bin/ld: libs/Synopse/./static/x86_64-linux/sqlite3.o: in function `sqlite3_load_extension':
sqlite3mc.c:(.text+0x4246e): undefined reference to `_dlerror'
/usr/bin/ld: libs/Synopse/./static/x86_64-linux/sqlite3.o: in function `sqlite3_compileoption_used':
sqlite3mc.c:(.text+0xb804): undefined reference to `_dlclose'
/usr/bin/ld: sqlite3mc.c:(.text+0xb817): undefined reference to `_dlsym'
/usr/bin/ld: sqlite3mc.c:(.text+0xb829): undefined reference to `_dlopen'
/usr/bin/ld: libs/Synopse/./static/x86_64-linux/sqlite3.o: in function `sqlite3_strlike':
sqlite3mc.c:(.text+0xc721): undefined reference to `_pthread_mutex_unlock'
/usr/bin/ld: sqlite3mc.c:(.text+0xc751): undefined reference to `_pthread_mutex_lock'
UB.lpr(56,1) Error: (9013) Error while linking
I tried under Ubuntu 22.04 (GLIBC 2.35) and Ubuntu 20.04 (GLIBC 2.31) with the same results..
I assume that you compile the Linux statics under a newer OS (newer libc) and this is the reason?
There is an update with the new TFB environment https://github.com/TechEmpower/Framewor … 1973835104
56 logical CPUs and a 40Gb network.
Still waiting for test to be resumed....
I doubt what is being measured - a 1024-iteration loop is too small.
I increased the measured loop to
for (i = 0; i < 1024*1024*1024; i++)
and created 2 programs - one that calls the library from C, and one from Pascal (as you provided) - the results are the same:
$ ./forloop_c.out
space 129340 clock_t value -536870912
$ ./forloop_pas
space 129318 clock_t value -536870912
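For reference, the C side of the measurement is essentially this sketch (timed with clock(), as the "clock_t value" output implies; the library call that prints the "space" value is omitted):

#include <stdio.h>
#include <time.h>

int main(void)
{
    clock_t start = clock();
    volatile long i;   /* volatile keeps the compiler from dropping the empty loop */
    for (i = 0; i < 1024L*1024*1024; i++)
        ;
    printf("clock_t value %ld\n", (long)(clock() - start));
    return 0;
}

Note that a real elapsed time can never give a negative clock_t, so the -536870912 printed above already hints that the measurement itself is broken (an overflow or a wrong printf format).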
After almost a year of discussions, our idea for a PostgreSQL pipelining mode improvement has been merged into PostgreSQL upstream. After the Postgres 17 release in September 2024 we will switch to it - for now I do not want to increase the container build time by adding a libpq compilation from sources.
It is not easy here, in Ukraine, but this winter is so far easier than the previous one - we were ready for the russian terror.
As for TFB, let's look at the new HW. It's a shame that our last commit didn't get a run on the old HW - we don't have a baseline for comparison. After one run on the new HW, I plan to switch our test to mormot@2.2 (and possibly change threads/servers). Life goes on..
Good idea
For my part, I'll try to use `mimalloc` in the next run - the top Rust frameworks `ntex` (made by MS, by the way), `may-minihttp`, and `xitca-web` use mimalloc. On my computer mimalloc works a little worse, but on TFB the situation may change.
PR 8612 is ready.
/cached-queries memory issues occur too rarely (once per 10 TFB runs) - I also can't reproduce them in my environment.
As for the performance of single select, I'm afraid that the only way to improve it is to switch the Postgres connection to non-blocking mode (PQsetnonblocking).
But IMHO this requires a huge rework of the framework to become event-driven:
- we need a .NET-like pool of database connections (with minimum and maximum size) that is not based on the thread ID (ThreadSafeConnection), but manages a list of all connections to one database and can provide the user with an unoccupied connection (or block if all connections are busy)
- ideally, the database connection sockets waiting for results should sit in the same epoll as the HTTP sockets, so we need a callback-based event loop backed by epoll (like in libuv) - a complete redesign of the current HTTP server architecture (see the sketch after this list)
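For illustration only, a C sketch of the libpq non-blocking pattern such a redesign would sit on (the DSN and query are placeholders, error handling omitted):

#include <libpq-fe.h>
#include <poll.h>
#include <stdio.h>

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=hello_world");   /* placeholder DSN */
    PQsetnonblocking(conn, 1);
    PQsendQuery(conn, "SELECT id, randomNumber FROM World WHERE id = 42");
    while (PQflush(conn) == 1)   /* a non-blocking send may need several flushes */
        ;                        /* a real loop would wait for POLLOUT here */

    /* In the redesign, this socket would live in the same epoll as the
       HTTP sockets; plain poll() stands in for that event loop here. */
    struct pollfd pfd = { .fd = PQsocket(conn), .events = POLLIN };
    while (PQisBusy(conn)) {
        poll(&pfd, 1, -1);       /* sleep until the result is readable */
        PQconsumeInput(conn);    /* feed libpq without blocking */
    }
    PGresult *res = PQgetResult(conn);
    printf("randomNumber=%s\n", PQgetvalue(res, 0, 1));
    PQclear(res);
    while ((res = PQgetResult(conn)) != NULL)  /* drain until NULL after each query */
        PQclear(res);
    PQfinish(conn);
    return 0;
}

Compile with something like: gcc pq_nb.c -lpq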
Another IMHO:
An event-driven, callback-based architecture is the only choice we have with FPC. But it's a road to callback hell. I know very well what this means, because I worked with callbacks in early JS for 5 years, up until Promises. Any complex logic based on callbacks is hell.
So my suggestion is to stay where we are. After all, we are now in the top 3 for ORM, and our code is production-ready, unlike many others at the top.
Yes, I'll do it this weekend.
Nice finding, for sure. Sometimes Pascal is unsafe....