#1 Re: mORMot Framework » High-performance frameworks » Yesterday 14:27:23

mpv

I found that TFB publishes detailed statistics for completed rounds - see the "details" link above "visualize". Here it is for the last completed round - https://tfb-status.techempower.com/unzi … /mormot/db
The only interesting thing I see is "Total CPU usage".

For example, for cachedQueries it is here - https://tfb-status.techempower.com/unzi … s.txt.json
The first number is the index of the wrk execution from raw.txt (the first two, "Running Primer..." and "Running Warmup", are ignored), so the stats for "cached-queries?count=20" are at #2 - when timestamps

For "cached-queries?count=20" the statistics show

 
total cpu usage	
sys	1.786
stl	0
idl	0
usr	98.214

which is strange...

For db, 38% is idle, which is strange too

#2 Re: mORMot Framework » Strange index sqlite database corruption » 2022-12-07 11:33:16

mpv

Do you mean the data exists in the SQLite table but you can't select it using SQL like `select * from user_table where logonName = 'a_d@ap.com'` ?
Are you sure the equality (=) condition is used when selecting? Because for a `like` condition the underscore `_` wildcard matches any single character
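The difference between `=` and `like` can be checked quickly with a throwaway in-memory database. This is only a minimal sketch using Python's standard sqlite3 module; the table and column names are taken from the query above, while the two extra rows and the helper name select_logons are made up for the illustration.

```python
import sqlite3

def select_logons(where_clause, value):
    """Query a throwaway in-memory table and return the matching logon names."""
    con = sqlite3.connect(":memory:")
    con.execute("create table user_table (logonName text)")
    con.executemany(
        "insert into user_table values (?)",
        [("a_d@ap.com",), ("abd@ap.com",), ("a-d@ap.com",)],  # sample rows (assumed)
    )
    sql = "select logonName from user_table where " + where_clause + " order by logonName"
    rows = con.execute(sql, (value,)).fetchall()
    con.close()
    return [row[0] for row in rows]

# '=' compares the whole string literally
exact = select_logons("logonName = ?", "a_d@ap.com")
# 'like' treats '_' as "any single character", so three rows match
loose = select_logons("logonName like ?", "a_d@ap.com")
# an ESCAPE clause makes the underscore literal again
escaped = select_logons(r"logonName like ? escape '\'", r"a\_d@ap.com")

print(exact, loose, escaped)
```

So if a `like` comparison is involved somewhere, a value containing `_` can match more rows than expected, which may look like a broken index.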

#3 Re: mORMot Framework » High-performance frameworks » 2022-11-29 19:28:50

mpv

I pushed a version with CPU*4 + affinity. I think we need results comparable with the previous tests to see whether affinity really helps or not.
BTW @ttomas posted a link to a similar bench for PG on AWS (https://pganalyze.com/blog/postgres-14- … monitoring), but from my POV and my tests we need at least workers = CPUs * 2, because with workers = 1 * CPUs Postgres will wait at least half the time (while mORMot parses the HTTP request, serializes/deserializes results and sends the response).
I set CPUs * 4 because in the valgrind profiling results for /db the PG part takes ~25% of all work (before the affinity fix)
Let's wait for the PR 7755 results
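One possible way to read this rule of thumb, as a rough sketch only: if a synchronous worker spends a share `db_share` of each request inside the database call, then about `1 / db_share` workers per CPU are needed to keep one database call per CPU in flight on average. The function name and the idealized model (no contention, identical requests) are my assumptions, not something measured in this thread.

```python
import math

def workers_for(cpu_count, db_share):
    """Workers needed so that roughly cpu_count of them are inside the DB call at any moment.

    db_share is the fraction of a request spent in the database part (0 < db_share <= 1).
    """
    if not 0 < db_share <= 1:
        raise ValueError("db_share must be in (0, 1]")
    return cpu_count * math.ceil(1 / db_share)

# half of the time in the DB part -> CPUs * 2
print(workers_for(24, 0.5))
# ~25% in the DB part -> CPUs * 4
print(workers_for(24, 0.25))
```

With the ~25% figure this gives 96 workers for a 24-core host, but real results depend on scheduling and on the database host, so it still has to be verified by a benchmark run.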

#4 Re: mORMot Framework » High-performance frameworks » 2022-11-28 22:17:25

mpv

Now all works as expected. Thanks! From my tests, db-related queries became even a little faster. Can you please do a mORMot release with sqlite 3.40.0, as wanted by the latest sources, because the current 2.0.4148 release contains sqlite 3.39.4?

#5 Re: mORMot Framework » High-performance frameworks » 2022-11-28 19:28:57

mpv

This is how htop looks for 96 threads with affinity masks - https://drive.google.com/file/d/166Op86 … sp=sharing

and this is perf stat

$ perf stat ./raw 96
THttpAsyncServer running on localhost:8080; num thread=96 db=PostgreSQL
 Performance counter stats for './raw 96':

         25588,36 msec task-clock                #    1,676 CPUs utilized          
           472073      context-switches          #   18,449 K/sec                  
                92      cpu-migrations            #    3,595 /sec                   
             1430      page-faults               #   55,885 /sec                   
       3,859381000 seconds user
      22,026711000 seconds sys

I like these statistics

As for me, binding threads to CPUs is OK for TFB, because all threads handle nearly the same type of load at the same time, only our application is running on the host, and so on. But in real life I'm not sure it is a valid approach. Or is it? Maybe add an option for this?

#6 Re: mORMot Framework » High-performance frameworks » 2022-11-28 19:24:25

mpv

I mean not this, but instead of

if CpuSockets > 1 then
 ......

use

for i := 0 to aThreadPoolCount - 1 do
  ok := SetThreadCpuAffinity(fThreads[i], i mod SystemInfo.dwNumberOfProcessors);

This is exactly what h2o did, as far as I understand.
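The same round-robin idea, shown outside Pascal as a small sketch: worker i is mapped to logical CPU i mod N. The helper names are hypothetical; os.sched_setaffinity is a real Python call but exists only on some platforms (Linux in particular), so the pinning helper is guarded and is not called here.

```python
import os

def round_robin_cpus(thread_count, cpu_count):
    """Return the logical CPU index for each worker: worker i goes to CPU i mod cpu_count."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be positive")
    return [i % cpu_count for i in range(thread_count)]

def pin_caller(cpu):
    """Pin the caller to one logical CPU; return False where the call is unavailable."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # platform without this API
    os.sched_setaffinity(0, {cpu})  # 0 means "the caller"
    return True

# 96 workers on a 12-logical-core desktop: every CPU gets 8 workers
plan = round_robin_cpus(96, 12)
print(plan[:13])
```

Each worker would call something like pin_caller(plan[i]) at startup; whether that is a good idea outside a benchmark is exactly the open question in this post.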

After my changes I got the same plaintext ~700k RPS for both 96 and 24 mode!
But I'm not sure it is applicable for real production :(

#7 Re: mORMot Framework » High-performance frameworks » 2022-11-28 19:01:11

mpv

@ab - as far as I understand, you are a little mistaken about pthread_attr_setaffinity_np - it sets the affinity mask for logical cores (including hyperthreaded ones), not for physical CPUs. In other words, the cores are

ls -dl /sys/devices/system/cpu/cpu[0-9]*

My server is off due to electricity (I do not know at all when I can turn it back on, the country is in power-saving mode), but I did most of my prev. tests on my PC with one socket / 6 physical / 12 logical cores (cpu0 - cpu11 for the command above)

And the perf stat cpu-migrations counter is also migrations across cores

#8 Re: mORMot Framework » High-performance frameworks » 2022-11-27 12:13:08

mpv

My observations:
- on my hardware I also DO NOT reproduce the bad performance for "/cached-queries?count=20" - it gives 75 561 RPS vs, for example, 9 535 RPS for "/queries?queries=20"
- using poll gives -10% RPS vs epoll. R0 is on top also. The degradation is the same for 24 vs 96 threads, so the problem is not in poll/epoll

About CPU scheduling - you are right - there is an abnormal increase of cpu-migrations (and HIGH context-switches in both cases, IMHO too) and of user-space CPU utilization for both poll and epoll when we increase the workers count - see the truncated perf output below for 24 vs 96 workers
I think this can be the root of the problem. I found that in the h2o test the author sets the affinity mask for each thread manually - see here
Do you think it can help?

$ perf stat ./raw 24
 Performance counter stats for './raw 24':

         63457,69 msec task-clock                #    4,301 CPUs utilized          
         1739496      context-switches          #   27,412 K/sec                  
             7200      cpu-migrations            #  113,461 /sec                   
             1021      page-faults               #   16,089 /sec              
...     
      13,730222000 seconds user
      50,388775000 seconds sys

 

$ perf stat ./raw 96
 Performance counter stats for './raw 96':

         78413,93 msec task-clock                #    5,293 CPUs utilized          
         3461114      context-switches          #   44,139 K/sec                  
           117782      cpu-migrations            #    1,502 K/sec                  
             1460      page-faults               #   18,619 /sec                   
...
      36,417579000 seconds user
      42,630085000 seconds sys
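To make the two outputs easier to compare, here is a small sketch that computes the 96-vs-24 ratio for each counter. The numbers are copied from the perf stat output above; the dictionary layout and the helper name are only for illustration.

```python
# counters copied from the two "perf stat ./raw N" runs above
runs = {
    24: {"context_switches": 1739496, "cpu_migrations": 7200,
         "user_s": 13.730222, "sys_s": 50.388775},
    96: {"context_switches": 3461114, "cpu_migrations": 117782,
         "user_s": 36.417579, "sys_s": 42.630085},
}

def growth(metric):
    """How many times larger the metric is with 96 workers than with 24."""
    return runs[96][metric] / runs[24][metric]

for metric in runs[24]:
    print(f"{metric:17} x{growth(metric):.2f}")
```

The cpu-migrations counter grows by more than an order of magnitude while context switches only about double, which is in line with looking at thread affinity first.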

#9 Re: mORMot Framework » High-performance frameworks » 2022-11-26 18:25:02

mpv
ab wrote:

look at https://forum.lazarus.freepascal.org/in … #msg461281
Where our little mORMot can do wonders for a more realistic use case than the TechEmpower benchmark. ;)

Nice. It would be good to add mORMot numbers to the project README.md, because there are no Pascal numbers on the main page :(
But TFB is a well-known benchmark, and it would be good to be in the top 20 there IMHO.

I see strange results for /rawqueries in pipelining mode - 636 RPS is very strange to me and I can't reproduce it :(

#10 Re: mORMot Framework » High-performance frameworks » 2022-11-26 18:12:49

mpv

Nice - now I know how to turn on custom thread names in htop :)

Command

wrk -d 10 -c 512 -t 8 "http://192.168.1.6:8080/plaintext"

HTOP
24 th ~ 700k RPS : 24 thread mode  (https://drive.google.com/file/d/1LDU3C1 … sp=sharing)
96 th ~ 400k RPS: 96 thread mode (https://drive.google.com/file/d/1_FmtiQ … sp=sharing)

Sorry for the delay

#11 Re: mORMot Framework » High-performance frameworks » 2022-11-25 10:32:33

mpv

I'm currently w/o electricity, GSM and internet most of the time. I hope we recover our infrastructure soon... In any case, it is better than russian occupation.
As I noted above - all threads use nearly the same portion of CPU (the statement about all load being in one thread was my measurement mistake), but when I increase the workers count, overall CPU usage moves from kernel space (red mark in htop) to user space (green mark)

#12 Re: mORMot Framework » High-performance frameworks » 2022-11-13 17:09:20

mpv

Update to prev. post - my measurement mistake - all threads use nearly the same portion of CPU (25% - 9%). But overall CPU usage moved from kernel space to user space, as noted above

#13 Re: mORMot Framework » High-performance frameworks » 2022-11-13 15:47:43

mpv

About problems with /plaintext endpoint:

My current exploration shows that if I increase the thread pool, CPU consumption for the main thread (in user space, so this is our code - not epoll) increases, and performance decreases.
The content of the HTTP headers doesn't matter; the problem is reproduced without headers (so it's not the header parser) using the command

wrk -d 10 -c 512 -t 8 "http://192.168.1.6:8080/plaintext"

On a 12 CPU desktop
for "./raw 24" the main thread of the 24-thread async server consumes 600% CPU (6 cores), 50% of the overall CPU load is in kernel space, result is 751294 RPS
......
for "./raw 96" the main thread of the 96-thread async server consumes 1000% CPU (10 cores), 90% of the overall CPU load is in USER space, result is 262234 RPS

Unfortunately, under valgrind the problem is not reproduced, because valgrind in instrumentation mode is slow itself...

#14 Re: mORMot Framework » High-performance frameworks » 2022-11-12 12:25:48

mpv

@ab - I removed the SQLite3 code from the TFB tests to prevent confusion like this - https://github.com/TechEmpower/Framewor … ssues/7692
I also increased threads to CPUCount*5 (instead of *4), so it will be 24*5 = 120

#15 Re: mORMot Framework » High-performance frameworks » 2022-11-12 11:52:09

mpv

From my tests

mpv wrote:

See the full result in Calc on Google Drive

In short:
- on the 6/12 Ryzen 5 CPU the best result is for 64 app threads.
- on the 2x6/12 Xeon CPU - for 128 app threads

On 256 threads I have a bad result. I suppose the "reactor" thread (the thread where async HTTP is performed) becomes a bottleneck, but I'm not sure

#16 Re: mORMot Framework » High-performance frameworks » 2022-11-11 11:27:29

mpv

As expected, results for the server with 96 workers are better than with 64 as in the previous round. Below is a /rawdb concurrency comparison

Con        16      32      64      128     256     512
64thRPS    69406   116648  217865  305213  308233  306508
96thRPS    68852   116590  217524  232641  354421  349668
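The table is easier to judge as a relative change per concurrency level. A small sketch with the numbers copied from the table above (variable and function names are just for illustration):

```python
connections = [16, 32, 64, 128, 256, 512]
rps_64 = [69406, 116648, 217865, 305213, 308233, 306508]   # 64 workers
rps_96 = [68852, 116590, 217524, 232641, 354421, 349668]   # 96 workers

def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100

change = {
    con: round(percent_change(old, new), 1)
    for con, old, new in zip(connections, rps_64, rps_96)
}
for con, pct in change.items():
    print(f"{con:4} connections: {pct:+.1f}%")
```

Up to 64 connections the two configurations are practically equal, the gain shows up at 256 and 512 connections, and the 128-connection value is lower for 96 workers in this particular run.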

As a result, at the end of this run (approximately) we gained +20 positions for db and queries, +10 for fortunes:
/rawdb  #51 (instead of #76)
/db       #52 (instead of #77)

/rawfortunes #43 (#52)
/fortunes       #73 (#78)

/rawqueries  #62 (#81)
/queries        #74 (#83)

I hope the pipelining MR will be merged by the TFB team and on the next run we will get much better results for /rawqueries (I expect we will be at least #30 or even better)

#17 Re: mORMot Framework » High-performance frameworks » 2022-11-07 17:54:32

mpv

I made a PR to TFB based on the [d2a2d829] commit. On my local 12-core environment (after editing "query_url": "/rawqueries?queries=" in benchmark_config.json, because I don't know how to run the raw test locally)

./tfb --test mormot --type query --query-levels 20
wrk -H 'Host: tfb-server' -H 'Accept: application/json,text/html;q=0.9,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7' -H 'Connection: keep-alive' --latency -d 15 -c 512 --timeout 8 -t 12 "http://tfb-server:8080/rawqueries?queries=20"

WITH PIPELINING
Requests/sec:  17927.58

W/O PIPELINING
Requests/sec:  10449.76

Today a new TFB pipeline should start with our MR where threads are not limited to 64, and we will see whether it helps with concurrency; I hope the next run will include pipelining. Let's wait. Have a good vacation!

P.S.
In my previous posts I missed the `--query-levels 20` parameter, so the /query endpoint was verified with 1 query; this is why the numbers were so big

#18 Re: mORMot Framework » High-performance frameworks » 2022-11-06 19:24:41

mpv

I implemented SQL pipelining for PostgreSQL in synchronous mode - see MR #127.
For the TFB "/rawqueries?queries=20" endpoint it gives a ~70% boost, for "rawupdates?queries=20" ~10%

When the MR is accepted (it would be good to do a mORMot release after this) I'll do an MR to TFB.

@ab - please take a look; maybe in the future you can add pipelining at the ORM level also?

#19 Re: mORMot Framework » High-performance frameworks » 2022-11-02 11:22:12

mpv

I think it's possible to implement SQL pipelining for Postgres in synchronous mode. It should speed up the `/queries` endpoint. I will do it for raw mode, and maybe even for ORM
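Why pipelining should help an endpoint that runs 20 queries per request can be shown with a very simplified timing model. This is not the libpq or mORMot API, only a sketch: without pipelining every query pays a full network round trip, with pipelining all queries are sent first and the round trip is paid roughly once. The timing values below are hypothetical.

```python
def sequential_ms(query_count, round_trip_ms, exec_ms):
    """Send a query, wait for its result, then send the next one."""
    return query_count * (round_trip_ms + exec_ms)

def pipelined_ms(query_count, round_trip_ms, exec_ms):
    """Send all queries at once, then read all results: one round trip in total."""
    return round_trip_ms + query_count * exec_ms

# hypothetical numbers: 1 ms round trip, 0.5 ms server-side execution per query
print(sequential_ms(20, 1.0, 0.5))  # 20 queries one by one
print(pipelined_ms(20, 1.0, 0.5))   # the same 20 queries pipelined
```

The real gain depends on how large the round trip is compared to the execution time, so on a fast local network it will be much smaller than in this model.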

#20 Re: mORMot Framework » High-performance frameworks » 2022-11-01 11:08:57

mpv
ttomas wrote:

Testing on same HW will produce false thread/workers optimization. We need 2 servers (app and db) with same cpu/cores.

Yes, you are right (moreover, we need 3 servers - the third for wrk), but currently I do not have such hardware.
About "connection=cpu cores" - all frameworks at the top of the TFB rating use HTTP and Postgres pipelining. This IMHO is good for benchmarking or for writing tools like PgBouncer, but in real life it is a big pain (problems with transactions, debugging and so on).
So our approach with a fixed thread pool of synchronous connections is OK. But in this case we need connections > cpu cores. Yes, some connections will be idle in this case (while the app server parses HTTP headers and so on), but (as noted in the article you provided) this is not a big problem for a connection count < 500.

My proposal for now (see my MR) is to remove the 64 worker limitation, so on Citrine (in case there is one CPU there) we spawn 24 cores * 4 = 96 working threads. Let's wait for the next run and see.

And another note - even with the current results, if we filter for ORM type=full in the /fortunes tests, mORMot is #6! #1 is lithium, but it's not a true ORM, as noted by @ab in this post. So actually we are #3, which is PERFECT IMHO.

#21 Re: mORMot Framework » High-performance frameworks » 2022-10-31 20:59:15

mpv

Huge problems with electricity today in Ukraine. Horde is very active on Mondays.
But I tested on 2 environments - the first is
1)  Wrk on Laptop -> 1Gb Ethernet -> DB+App on Ryzen 5 PC (12 thread CPU)
The second is
2) DB+wrk+APP on 2x Intel® Xeon® Processor E5649 (24 thread CPU total)

See the full result in Calc on Google Drive

In short:
- on the 6/12 Ryzen 5 CPU the best result is for 64 app threads.
- on the 2x6/12 Xeon CPU - for 128 app threads

From the TFB Citrine environment description I do not understand whether their Dell R440 is equipped with 2 CPUs or with one (both are possible)
If Citrine has 2 CPUs, we should limit the app server thread pool size to 256, if one - to 128 (instead of 64).

P.S.
On Xeon processors test results are very reproducible, as opposed to a desktop CPU (even with good cooling). On a laptop CPU performance tests are nearly impossible because of throttling.

P.S.2
Does anyone understand how many CPUs are on Citrine? Tomorrow I plan to prepare a PR for TFB with the new thread pool limits and the latest 2.0.4148 mORMot

#22 Re: mORMot Framework » High-performance frameworks » 2022-10-30 09:41:57

mpv

There is definitely a scalability problem. If we look at the "Data table" tab for "Single query", we'll see that we do not scale linearly as the connection count increases.

Con  16     32      64      128     256     512
RPS  69406  116648  217865  305213  308233  306508

If we sort the data table by connection count (I copied it to Calc), then for 16 connections raw mORMot is #10, for 32 - #9, 64 - #22, 128 - #23, 256 - #37, 512 - #46. Nearly the same distribution holds for "orm" mORMot.
While the top frameworks show their best at 512 connections.

So my expectation that on powerful hardware we can get some unexpected results is unfortunately true (I test on a 6 core/12 thread CPU, while the TFB Citrine environment is 2x 14 core/28 thread CPU)
I hope the main problem is that we limit the DB connection pool to 64.

PostgreSQL in the TFB test suite is configured with max_connections = 2000 (default is 100!).
I got temporary access to a server with 24 cores, and (when blackouts allow) will try to play with the pool size.

#23 Re: mORMot Framework » High-performance frameworks » 2022-10-27 13:06:30

mpv

Yes, the tests are running :)
Here in Ukraine we currently have huge problems with electricity - russian terrorists destroy our electric transformers across the country, so we save energy as much as possible. On the positive side - I'm reading paper books again during blackouts :) And we believe in the Armed Forces of Ukraine together with help from all civilized countries of the world.

#24 Re: mORMot Framework » QUIC from google » 2022-10-04 17:51:23

mpv

Implementing QUIC from scratch *correctly* is TOO HARD. I know one open source server-side implementation from M$ (with a C API), but this is a big dependency.
In any case, the benefits of QUIC are not too big in usual scenarios. The scenarios where it shines are networks with small bandwidth and bad stability, where HTTP over TCP is too unstable. But IMHO it is better not to get into situations where it may be necessary

#25 Re: mORMot Framework » Web version of LogView app » 2022-10-03 07:36:58

mpv

A new version of LogView is published at https://unitybase.info/logview/index.html
- added support for high-resolution timestamps encoding
- SQL log level preview is now automatically beautified

BTW, if it is possible, I would suggest structuring the DB/SQL log level output to be more parsable (either by logview or by grep etc.)
Currently the output format is human readable

DB      mormot.db.sql.postgres.TSqlDBPostgresStatement(7f8105ed9620) Prepare 1.98ms cached as 01 select ID,RandomNumber from public.World where ID=?
SQL     mormot.db.sql.postgres.TSqlDBPostgresStatement(7f8105ed9620) ExecutePrepared 2.19ms 01 rows=1 select ID,RandomNumber from public.World where ID=2799

I propose a machine-readable t=microsec c=cache r=rows q=SQL format. Example:

DB      mormot.db.sql.postgres.TSqlDBPostgresStatement(7f8105ed9620) Prepare t=1980 c=01 q=select ID,RandomNumber from public.World where ID=?
SQL     mormot.db.sql.postgres.TSqlDBPostgresStatement(7f8105ed9620) ExecutePrepared t=2190 c=01 r=1 q=select ID,RandomNumber from public.World where ID=2799
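A minimal sketch of how a tool could consume such lines, assuming the layout of the two example lines above: level, source, action, then key=value fields, with q= always last and running to the end of the line. The function name and the returned dictionary keys are hypothetical.

```python
def parse_sql_log_line(line):
    """Split one log line of the proposed format into its fields."""
    head, marker, sql = line.partition(" q=")
    if not marker:
        raise ValueError("no q= field in line")
    parts = head.split()
    level, source, action = parts[0], parts[1], parts[2]
    # the remaining tokens are short key=value pairs such as t=1980 or c=01
    fields = dict(token.split("=", 1) for token in parts[3:])
    return {
        "level": level,
        "source": source,
        "action": action,
        "t": int(fields["t"]),                              # time value as logged
        "cache": fields.get("c"),                           # statement cache slot, if any
        "rows": int(fields["r"]) if "r" in fields else None,
        "sql": sql,
    }

prepare = parse_sql_log_line(
    "DB      mormot.db.sql.postgres.TSqlDBPostgresStatement(7f8105ed9620) "
    "Prepare t=1980 c=01 q=select ID,RandomNumber from public.World where ID=?"
)
execute = parse_sql_log_line(
    "SQL     mormot.db.sql.postgres.TSqlDBPostgresStatement(7f8105ed9620) "
    "ExecutePrepared t=2190 c=01 r=1 q=select ID,RandomNumber from public.World where ID=2799"
)
print(prepare["action"], prepare["t"], execute["rows"])
```

Because q= comes last, the SQL text itself can contain spaces and `=` without confusing the parser, and the same fields are easy to extract with grep.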

Also it would be good to have SQL logged without replacing the parameters by their values - the query plan of a parametrized SQL almost always differs from the one with inlined values.
In UnityBase I write parameters and the parametrized query as such:

Params  	{"P1s39":"doc.main.AddRightsToPositionAssignments"}
SQL     	r=1 t=1821 fr=1817 c=0 q=SELECT A01.ID,A01.type,A01.settingKey,A01.settingValue  FROM ubs_settings A01  WHERE A01.settingKey=:1

where:
- Params = sllCust1 log level in my case
- fr=  is a Time to first row
- q=  is TSQLDBStatement.SQLCurrent

#26 Re: mORMot Framework » Web version of LogView app » 2022-09-30 20:24:56

mpv

I will add high-resolution timer support and formatting of the query in "preview" for the mORMot2 SQL log level tomorrow (almost ready)

#27 mORMot Framework » Web version of LogView app » 2022-09-30 10:54:31

mpv
Replies: 4

I created a web version of the LogView app using VueJS - available here  https://unitybase.info/logview/index.html
Compressed application size is ~1.2 MB, a little bit fat, because I use the UnityBase UI components library and it does not support tree shaking yet (all available components are included)

Current edition implements:
  - opening of local uncompressed files
  - automatic SQL/JSON beautify
  - all filters available in original LogView app
  - statistic
  - method profiling (top 1000 methods by time)

I was wondering how fast and efficient JS is in modern browsers - I tested it with files up to 512 MB and it works pretty fast (at least on my PC; a little faster in Chrome compared to Firefox; not tested in Safari at all). On a 1 GB file parsing fails because it is implemented using String.split

#28 Re: mORMot Framework » High-performance frameworks » 2022-09-04 09:05:22

mpv

On my 12 thread desktop (Ryzen 5 5600G overclocked to 4.2GHz, DDR4-2666 (16) memory) there is no visible difference after the mustache optimizations.
The new test results are a little slower compared to the ones from 2022-08-13, but this is because I added 2 RAM modules and now can't overclock the memory to DDR4-3200:

The best result (as expected) is in -c 12 wrk mode (12 threads)
for ./tfb --benchmark:
- fortunes before mustache opt - 124114 RPS
- fortunes after    mustache opt - 124510 RPS

If I run the server on the host and test manually

wrk -H 'Host: tfb-server' -H 'Accept: application/json,text/html;q=0.9,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7' -H 'Connection: keep-alive' --latency -d 15 -c 128 --timeout 8 -t 12 "http://tfb-server:8080/fortunes"

- fortunes before mustache opt - 126819
- fortunes after    mustache opt - 127688

BTW when Postgres is in a container and the app is on the host, `docker-proxy` becomes a bottleneck (tfb runs all parts in docker-compose, so there is no docker proxy in their case)
I disabled it by forcing docker port forwarding to use iptables - I added { "userland-proxy": false } to /etc/docker/daemon.json, and now the results from the tfb util and from manual test execution are nearly the same.

#29 Re: mORMot Framework » High-performance frameworks » 2022-09-03 08:47:00

mpv

Unfortunately /fortune returns an empty table (without rows) on current master... The data is retrieved correctly, so this is a mustache problem

#30 Re: mORMot Framework » TryLoadLibray PostgreSql lib » 2022-08-25 18:54:18

mpv

For mORMot1 there is a global variable SynDBPostgresLibrary in unit SynDBPostgres - set it to the libpq.dll location (for example SynDBPostgresLibrary := 'D:/PostgreClient/10/x64/bin/libpq.dll') and all should work without playing with PATH.
BTW PostgreSQL on Windows works extremely badly - it is designed for Linux

#31 Re: mORMot Framework » High-performance frameworks » 2022-08-23 07:03:33

mpv

For TFB tests compression is not permitted - see rule ix of the requirements

In mORMot client <-> mORMot server scenarios the proprietary SynLZ is used.
In real-life high-load Web scenarios (mORMot <-> reverse proxy <-> browser) IMHO the preferred compression is Brotli (see comparison with gzip), and it can be enabled at the reverse proxy level

For gzip compression mORMot uses libdeflate - @ab - it would be a good idea to note the sources used to build the static libraries for mORMot in statics/README.md - `/res/static/` is enough.
And inside the /res/static/ library folders - a link to the original sources, because currently it is not clear exactly which implementation is used.

#32 Re: mORMot Framework » High-performance frameworks » 2022-08-22 16:05:06

mpv

We never tested mORMot before on such powerful hardware (Intel Xeon Gold 5120 CPU, 32 GB, enterprise SSD; 3 servers (DB, app and load generator) connected using dedicated Cisco 10-gigabit), so some unexpected things may happen, but I hope everything will be OK.

#33 Re: mORMot Framework » High-performance frameworks » 2022-08-22 16:00:53

mpv

Our TFB pull 7481 is merged into master.
The next tfb-status check starts in ~97 hours, so we will get results after ~225 hours = 2022-09-01
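The date estimate can be reproduced with a few lines; a tiny sketch that takes the timestamp of this post as the starting point (time zone ignored, function name made up):

```python
from datetime import datetime, timedelta

def results_eta(now, hours_until_results):
    """Point in time that lies the given number of hours after now."""
    return now + timedelta(hours=hours_until_results)

posted = datetime(2022, 8, 22, 16, 0, 53)   # timestamp of this post
eta = results_eta(posted, 225)
print(eta.date())
```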

#34 Re: mORMot Framework » High-performance frameworks » 2022-08-13 14:08:39

mpv

Today's state: +10% for fortunes thanks to TSynMustache.RenderDataArray()

Max RPS:
┌──────────────┬────────────┬────────────┬────────────┬────────────┬────────────┬────────────┬────────────┐
│   (index)    │mormot(0720)│mormot(0730)│mormot(0801)│mormot(0802)│mormot(0813)│ drogon     │ lithium    │
├──────────────┼────────────┼────────────┼────────────┼────────────┼────────────┼────────────┼────────────┤
│   fortune    │   74318    │   90500    │   91287    │   113073   │   126055   │   176131   │   90064    │
│  plaintext   │   920198   │   977024   │   986253   │  1436231   │  1373177   │  3583444   │  3388906   │
│      db      │   111119   │   116756   │   117624   │   153009   │   154033   │   176776   │   99463    │
│    update    │   10177    │   10108    │   10981    │   15476    │   15336    │   90230    │   25718    │
│     json     │   422771   │   446284   │   458358   │   590979   │   584294   │   554328   │   544247   │
│    query     │   106665   │   113516   │   114842   │   148187   │   149122   │   171092   │   94638    │
│ cached-query │   384818   │   416903   │   419020   │   547307   │   551230   │            │   528433   │
└──────────────┴────────────┴────────────┴────────────┴────────────┴────────────┴────────────┴────────────┘

#35 Re: mORMot Framework » mORMot2 - json operations » 2022-08-09 06:41:12

mpv

In mORMot2 I use Rtti.RegisterFromText from mormot.core.rtti.pas - here is a live example for an array of records

#36 Re: mORMot Framework » High-performance frameworks » 2022-08-01 17:58:21

mpv

How fast is the new MoveFast in mORMot? So fast that I decided to add TFB #1 drogon for comparison (results are without the latest Postgres improvements)

Max RPS:
┌──────────────┬────────────┬────────────┬────────────┬────────────┬────────────┬────────────┐
│   (index)    │mormot(0720)│mormot(0730)│mormot(0801)│ mormot(mf) │ drogon     │ lithium    │
├──────────────┼────────────┼────────────┼────────────┼────────────┼────────────┼────────────┤
│   fortune    │   74318    │   90500    │   91287    │   113073   │   176131   │   90064    │
│  plaintext   │   920198   │   977024   │   986253   │  1436231   │  3583444   │  3388906   │
│      db      │   111119   │   116756   │   117624   │   153009   │   176776   │   99463    │
│    update    │   10177    │   10108    │   10981    │   15476    │   90230    │   25718    │
│     json     │   422771   │   446284   │   458358   │   590979   │   554328   │   544247   │
│    query     │   106665   │   113516   │   114842   │   148187   │   171092   │   94638    │
│ cached-query │   384818   │   416903   │   419020   │   547307   │            │   528433   │
└──────────────┴────────────┴────────────┴────────────┴────────────┴────────────┴────────────┘

#37 Re: mORMot Framework » High-performance frameworks » 2022-08-01 14:45:23

mpv

I added a new hsoNoStats option.

Also another small (1%) improvement, PR#111:
- our StrLen is twice as fast as PQGetLength: 0.2% vs 0.4% on TFB /db
- prevent unnecessary PQGetIsNull calls - it should be called only for an empty string (to distinguish a null from an empty string result): 0.6%

#38 Re: mORMot Framework » High-performance frameworks » 2022-08-01 10:52:46

mpv

Fresh results

Max RPS:
┌──────────────┬────────────┬────────────┬────────────┬────────────┐
│   (index)    │mormot(0720)│mormot(0730)│mormot(0801)│ lithium    │
├──────────────┼────────────┼────────────┼────────────┼────────────┤
│   fortune    │   74318    │   90500    │   91287    │   90064    │
│  plaintext   │   920198   │   977024   │   986253   │  3388906   │
│      db      │   111119   │   116756   │   117624   │   99463    │
│    update    │   10177    │   10108    │   10981    │   25718    │
│     json     │   422771   │   446284   │   458358   │   544247   │
│    query     │   106665   │   113516   │   114842   │   94638    │
│ cached-query │   384818   │   416903   │   419020   │   528433   │
└──────────────┴────────────┴────────────┴────────────┴────────────┘

We achieved a level of performance at which room temperature changes affect the measurements :) So I upgraded my PC to a middle-tower case with a manual cooler speed switch. During normal work I set it to minimum for silence, and during measurements to maximum to prevent the CPU temperature from growing.

#39 Re: mORMot Framework » High-performance frameworks » 2022-07-31 16:15:59

mpv

Another small improvement of HTTP header parser PR#110 - use SSE PosChar to find line ending

#40 Re: mORMot Framework » High-performance frameworks » 2022-07-30 20:43:06

mpv

A small improvement of the HTTP header parser - PR#109 (hope w/o bugs). +4000 RPS on /json, +300 RPS on /db

#41 Re: mORMot Framework » High-performance frameworks » 2022-07-30 09:36:09

mpv

Nice! In 10 days the results have improved - here is a comparison between mormot from 2022-07-20 and 2022-07-30

Max RPS:
┌──────────────┬────────────┬────────────┬────────────┐
│   (index)    │mormot(0720)│mormot(0730)│ lithium    │
├──────────────┼────────────┼────────────┼────────────┤
│   fortune    │   74318    │   90500    │   90064    │
│  plaintext   │   920198   │   977024   │  3388906   │
│      db      │   111119   │   116756   │   99463    │
│    update    │   10177    │   10108    │   25718    │
│     json     │   422771   │   446284   │   544247   │
│    query     │   106665   │   113516   │   94638    │
│ cached-query │   384818   │   416903   │   528433   │
└──────────────┴────────────┴────────────┴────────────┘

#42 Re: mORMot Framework » High-performance frameworks » 2022-07-29 15:22:41

mpv

I fixed the missing ':' in PR#140. I missed it because the `/db` endpoint actually does not return TOrm JSON, but reformats it using FormatUTF8.
Maybe it is better to introduce a new class method and rewrite `/db` as

ctxt.OutContent := TOrmWorld.RetrieveAsJson(fStore.Orm, RandomWorld);

?

#43 Re: mORMot Framework » High-performance frameworks » 2022-07-29 07:45:49

mpv

With the latest sources a possible optimization target is THttpRequestContext.ParseHeader() (6.6% of time for /db)
The simplest optimization IMHO is to remove 2 unnecessary headers from PARSEDHEADERS (SERVER-INTERNALSTATE and X-POWERED-BY).
A more complex one is to use ideas from picohttpparser - it is described here, starting from slide 31

Also a small +500 RPS /db improvement, PR 104 - avoid FPC string concatenation

#44 Re: mORMot Framework » High-performance frameworks » 2022-07-29 06:23:35

mpv

We passed the verification check on TFB :) Now waiting for the PR to be merged. If this happens by tomorrow, then we will get benchmark results on real hardware during the next intermediate execution (expected to start in 41 hours)

ab wrote:

Warning: it would break the tests if it is run with the latest release tag.
I just included this feature today.

Yes, to test on a specific commit its hash should be pasted here and the line uncommented, and the next line commented out

#46 Re: mORMot Framework » High-performance frameworks » 2022-07-28 17:02:38

mpv

Nice! And it improves /fortunes from 76701 RPS to 82399 :)
Also I removed the unneeded #10 in the mustache template - during a 15 second test our little mORMot responds 1 233 536 times, so these 11 bytes turn into about 13 MB of traffic
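The traffic estimate is plain multiplication; here it is as a tiny sketch with the two numbers from this post (the function name is made up):

```python
def extra_traffic_bytes(bytes_per_response, responses):
    """Total bytes spent on characters that are repeated in every response."""
    return bytes_per_response * responses

total = extra_traffic_bytes(11, 1_233_536)
print(total, "bytes =", round(total / 2**20, 1), "MiB")
```

Small per-response savings like this only become visible at such request rates.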

I will commit all improvements into - https://github.com/pavelmash/FrameworkBenchmarks/pull/1

#47 Re: mORMot Framework » High-performance frameworks » 2022-07-28 09:57:08

mpv

With the latest O(1) changes I've got 94868 RPS vs 93147 RPS before (+2%) for /db and ~490000 vs 485000 RPS for /json. On server hardware the difference should be more visible.
I will continue to investigate perf (today is a crazy day - @#$ russians started launching missiles at 4:00, some of them landing very close to me)

#48 Re: mORMot Framework » High-performance frameworks » 2022-07-27 18:13:36

mpv

After a small profiling of a wrk call with 12 threads and 512 connections

wrk -c 512 -t 12 "http://localhost:8080/db"

using my favorite valgrind, I found that 25% of program time (unexpectedly!) is spent inside FindPendingFromTag, called with n ~350 while the new event count is ~24

It's either a branch predictor problem, or the expectation that O(n*m) is small is not true - in my test it is O(24*350) = 8400
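To illustrate the O(n*m) cost in a generic way (this is not the actual FindPendingFromTag code, which is not shown here): looking up each of m new events by scanning a list of n pending items costs up to n*m comparisons, while a prebuilt hash index needs one lookup per event. All names below are hypothetical.

```python
def count_scan_comparisons(pending, events):
    """Linear search for every event; returns how many comparisons were made."""
    comparisons = 0
    for tag in events:
        for item in pending:
            comparisons += 1
            if item == tag:
                break
    return comparisons

def build_index(pending):
    """Map tag -> position once, so that every later lookup is O(1) on average."""
    return {tag: position for position, tag in enumerate(pending)}

pending = list(range(350))                  # n ~ 350 pending items
missing = [1000 + i for i in range(24)]     # m ~ 24 events, worst case: never found
worst_case = count_scan_comparisons(pending, missing)

index = build_index(pending)
positions = [index.get(tag, -1) for tag in (0, 349, 1000)]
print(worst_case, positions)
```

In the worst case the scan performs 24 * 350 comparisons on every poll cycle, which adds up quickly at several hundred thousand requests per second.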

#49 Re: mORMot Framework » High-performance frameworks » 2022-07-26 21:03:52

mpv

I made an MR based on the 2.0.3780 release with FPCMM_REPORTMEMORYLEAKS defined. Let's wait for approval

#50 Re: mORMot Framework » High-performance frameworks » 2022-07-26 19:44:21

mpv

Ok, packages are evil...
The package defines
-dNOSYNDBZEOS
-dNOSYNDBIBX
-dFPCMM_REPORTMEMORYLEAKS
-dFPCMM_SERVER

But when I compile from the command line I do not define FPCMM_REPORTMEMORYLEAKS, so

{$ifdef FPCMM_REPORTMEMORYLEAKS_EXPERIMENTAL}
var
  ObjectLeaksCount: integer;

But the variable ObjectLeaksCount is used only under the FPCMM_REPORTMEMORYLEAKS_EXPERIMENTAL condition.

Should I wait for a fix, or define FPCMM_REPORTMEMORYLEAKS on the command line?
