#51 2023-02-03 20:59:28

mpv
Member
From: Ukraine
Registered: 2012-03-24
Posts: 1,401
Website

Re: High-Performance Frameworks

With NumTinyBlockArenasPO2 = 7 instead of 6, the result is 327K.
CPU load in user space is ~10% higher than with libc in both cases.

Flags: BOOSTER  assumulthrd smallpools perthrd erms                            
Small:  blocks=3K size=309KB (part of Medium arena)                            
Medium: 60MB/60MB  sleep=15K                                                   
Large:  0B/640KB  sleep=0                                                      
Total Sleep: count=15K                                                         
Small Getmem Sleep: count=4                                                    
288=4                                                                          
Small Blocks since beginning: 239M/29GB (as small=42/46 tiny=1K/2032)          
48=91M  112=38M  80=27M  128=18M  32=14M  96=9M  64=9M  144=4M                 
160=4M  256=4M  416=3M  880=3M  1264=3M  272=2M  1376=485K  960=475K           
Small Blocks current: 3K/309KB                                                 
48=2K  64=427  352=200  32=87  128=79  112=73  80=48  96=21                    
192=14  416=8  576=7  880=7  288=6  160=5  736=5  624=4

#52 2023-02-03 21:05:44

mpv
Member
From: Ukraine
Registered: 2012-03-24
Posts: 1,401
Website

Re: High-Performance Frameworks

Memory usage statistics

//libc
Maximum resident set size (kbytes): 28896
Minor (reclaiming a frame) page faults: 12867
Voluntary context switches: 5888357
Involuntary context switches: 5049

//x64mm (NumTinyBlockArenasPO2 = 7)
Maximum resident set size (kbytes): 124380              
Minor (reclaiming a frame) page faults: 44196          
Voluntary context switches: 5220211                    
Involuntary context switches: 8087
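
To put the two runs side by side, here is a minimal Python sketch that parses both reports and prints the x64mm/libc ratio for each counter. The figures are copied from the numbers above; the helper names (`parse_counters`, `ratios`) are made up for illustration, and the "label: value" layout is assumed to match GNU `time -v` style output.

```python
# Figures copied from the two runs quoted above.
LIBC = """\
Maximum resident set size (kbytes): 28896
Minor (reclaiming a frame) page faults: 12867
Voluntary context switches: 5888357
Involuntary context switches: 5049
"""

X64MM = """\
Maximum resident set size (kbytes): 124380
Minor (reclaiming a frame) page faults: 44196
Voluntary context switches: 5220211
Involuntary context switches: 8087
"""


def parse_counters(report):
    """Map each 'label: value' line to an integer counter (hypothetical helper)."""
    counters = {}
    for line in report.splitlines():
        # Split on the last colon, since the label itself may contain text in parentheses.
        label, sep, value = line.rpartition(":")
        if sep and value.strip().isdigit():
            counters[label.strip()] = int(value)
    return counters


def ratios(baseline, candidate):
    """Return candidate/baseline for every counter present in both reports."""
    return {k: candidate[k] / baseline[k] for k in baseline if k in candidate}


libc = parse_counters(LIBC)
x64mm = parse_counters(X64MM)
comparison = ratios(libc, x64mm)
for label, ratio in comparison.items():
    print(f"{label}: x{ratio:.2f}")
```

With these numbers, the resident set size is roughly 4.3 times the libc one and minor page faults roughly 3.4 times, while voluntary context switches are slightly lower (about 0.89 times).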

#53 2023-02-03 21:32:51

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 13,504
Website

Re: High-Performance Frameworks

Great!

Please try in FPCMM_BOOSTER mode with https://github.com/synopse/mORMot2/commit/412fd883
It now has 128 arenas, and a bigger number of pools to feed from.

But of course, as you detected, it consumes more RAM to initialize its internal pools.
Some memory is wasted in the process when blocks do not remain allocated but go through very quick getmem/freemem cycles (as in this server benchmark).

#54 2023-02-05 12:02:37

mpv
Member
From: Ukraine
Registered: 2012-03-24
Posts: 1,401
Website

Re: High-Performance Frameworks

327K RPS for /fortunes. Memory consumption is higher.

Flags: BOOSTER  assumulthrd smallpools perthrd erms                                
Small:  3K/309KB  including tiny<=256B arenas=128 pools=95                         
Medium: 126MB/126MB  sleep=2K                                                      
Large:  0B/640KB  sleep=0                                                          
Total Sleep: count=2K                                                              
Small Getmem Sleep: count=1                                                        
288=1                                                                              
Small Blocks since beginning: 244M/29GB (as small=42/46 tiny=1K/2032)              
48=93M  112=39M  80=28M  128=18M  32=14M  96=9M  64=9M  160=4M                     
144=4M  256=4M  416=3M  880=3M  1264=3M  272=2M  1376=509K  960=488K               
Small Blocks current: 3K/309KB                                                     
48=2K  64=426  352=200  32=87  128=80  112=73  80=48  96=21                        
192=14  416=8  576=7  880=7  288=6  736=5  672=4  160=4

Maximum resident set size (kbytes): 271852                                         
Minor (reclaiming a frame) page faults: 77196                                      
Voluntary context switches: 5309185                                                
Involuntary context switches: 7768
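
As a rough way to read the "Small Blocks since beginning" histogram, the following Python sketch parses the `size=count` pairs and computes how many of the listed allocations fall at or below the 256-byte tiny limit mentioned in the `Small:` line. This is only an illustration: the helper names are hypothetical, the counts are the rounded values from the dump, and the K/M suffixes are assumed to be decimal (1,000 and 1,000,000).

```python
# Assumption: K/M/G are decimal multipliers; the dump itself does not say.
SUFFIX = {"K": 1_000, "M": 1_000_000, "G": 1_000_000_000}

# "size=count" pairs copied from the "Small Blocks since beginning" lines above.
HISTOGRAM = """\
48=93M  112=39M  80=28M  128=18M  32=14M  96=9M  64=9M  160=4M
144=4M  256=4M  416=3M  880=3M  1264=3M  272=2M  1376=509K  960=488K
"""

TINY_LIMIT = 256  # taken from the "tiny<=256B" note in the Small: line


def parse_count(text):
    """Expand a rounded count such as '93M' or '509K' (hypothetical helper)."""
    if text[-1] in SUFFIX:
        return int(text[:-1]) * SUFFIX[text[-1]]
    return int(text)


def parse_histogram(dump):
    """Return {block_size_in_bytes: allocation_count} from whitespace-separated pairs."""
    pairs = (item.split("=") for item in dump.split())
    return {int(size): parse_count(count) for size, count in pairs}


hist = parse_histogram(HISTOGRAM)
listed = sum(hist.values())
tiny = sum(n for size, n in hist.items() if size <= TINY_LIMIT)
print(f"tiny share of listed allocations: {tiny / listed:.1%}")
```

On these rounded figures, blocks of 256 bytes or less account for roughly 95% of the listed allocations, so the tiny arenas are where almost all of the getmem/freemem traffic of this benchmark lands.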

#55 2023-02-05 13:40:30

mpv
Member
From: Ukraine
Registered: 2012-03-24
Posts: 1,401
Website

Re: High-Performance Frameworks

Using the -O4 optimization level (I never used it before because of the "beware" notes) slightly increases performance (+44K for json, for example) and passes all tests.
I also tried Whole Program Optimization: it decreases the executable size from 5MB to 3MB, but without visible performance changes (compared to -O4).

@ab - what do you think, can we use -O4 for TFB (I'm afraid of accidental crashes)?

#56 2023-02-05 13:47:43

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 13,504
Website

Re: High-Performance Frameworks

Makes sense: only more memory consumed, with no fewer collisions or sleeps.
So I will revert the previous commit to keep the memory lower - it is already more than glibc.
https://github.com/synopse/mORMot2/commit/19bcf72c

And for the TFB benchmarks, we would rather use the glibc MM.

I never tested -O4, and I doubt there is any benefit in using it.
