While profiling one of my apps I found that the IsValidUTF8 function is slow for my use case (big input data, up to 100 MB). Looking around, I found an alternative based on vector instructions that is about 50 times faster: there is an algorithm description and a production-ready implementation as part of the simdjson library.
Hard to believe, but it is nearly 50 times faster than SynCommons.IsValidUTF8, which is well optimized IMHO, and even faster than a dummy loop (due to CPU branching on the condition of each loop iteration, I think).
Iterating 3588 times over 2991876 bytes string (~10Gb)...
Validate PAS to TRUE 10 GB in 11.25s i.e. 909.4 MB/s
Validate SIMD to TRUE 10 GB in 271.06ms i.e. 36.8 GB/s
Dummy while loop 10 GB in 18.53s i.e. 552.2 MB/s
4 iter while loop 10 GB in 14.37s i.e. 712.3 MB/s
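For reference, the figures above can be re-derived from the iteration count and the string size. This is only a small sketch: the helper names are made up for illustration, and it assumes the benchmark reports sizes in binary units (1 GB = 1024^3 bytes).

```pascal
program ThroughputSketch;
{$mode objfpc}{$H+}

// Hypothetical helper: total number of bytes processed by the benchmark
function TotalBytes(Iterations, Size: Int64): Int64;
begin
  Result := Iterations * Size;
end;

// Hypothetical helper: throughput in GB/s, assuming 1 GB = 1024^3 bytes
function GBPerSecond(ByteCount: Int64; Seconds: Double): Double;
begin
  Result := ByteCount / (1024.0 * 1024.0 * 1024.0) / Seconds;
end;

begin
  // 3588 iterations over a 2991876 bytes string
  WriteLn(TotalBytes(3588, 2991876));
  // SIMD run: 271.06ms
  WriteLn(GBPerSecond(TotalBytes(3588, 2991876), 0.27106):0:1);
  // speed ratio between the PAS run (11.25s) and the SIMD run
  WriteLn((11.25 / 0.27106):0:1);
end.
```

On these exact timings the PAS/SIMD ratio works out to roughly 41x, so "x50" should be read as a rounded figure.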
Disadvantage: it requires linking to stdc++.
The test project and a C wrapper for the C++ simdjson code are available on GitHub: see simdjson_pas
Offline
I remember SIMDJson, it was hot on hacker news: https://hn.algolia.com/?q=simdjson
And this project by mpv is a really simple example of wrapping a C++ project into a C DLL which is then used from Pascal. Thanks for sharing!
Delphi XE4 Pro on Windows 7 64bit.
Lazarus trunk built with fpcupdeluxe on Windows with cross-compile for Linux 64bit.
Offline
I guess the "while loops" in your test programs are slow because they are written in the main begin..end. block, so their variables are global variables, which are never kept in registers.
I suppose that if you move them into a sub-function, the "while loops" would be faster.
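To illustrate the point, here is a minimal sketch of a byte loop moved into its own function. CountHighBytes is a hypothetical helper, not the code of the actual test program: the only idea shown is that i and Result are local, so the compiler is free to keep them in CPU registers.

```pascal
program LocalLoopSketch;
{$mode objfpc}{$H+}

// Hypothetical helper: counts the bytes >= $80 in a buffer.
// All loop state lives in local variables, not in globals.
function CountHighBytes(P: PByte; Len: PtrInt): PtrInt;
var
  i: PtrInt;
begin
  Result := 0;
  for i := 0 to Len - 1 do
    if P[i] >= $80 then
      Inc(Result);
end;

const
  // 'a', then U+00E9 encoded as $C3 $A9, then 'b' and 'c'
  Data: array[0..4] of Byte = ($61, $C3, $A9, $62, $63);
begin
  WriteLn(CountHighBytes(@Data[0], Length(Data))); // prints 2
end.
```

The main block only calls the function, so its own globals no longer sit in the hot loop.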
Yes, simdjson is really impressive.
But it supports only strict JSON, whereas mORMot is able to understand MongoDB extensions like unquoted field names (which is the default JSON layout between mORMot clients and server).
According to TTestCoreProcess.JSONBenchmark, mORMot 2 JSON processing speed is pretty decent for a pure pascal JSON parser - probably the fastest on Delphi/FPC:
- JSON benchmark: 500,904 assertions passed 2.38s
IsValidUtf8() in 77.25ms, 1.2 GB/s
IsValidJson(RawUtf8) in 118.63ms, 826.3 MB/s
IsValidJson(PUtf8Char) in 117.26ms, 835.9 MB/s
JsonArrayCount(P) in 111.02ms, 882.9 MB/s
JsonArrayCount(P,PMax) in 107.74ms, 909.8 MB/s
JsonObjectPropCount() in 45.96ms, 1.2 GB/s
TDocVariant in 661.96ms, 148 MB/s
TDocVariant dvoInternNames in 805.23ms, 121.7 MB/s
TOrmTableJson GetJsonValues in 22.94ms, 375.9 MB/s
TOrmTableJson expanded in 38.82ms, 505 MB/s
TOrmTableJson not expanded in 21.54ms, 400.3 MB/s
The TOrmTableJson parser is the one used by our ORM and reaches 500 MB/s which is pretty good in practice - especially for the not expanded mode which is much less verbose, so here the same data is read in 21ms instead of 38ms.
Also its JSON serializing abilities are good: more than 370 MB/s when writing via GetJsonValues() in non-expanded mode.
As a reference, here are the same tests run with mORMot 1.18:
- JSON benchmark: 100,203 assertions passed 635.21ms
IsValidUtf8() in 20.94ms, 0.9 GB/s
IsValidJson(RawUtf8) in 25.40ms, 771.7 MB/s
IsValidJson(PUtf8Char) in 27.40ms, 715.3 MB/s
JsonArrayCount(P) in 21.67ms, 904.7 MB/s
JsonArrayCount(P,PMax) in 21.54ms, 910 MB/s
JsonObjectPropCount() in 10.77ms, 1 GB/s
TDocVariant in 171.43ms, 114.3 MB/s
TDocVariant dvoInternNames in 229.24ms, 85.5 MB/s
TSqlTableJson GetJsonValues in 26.60ms, 324.1 MB/s
TSqlTableJson expanded in 46.30ms, 423.3 MB/s
TSqlTableJson not expanded in 25.16ms, 342.6 MB/s
So I guess the work has been good with mORMot 2.
Offline
The validate_utf8 function is lightweight and needs no extra memory when using SIMD. More info: https://lemire.me/blog/2018/05/16/valid … -per-byte/
But as far as JSON parsing is concerned, the mORMot SAX approach uses no memory, whereas simdjson builds DOM-like information that takes much more memory (nearly 5x, I think).
Offline
To be clear: I'm fine with the mORMot JSON parser, it is perfect and fits all my needs in terms of memory and performance. The only function I use from simdjson is validate_utf8, as a replacement for IsValidUTF8.
Last edited by mpv (2021-07-25 19:11:40)
Offline
I think you may get better results by rewriting it, which would also avoid the dependency.
Offline
Please check https://github.com/synopse/mORMot2/comm … 60317d3097
It is enabled only on Haswell level CPUs with AVX2 + BMI + SSE 4.2.
And on FPC only, since Delphi has no proper AVX2 asm support - even the latest version.
Numbers are very good.
I have also enhanced the pascal version, which is faster than before too.
Also the regression tests now validate that invalid UTF-8 is detected at any position in the input text.
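Such an "invalid at any position" check could be sketched as follows. This is not the actual mORMot regression test: SketchIsValidUtf8 is a hypothetical plain scalar validator standing in for the function under test, and the test simply overwrites each byte of a valid buffer with $FF (which can never appear in UTF-8) and expects a rejection every time.

```pascal
program Utf8AnyPositionSketch;
{$mode objfpc}{$H+}

// Hypothetical scalar validator, standing in for the function under test
function SketchIsValidUtf8(P: PByte; Len: PtrInt): Boolean;
var
  i, n, k: PtrInt;
  c, lo, hi: Byte;
begin
  Result := False;
  i := 0;
  while i < Len do
  begin
    c := P[i];
    lo := $80; // allowed range of the first continuation byte
    hi := $BF;
    case c of
      $00..$7F: n := 0;
      $C2..$DF: n := 1;
      $E0: begin n := 2; lo := $A0; end; // no overlong 3-byte forms
      $E1..$EC, $EE..$EF: n := 2;
      $ED: begin n := 2; hi := $9F; end; // no UTF-16 surrogates
      $F0: begin n := 3; lo := $90; end; // no overlong 4-byte forms
      $F1..$F3: n := 3;
      $F4: begin n := 3; hi := $8F; end; // nothing above U+10FFFF
    else
      Exit; // $80..$C1 and $F5..$FF never start a sequence
    end;
    if i + n >= Len then
      Exit; // truncated sequence
    for k := 1 to n do
    begin
      c := P[i + k];
      if (c < lo) or (c > hi) then
        Exit;
      lo := $80; // following continuation bytes use the full range
      hi := $BF;
    end;
    Inc(i, n + 1);
  end;
  Result := True;
end;

var
  buf: array[0..63] of Byte;
  i, failures: Integer;
  saved: Byte;
begin
  // valid input: ASCII 'a' everywhere, plus U+00E9 ($C3 $A9) in the middle
  FillChar(buf, SizeOf(buf), Ord('a'));
  buf[30] := $C3;
  buf[31] := $A9;
  failures := 0;
  if not SketchIsValidUtf8(@buf[0], Length(buf)) then
    Inc(failures);
  // corrupt each position in turn: every variant must be rejected
  for i := 0 to High(buf) do
  begin
    saved := buf[i];
    buf[i] := $FF;
    if SketchIsValidUtf8(@buf[0], Length(buf)) then
      Inc(failures);
    buf[i] := saved;
  end;
  WriteLn('failures: ', failures);
end.
```

For a SIMD implementation the buffer would need to be longer than one vector block, so that the corrupted byte also lands in the vectorized part and not only in the scalar tail.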
Offline
Very nice.
But I cannot get it working with the latest Trunk FPC on Win10 X64. IsValidUtf8Avx2 crashes.
The mORMot tests fail on this function too.
With the AVX version disabled, the Pas version runs at 3 GB/s for me on an i9.
Unrelated note: In TDynArrayHasher.FindOrNew, checking Assigned(Compare) fails on FPC trunk. Previous versions work without problem.
Last edited by okoba (2021-07-26 15:04:22)
Offline
Don't use FPC trunk. We don't support it.
It is too unstable.
Edit: are you using Win64?
I guess there is a problem with the Win64 ABI for now. I validated it only on Linux x86_64.
I will try to fix it.
Edit 2: I confirm the code is not Win64 compatible.
I have enabled this AVX2 code for x86_64 POSIX only - which matches the main use case of a production server.
Offline
Thanks for the update.
About Trunk: it lets me try mORMot code with the latest FPC updates, as I use mORMot a lot for daily tasks like arrays, dictionaries, Unicode, files and JSON. Until the latest update to TDynArrayHasher it worked just fine and passed all the tests, so it may be a good idea to keep it running and test the latest things.
That is the case for V2. I agree with you about V1 of mORMot, as it was always problematic to use with Trunk. But your updates to V2 made it very compatible and comfortable to use.
Offline
Thanks for the update. Checked it and it works near 20GB/s! Great!
About TDynArrayHasher, it still does not work for FPC even with the new parentheses; it seems an @ is needed.
I tried to reproduce the problem in a minimal new project with the mORMot defines included, and it compiles correctly there: the problem only happens in the unit. Sorry for the trouble, but I think it is worth being able to compile with the latest version of FPC.
Offline
I have added some new benchmarks, and also optimized the mORMot 2 JSON parser even further.
mORMot 2 JSON parsing performance seems really high - several orders of magnitude faster than the fastest Delphi/FPC libraries which are dwsJSON and JsonDataObjects.
Perhaps I will write a blog article about those numbers.
Edit: the initial numbers were incorrect.
I have fixed TTestCoreProcess.JSONBenchmark and published some new numbers:
https://github.com/synopse/mORMot2/comm … 3ed789fb46
Don't worry, mORMot 2 is still way ahead.
Offline
Great!
Yes, mORMot is very fast, and a blog post updating the previous ones seems nice. May I suggest having an independent demo that clearly shows the benefits and speed? It helps to answer questions like this topic.
Also it may be a good idea to add JsonTools (from this topic) too as it is very clean and seems fast.
Topic: https://forum.lazarus.freepascal.org/in … opic=46533
PS, can you please fix the TDynArrayHasher compile issue with trunk? It helps someone like me to keep up with your fast updates while maintaining daily code.
Last edited by okoba (2021-07-27 20:25:52)
Offline
Thank you for the updates.
Offline
Great work!! Many thanks!
I adapted the new IsValidUTF8 function for mORMot1 (I'm still on mORMot1) - https://github.com/synopse/mORMot/pull/400
Offline
They are at the source level:
https://github.com/synopse/mORMot/pull/400/files
1) I would rather put this into SynTable: SynCommons is already huge... too huge...
2) Reuse the pascal function in the ASMX64AVX branch, which is currently broken so I can't merge it as such.
I could do it on my side, if you prefer, once you have validated on your side that it works as expected and can be substituted for the external .so library.
Offline
OK, then I will merge #400 into the SyNodeCleanup branch, which I am using to build UnityBase, and deploy this version on my testing environment to confirm everything works as expected. Currently my autotests pass, but on Monday the testing team starts working with real use cases, and we will make sure everything is OK. If so, I will ask you to do the necessary changes SynCommons -> SynTable.
Offline
I have made https://synopse.info/fossil/info/ab50456505 as an official port in SynTable unit.
Hope it helps.
Offline
I found some strange behavior in the new ASM code - it is reproduced only under Windows x64, and only when compiled with -O2 or higher (FPC 3.2.0).
After a call to IsValidUTF8 in scenarios like
  procedure test(const aStr: RawByteString);
  begin
    if not IsValidUTF8(pointer(aStr), length(aStr)) then
      ..
  end;
aStr becomes nil.
It is not reproduced under Linux x64 with any optimization level (we already use it in production under Linux).
@ab - maybe you have some idea why this happens, because I can't understand it yet...
P.S.
In real life the string becomes nil after this line - https://github.com/synopse/mORMot/blob/ … te.pas#L42
Last edited by mpv (2021-09-14 15:10:51)
Offline
IsValidUtf8Avx2() was indeed not Win64 compatible.
Please check https://synopse.info/fossil/info/7db5063372
- also fixed on mORMot 2
Offline
You are right: WIN64ABI is a mORMot 2 specific conditional.
We just need to replace it with MSWINDOWS or WIN64.
Should be fixed now https://synopse.info/fossil/info/962c8e03cc
Offline
Now the problem is solved - thank you very much!
I found that my CI server uses a Xeon E5-2640 v2 CPU, which does not support AVX2 (E5 v2 has only AVX, E5 v3 has AVX2) - this is the reason why I caught this error so late.
Offline