#1 2018-12-03 10:33:00

mpv
Member
From: Ukraine
Registered: 2012-03-24
Posts: 1,534

My findings in mORMot while analysing code using valgrind (unix)

While hunting a very hard-to-reproduce issue in my code I started using valgrind, a dynamic code analysis tool.

To run a mORMot-based program under valgrind we need to:

1) disable the HASAESNI define in Synopse.inc, because valgrind uses its own CPU emulator
2) add the -gv option to the FPC compiler (can be done in the Project-Options-Debugging dialog in Lazarus)
3) compile the project with debugging info and the -O1 optimization level

Then run the test project:

valgrind ./TestSQL3
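
As a rough sketch, the three steps and the run command above could be put into one shell script. The project file name TestSQL3.dpr is an assumption (a Lazarus build would use the project file instead), a real mORMot build will normally also need -Fu unit search paths, and --track-origins=yes is an extra memcheck option that is not part of the original command. HASAESNI still has to be disabled by hand in Synopse.inc.

```shell
#!/bin/sh
# Sketch only: build a test project with valgrind-friendly options, then run it.
# Assumptions: TestSQL3.dpr is the project source (hypothetical file name) and
# HASAESNI has already been disabled in Synopse.inc.
SRC="TestSQL3.dpr"
BIN="./TestSQL3"

# -gv: debug info usable by valgrind, -O1: quick, debugger-friendly optimizations
FPC_FLAGS="-gv -O1"

BUILD="fpc $FPC_FLAGS $SRC"
# --track-origins=yes asks memcheck to report where an uninitialised value came from
RUN="valgrind --track-origins=yes $BIN"

if command -v fpc >/dev/null 2>&1 && command -v valgrind >/dev/null 2>&1 && [ -f "$SRC" ]; then
  $BUILD && $RUN
else
  # Tools or source are not available here: just show what would be executed
  printf 'would run: %s\n' "$BUILD" "$RUN"
fi
```

Tracking origins slows the run down further, so it can be dropped once the source of a warning is known.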

So far I have found several issues:

- one in SynCommons.UnCamelCase, already fixed - see pull request #159

The others are in SynCrypto:

- on a non-Intel CPU, or if the CPU does not support RAND (as is the case with the valgrind CPU emulator), FillRandom uses a

 threadvar _Lecuyer: TLecuyer; // uses only 16 bytes per thread 

But the internal content of TLecuyer is not initialized by the compiler (TLecuyer.rs1 and TLecuyer.seedcount contain random values).

I do not know how to initialize the internal content of a threadvar - it seems this part of the code needs to be rewritten.

- aesencryptx64 reports many "Use of uninitialised value" warnings in its asm code. I also have no idea how to fix this.

#2 2018-12-03 12:40:33

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,182

Nice findings!
I have merged your pull request.

Under Windows at least, a threadvar is filled with #0, so word(seedcount)=0 in TLecuyer.Next.
Isn't that the case on Linux?
As a fallback, using a global TLecuyer with a lock may be the easiest fix.

#3 2018-12-03 13:26:31

edwinsn
Member
Registered: 2010-07-02
Posts: 1,215

This is the first time I have heard about valgrind. What an amazing tool, and what amazing work you've done!


Delphi XE4 Pro on Windows 7 64bit.
Lazarus trunk built with fpcupdeluxe on Windows with cross-compile for Linux 64bit.

#4 2018-12-03 15:54:53

mpv

Hmm... In simple tests with a threadvar the problem does not reproduce, so maybe this is not a threadvar problem.
But if I use the mORMot unit (with HASAESNI undefined), then the project below

program Project1;
uses mORMot;
begin
  WriteLn(CurrentServerNonce(false));
end.

gives this warning:

==20540== Conditional jump or move depends on uninitialised value(s)
==20540==    at 0x5757A3: SYNCOMMONS$_$TLECUYER_$__$$_SEED$PBYTEARRAY$LONGINT (SynCommons.pas:37116)
==20540==    by 0x575834: SYNCOMMONS$_$TLECUYER_$__$$_NEXT$$LONGWORD (SynCommons.pas:37125)
==20540==    by 0x575B4C: SYNCOMMONS_$$_FILLRANDOM$PCARDINALARRAY$LONGINT$BOOLEAN (SynCommons.pas:37186)
==20540==    by 0x60653F: SYNCRYPTO$_$TAESPRNG_$_GETENTROPY$LONGINT$BOOLEAN$$RAWBYTESTRING_$$_SHA3UPDATE (SynCrypto.pas:13649)
==20540==    by 0x60639F: SYNCRYPTO$_$TAESPRNG_$__$$_GETENTROPY$LONGINT$BOOLEAN$$RAWBYTESTRING (SynCrypto.pas:13667)
==20540==    by 0x606635: SYNCRYPTO$_$TAESPRNG_$__$$_SEED (SynCrypto.pas:13696)
==20540==    by 0x605F98: SYNCRYPTO$_$TAESPRNG_$__$$_CREATE$LONGINT$LONGINT$LONGINT$$TAESPRNG (SynCrypto.pas:13585)
==20540==    by 0x606DAE: SYNCRYPTO_$$_SETMAINAESPRNG (SynCrypto.pas:13857)
==20540==    by 0x487FD5: MORMOT_$$_CURRENTSERVERNONCE$BOOLEAN$$RAWUTF8 (mORMot.pas:43255)
==20540==    by 0x400E82: main (project1.lpr:6)

Further investigation is required.

#5 2018-12-03 17:48:20

ab

IMHO the threadvar is initialized to 0 by design, at least in FPC.
See e.g. the CAllocateThreadVars function in FPC cthreads.pp.

#6 2018-12-09 11:35:58

mpv

This weekend I continued to discover the power of valgrind, namely its profiler part. This is BRILLIANT software that I had been looking for for a very long time. Integration with FPC is perfect. Below is a screenshot of the callgrind profiler output shown by the kcachegrind visualization tool (take a look at the Pascal source code with timings).

[Image: kcachegrind screen]

In just one day I sped up some parts of my code by up to 10x.

For example, in SyNode I found a very unexpected bottleneck - see pull request #164 (it can be merged into master).

The FPC wiki is slightly outdated. To start/stop profiling, instead of

 valgrind_control -i on / off

the command should be

callgrind_control -i on / off
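
One possible way to use this, sketched as a shell script: start the program with callgrind instrumentation switched off, turn it on only around the part of interest, and switch it off again afterwards. The binary name and both sleep delays are assumptions for illustration only. In practice the callgrind_control commands are usually typed by hand in a second terminal while the program is running.

```shell
#!/bin/sh
# Sketch only: collect callgrind data for a limited time window.
BIN="./TestSQL3"   # assumption: the program to profile

# --instr-atstart=no: start with instrumentation off so startup is not measured
PROFILE="valgrind --tool=callgrind --instr-atstart=no $BIN"
START="callgrind_control -i on"
STOP="callgrind_control -i off"

if command -v valgrind >/dev/null 2>&1 && [ -x "$BIN" ]; then
  $PROFILE &
  sleep 5      # arbitrary delay: wait until the interesting code is about to run
  $START
  sleep 10     # arbitrary delay: length of the measured window
  $STOP
  wait         # let the program finish; callgrind writes callgrind.out.<pid>
else
  # valgrind or the binary is not available here: just show the commands
  printf 'would run: %s\n' "$PROFILE" "$START" "$STOP"
fi
```

The resulting callgrind.out.<pid> file can then be opened in kcachegrind.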

Happy profiling!

Last edited by mpv (2018-12-09 11:38:14)
