Note that there is a Unicode comparison embedded in mORMot 2.
]]>Depends: libicu60 | libicu66
- rpm/SPECS (CentOS, RHEL 7 / 8, OEL 7 / 8)
Requires: libicu >= 60
And cwstring is no longer needed, even for the regression tests.
There is no ICU widestringmanager yet, but I guess it is not mandatory for now.
]]>But what I will do today is implement ICU, also for mORMot 1.18.
Also implement the FPC widestringmanager using ICU instead of iconv.
ICU is much faster and easier to use.
{$ifdef MSWINDOWS}
{$I SynDprUses.inc}
{$else}
// avoid using cwstring: LC_NUMERIC problem https://synopse.info/forum/viewtopic.php?pid=33631#p33631
cthreads,
SynFPCCMemAligned,
{$endif}
So we only need to modify the mORMot 2 code...
]]>I will see if I can call iconv() directly, with no dependency on this unit. There is some code for Kylix that calls the libc, and I guess it could be enough for our purpose.
But I remember that ICU is much faster than iconv(). Even .NET 5 is using ICU on Windows now! https://docs.microsoft.com/en-us/dotnet … zation-icu
So perhaps ICU is worth looking into as an alternative for mORMot 2.
I will try to implement the following:
1) set a conditional for cwstring in SynDprUses.inc
2) on FPC/POSIX, use the cwstring API if it is available (i.e. widestringmanager has been set); otherwise, call iconv directly.
3) eventually, for mORMot2, look into ICU.
This UTF-16 conversion code is very unlikely to be called during UTF-8 processing, unless you explicitly use non-fixed-width charsets (like Chinese or Japanese).
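To make step 2 more concrete, here is a rough sketch in C of what "calling iconv directly" looks like at the libc level. It is only an illustration and not mORMot code: the function name utf8_to_utf16le and the fixed output buffer are my own assumptions, and a Pascal version would bind the same three libc calls (iconv_open, iconv, iconv_close). On glibc these calls are part of libc itself; some other systems need an extra -liconv at link time.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: converts UTF-8 input to UTF-16LE by calling iconv()
   directly, without going through any widestring manager.
   Returns the number of bytes written to out, or (size_t)-1 on failure. */
size_t utf8_to_utf16le(const char *in, size_t in_len, char *out, size_t out_cap)
{
    /* iconv_open() takes the target encoding first, then the source one */
    iconv_t cd = iconv_open("UTF-16LE", "UTF-8");
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *src = (char *)in;  /* iconv() wants char **, the input bytes are not modified */
    char *dst = out;
    size_t src_left = in_len;
    size_t dst_left = out_cap;

    /* a single call converts the whole input, or fails (invalid sequence,
       truncated sequence, or output buffer too small) */
    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;
    return out_cap - dst_left;
}
```

Note that nothing here calls setlocale(), so the process locale is left untouched. A real implementation would probably cache the iconv_t descriptor instead of opening one per call.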
]]>The cwstring.pp initialization calls
setlocale(LC_ALL,'');
This forces the program to read its locale settings from the environment variables (LC_ALL and family). So all libc functions and all C libraries loaded by the program start using these locale settings.
Sometimes this is not the behavior I expect.
For example, I use a third-party C library that internally uses
sprintf(buffer, "%f", sample->r_value);
to write decimal values to a buffer.
If I add SynDprUses.inc (which pulls in cwstring) to the uses section of my program, the cwstring initialization sets LC_NUMERIC so that a comma is used as the decimal separator (my laptop is configured with a Ukrainian locale), and I get unexpected numbers in the text file.
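The mechanism can be sketched in plain C. This is only an illustration, not what cwstring or the third-party library actually contains: the function names format_sample and init_locale_keep_c_numeric are made up for the example. The first function formats a value the way the library does, and the second shows one possible way to adopt the environment locale while pinning LC_NUMERIC back to "C", so that "%f" keeps using a dot.

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical stand-in for the third-party library: "%f" obeys LC_NUMERIC,
   so the decimal separator depends on the current process locale. */
void format_sample(double value, char *buffer, size_t size)
{
    snprintf(buffer, size, "%f", value);
}

/* Mimics the setlocale(LC_ALL, "") done at startup, then restores the
   "C" numeric category so that libc formatting keeps '.' as separator. */
void init_locale_keep_c_numeric(void)
{
    setlocale(LC_ALL, "");      /* read LC_ALL / LC_* / LANG from the environment */
    setlocale(LC_NUMERIC, "C"); /* override only the numeric formatting rules */
}
```

With only the first setlocale() call and a locale such as uk_UA.UTF-8 in the environment, format_sample(1.5, ...) would write a comma instead of a dot. Keep in mind that setlocale() changes the locale for the whole process, so this has to run before any other thread formats numbers.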
So, my questions are:
1) Do we really need cwstring? As far as I understand, the answer is "No": AnsiToUTF8 and UTF8ToAnsi are intercepted in SynCommons, so the FPC functions are not used.
2) If we do need it, please let's add a DEFINE that allows me to disable it in SynDprUses
uses
FastMM4, FastCode, FastMove,
Windows, Forms,
Reading mORMot examples, I found
// first line of uses clause must be {$I SynDprUses.inc}
uses
{$I SynDprUses.inc}
Forms,
I compared mORMot's version of FastMM4 with mine and liked it. I will use the mORMot version.
But what about the FastCode and FastMove units? Will the optimizations in the mORMot code perform better than the old FastCode and FastMove units? Is it a drop-in replacement?
]]>