#1 2015-02-21 16:48:59

Dmitro25
Member
Registered: 2015-02-21
Posts: 19

VariantSaveJSON() not always converts ANSI to UTF-8

Hi.
I use Delphi 7.
I use Russian chars for some JSON values.
For the following code:

var
  v1: variant;
  i: integer;
begin
  v1 := _Arr([]);
  for i := 0 to 1 do
    v1.Add('Some words in Russian');
  caption := VariantSaveJSON(v1);
end;

As a result, caption contains a UTF-8 encoded string. But if I change the line inside the "for" loop to

    v1.Add(_Obj(['label', 'Some words in Russian']));

caption contains the string without any UTF-8 conversion.

#2 2015-02-21 18:51:01

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,238

Re: VariantSaveJSON() not always converts ANSI to UTF-8

This behavior is compiler specific.
It is not tied to mORMot itself, but to a limitation of Delphi 7.

In Delphi 7, there is no code page support.
Almost all internal mORMot methods expect the string to be a RawUTF8.
So if a method parameter expects a RawUTF8, you need to pass UTF-8 chars explicitly.
If the method parameter is a variant, like v1.Add(), it will store a WideString, so the text will be encoded as UTF-8 as expected.
If the method parameter is an "array of const", then the string is expected to be a RawUTF8 - so the text in _Obj(['label', 'some words']) should already be UTF-8 encoded, e.g. like _Obj(['label', StringToUTF8('some Russian')]) or the slightly slower _Obj(['label', WideString('some Russian')]).
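
To make the "array of const" case concrete, here is a minimal, untested sketch of the original loop with the explicit conversion added. It assumes the mORMot SynCommons unit is on the search path; the program name and the variables Doc, Text and Json are made up for illustration, and the ASCII literal only stands in for real ANSI-encoded Cyrillic text.

```pascal
program Utf8JsonSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  SynCommons; // mORMot unit: _Arr, _Obj, StringToUTF8, UTF8ToString, VariantSaveJSON

var
  Doc: variant;
  Text: string;   // an AnsiString in Delphi 7, in the system ANSI code page
  Json: RawUTF8;
  i: integer;
begin
  Text := 'Some words in Russian'; // placeholder for ANSI-encoded Russian text
  Doc := _Arr([]);
  for i := 0 to 1 do
    // convert explicitly before the value reaches the "array of const"
    Doc.Add(_Obj(['label', StringToUTF8(Text)]));
  Json := VariantSaveJSON(Doc); // the result is a RawUTF8
  // convert back to the ANSI code page before displaying it
  Writeln(UTF8ToString(Json));
end.
```

Keeping the result in a RawUTF8 variable, rather than assigning it directly to a string such as caption, makes it explicit where the UTF-8 to ANSI conversion has to happen.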

With Delphi 2007, you may be able to force the source code page to be UTF-8.
For Delphi 2009 and up, constant strings are stored as UnicodeString, with a code page, so they are converted to UTF-8 as expected.

#3 2015-02-22 03:56:21

Dmitro25
Member
Registered: 2015-02-21
Posts: 19

Re: VariantSaveJSON() not always converts ANSI to UTF-8

Thank you for your answer. I will typecast to WideString wherever string variables can contain national characters.

#4 2015-02-22 08:02:09

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,238
Website

Re: VariantSaveJSON() not always converts ANSI to UTF-8

Note that StringToUTF8() will be slightly faster than WideString() in such a case: WideString uses a BSTR/OLE memory allocation and has no reference counting, so it is slower than our copy-on-write RawUTF8.

#5 2015-02-22 10:55:24

Dmitro25
Member
Registered: 2015-02-21
Posts: 19

Re: VariantSaveJSON() not always converts ANSI to UTF-8

Thanks again. I have adjusted the program according to your advice.
