Hello
If a RawUTF8 field is defined without an index modifier, mORMotVCL sets the DataSize to 1 in
procedure TSynSQLTableDataSet.InternalInitFieldDefs;
.
.
.
.
  sftUTF8Text: begin
    DataSize := fTable.FieldLengthMax(F,True); // <--- Here 1 is returned if no MaxSize defined
    {$ifndef UNICODE} // for Delphi 2009+ TWideStringField = UnicodeString!
    if fForceWideString then
      DBType := ftWideString else
    {$endif}
      DBType := ftDefaultVCLString;
  end;
I propose the following change:
procedure TSynSQLTableDataSet.InternalInitFieldDefs;
.
.
.
.
  sftUTF8Text: begin
    DataSize := fTable.FieldLengthMax(F,false);
    if DataSize = 0 then // variable length
      DataSize := dsMaxStringSize; // this is the maximum size the DB unit can handle
    {$ifndef UNICODE} // for Delphi 2009+ TWideStringField = UnicodeString!
    if fForceWideString then
      DBType := ftWideString else
    {$endif}
      DBType := ftDefaultVCLString;
  end;
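The fallback in the proposed change can be isolated into a small helper to make the intent explicit. This is only a minimal sketch, assuming that a returned length of 0 means "no maximum size known", as the patch implies. The helper name StringFieldSize and the local constant are hypothetical; the constant mirrors dsMaxStringSize (8192) from the DB unit. The clamping of oversized values is an addition that is not part of the patch.

```pascal
program FieldSizeSketch;

{$ifdef FPC}{$mode objfpc}{$endif}
{$ifdef MSWINDOWS}{$APPTYPE CONSOLE}{$endif}

const
  // Assumption: same value as dsMaxStringSize in the DB unit (8192);
  // redeclared here only to keep the sketch free of dependencies.
  MaxStringSizeSketch = 8192;

// Hypothetical helper: maps a declared maximum length to a field size.
// A declared length of 0 (or less) is treated as "variable length".
function StringFieldSize(DeclaredMax: Integer): Integer;
begin
  if DeclaredMax <= 0 then
    Result := MaxStringSizeSketch      // no limit known: use the largest size
  else if DeclaredMax > MaxStringSizeSketch then
    Result := MaxStringSizeSketch      // clamp (not part of the original patch)
  else
    Result := DeclaredMax;             // a declared limit is used as-is
end;

begin
  WriteLn(StringFieldSize(0));         // field without index modifier
  WriteLn(StringFieldSize(50));        // field declared with a limit of 50
  WriteLn(StringFieldSize(100000));    // oversized declaration
end.
```

With a helper like this, a field without an index modifier gets a usable string size instead of 1, while fields with a declared limit keep it.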
Greetings
Well, for Delphi the patch is safe, as it changes nothing.
For Lazarus the situation is also clear: it is ALWAYS UTF8. All visible components of the LCL expect a UTF8 string, independent of the version. But I do not know the mORMot code well enough to say whether this is true for all the places where codepage conversion is used. With my proposal it should be easy to change a call from CurrentAnsiConvert to SystemAnsiConvert.
You are right that for FPC things are a little more complicated, as the RTL recently switched from Ansi-encoded to UTF8.
On the other hand, the existing code is not safe on anything other than Windows, because on non-Windows systems it always assumes that the codepage is 1252, which is rarely the case. The problem is that only Windows knows about ACP codes. The standard procedure to find the system encoding on Unix-like systems (such as Linux, Android, OS X) is:
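function GetSystemEncoding: string;
{$IFDEF Unix}
var
  Lang: string;
  i: Integer;
begin
  // GetEnv as in the original post (FPC unit Dos);
  // SysUtils.GetEnvironmentVariable is an alternative.
  // Precedence: LC_ALL, then LC_MESSAGES, then LANG
  Lang := GetEnv('LC_ALL');
  if Length(Lang) = 0 then
  begin
    Lang := GetEnv('LC_MESSAGES');
    if Length(Lang) = 0 then
      Lang := GetEnv('LANG');
  end;
  // the encoding follows the dot, e.g. 'de_DE.UTF-8'
  i := Pos('.', Lang);
  if i > 0 then
    Result := Copy(Lang, i + 1, Length(Lang) - i)
  else
    Result := 'UTF-8';
end;
{$ELSE}
begin
  Result := 'UTF-8';
end;
{$ENDIF}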
But then you have a string which describes the character encoding, not an ACP code. This can be solved, but it can lead to a codepage that is not supported by mORMot.
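One way to bridge the gap between an encoding name and a codepage number is a lookup table. The following is only a rough sketch: the helper name EncodingNameToCodePage is hypothetical, the table is deliberately tiny, and it returns 0 for unknown names so that the caller has to decide on a fallback. The numbers are the Windows codepage identifiers for the listed encodings.

```pascal
program EncodingToCodePageSketch;

{$ifdef FPC}{$mode objfpc}{$H+}{$endif}
{$ifdef MSWINDOWS}{$APPTYPE CONSOLE}{$endif}

uses
  SysUtils;

// Hypothetical helper: returns the Windows codepage number for a few common
// encoding names, or 0 when the name is not in this small table.
function EncodingNameToCodePage(const Name: string): Integer;
var
  N: string;
begin
  N := UpperCase(Name);            // encoding names are compared case-insensitively
  if (N = 'UTF-8') or (N = 'UTF8') then
    Result := 65001
  else if N = 'ISO-8859-1' then
    Result := 28591
  else if N = 'ISO-8859-15' then
    Result := 28605
  else if N = 'CP1252' then
    Result := 1252
  else if N = 'KOI8-R' then
    Result := 20866
  else
    Result := 0;                   // unknown: the caller must pick a fallback
end;

begin
  WriteLn(EncodingNameToCodePage('UTF-8'));
  WriteLn(EncodingNameToCodePage('iso-8859-15'));
  WriteLn(EncodingNameToCodePage('EUC-JP'));   // not in the table
end.
```

The 0 result makes the unsupported case explicit: a name that has no entry here, or no matching converter in mORMot, still needs separate handling.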
Some years ago I wrote a set of units to support ALL codepages for which a Unicode description exists (somewhat more than Windows knows). It can do codepage conversion internally (direct conversion between different codepages, multibyte codepage support for Asian languages and so on, EBCDIC support, upper- and lowercase support), but it can also fall back to system calls (iconvenc under *nix). It also has a tool that generates Pascal source code from a Unicode description file, which can then be integrated into the project.
If you are interested in more global support for codepages, I could update these units. But there is some work to do, especially adapting them to strings that carry codepage information in the header, and making the whole thing compile under Delphi.
Good Morning
CurrentAnsiConvert is always initialized with a converter for the currently used Windows codepage. For FPC and Lazarus this is not correct, as Lazarus (and therefore all the visible components like grids etc.) uses UTF8 encoded strings. As CurrentAnsiConvert is used all around the code, I propose the following changes.
1. Insert a new variable SystemAnsiConvert in SynCommons
/// global TSynAnsiConvert instance to handle current system encoding
// - this is the encoding as used by the AnsiString Delphi, so will be used
// before Delphi 2009 to speed-up VCL string handling (especially for UTF-8)
// - as FPC and Lazarus use UTF8 encoding this is initialized with TSynAnsiUTF8
// - this instance is global and instantiated during the whole program life time
CurrentAnsiConvert: TSynAnsiConvert;
/// global TSynAnsiConvert instance to handle current system encoding
// - this is the encoding as used by the System
// - this instance is global and instantiated during the whole program life time
SystemAnsiConvert: TSynAnsiConvert;
2. Changes in TSynAnsiConvert.Engine
class function TSynAnsiConvert.Engine(aCodePage: cardinal): TSynAnsiConvert;
var i: integer;
begin
  if SynAnsiConvertList=nil then begin
    GarbageCollectorFreeAndNil(SynAnsiConvertList,TObjectList.Create);
    SystemAnsiConvert := TSynAnsiConvert.Engine(GetACP);
    {$ifdef FPC}
    CurrentAnsiConvert := TSynAnsiConvert.Engine(CP_UTF8) as TSynAnsiUTF8;
    {$else}
    CurrentAnsiConvert := TSynAnsiConvert.Engine(GetACP);
    {$endif}
    WinAnsiConvert := TSynAnsiConvert.Engine(CODEPAGE_US) as TSynAnsiFixedWidth;
    UTF8AnsiConvert := TSynAnsiConvert.Engine(CP_UTF8) as TSynAnsiUTF8;
  end;
If CurrentAnsiConvert is used somewhere in the code where the system codepage is actually meant, this should make it far easier to change the source.
Greetings
Yup, that's it.
Thanks
Hello
mORMoti18n is incompatible with FPC/Lazarus because of the different resource format.
mORMotUI uses mORMoti18n, but in practice it only uses U2S and S2U from this unit.
I propose to replace all references in mORMotUI to U2S with UTF8ToString and to S2U with StringToUTF8 (both in SynCommons), and to remove the dependency on mORMoti18n.
As a consequence, mORMotUI becomes usable under FPC/Lazarus.
Greetings