#1 2014-02-14 20:00:17

louis_riviera
Member
Registered: 2013-09-23
Posts: 61

UTF8 -> String

Does SynCommons.pas have a faster routine than Delphi for PAnsiChar, AnsiString to String conversions? Converting PAnsiChar to String is very costly in Delphi XE5.


#2 2014-02-14 21:57:13

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,206
Website

Re: UTF8 -> String

Yes, all text conversion functions in SynCommons.pas are heavily optimized.
Take a look at RawUTF8ToString() and all associated functions and classes.
For instance:

/// convert any UTF-8 encoded buffer into a generic VCL Text
// - it's preferred to use TLanguageFile.UTF8ToString() in mORMoti18n,
// which will handle full i18n of your application
// - it will work as is with Delphi 2009+ (direct unicode conversion)
// - under older version of Delphi (no unicode), it will use the
// current RTL codepage, as with WideString conversion (but without slow
// WideString usage)
function UTF8DecodeToString(P: PUTF8Char; L: integer): string; overload;
  {$ifdef UNICODE}inline;{$endif}

/// convert any UTF-8 encoded buffer into a generic VCL Text
procedure UTF8DecodeToString(P: PUTF8Char; L: integer; var result: string); overload;
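
As a rough sketch of how the first overload might be called on a raw UTF-8 buffer: the program name and the byte array below are made up for illustration, and SynCommons.pas is assumed to be on the unit search path. Only the UTF8DecodeToString() signature comes from the declaration above.

```pascal
program Utf8DecodeDemo; // hypothetical demo program, not part of mORMot
{$APPTYPE CONSOLE}

uses
  SynCommons; // assumption: the mORMot sources are on the search path

const
  // UTF-8 bytes for "café": the final letter is the two-byte sequence C3 A9
  Utf8Bytes: array[0..4] of Byte = ($63, $61, $66, $C3, $A9);

var
  s: string;
begin
  // Pass the buffer pointer and its length in bytes, not in characters
  s := UTF8DecodeToString(PUTF8Char(@Utf8Bytes[0]), Length(Utf8Bytes));
  // Length(s) counts string characters, which differs from the byte count
  Writeln(Length(s));
end.
```

If the source is an AnsiString that already holds UTF-8 bytes, the same call should work with pointer(YourAnsiString) and Length(YourAnsiString).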

Internally, it uses:

function UTF8ToWideChar(dest: pWideChar; source: PUTF8Char; sourceBytes: PtrInt=0): PtrInt;

which is designed to be faster than System.UTF8Decode().
For instance, it handles plain 7-bit ASCII characters (up to #127) in groups of four!
And it is not only fast, but complete, handling UTF-16 surrogates and UTF-8 content verification.
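
To give an idea of what "groups of four" can mean, here is a minimal standalone sketch of the general technique. It is not the actual SynCommons code: WidenAsciiPrefix is a hypothetical helper written for this illustration, and it only covers the ASCII fast path (no multi-byte decoding, no surrogates, no validation).

```pascal
program AsciiFastPathDemo; // hypothetical demo, not taken from SynCommons
{$ifdef FPC}{$mode delphi}{$endif}
{$APPTYPE CONSOLE}

// Widens the leading 7-bit ASCII bytes of src into dest (UTF-16) and returns
// how many bytes were converted. It stops at the first byte >= $80.
function WidenAsciiPrefix(src: PAnsiChar; len: Integer; dest: PWideChar): Integer;
var
  i: Integer;
begin
  i := 0;
  // Fast path: read four bytes as one 32-bit value; if no byte has its
  // high bit set, all four are ASCII and can be widened directly
  while (i + 4 <= len) and ((PCardinal(src + i)^ and $80808080) = 0) do
  begin
    dest[i]     := WideChar(Ord(src[i]));
    dest[i + 1] := WideChar(Ord(src[i + 1]));
    dest[i + 2] := WideChar(Ord(src[i + 2]));
    dest[i + 3] := WideChar(Ord(src[i + 3]));
    Inc(i, 4);
  end;
  // Slow path: remaining ASCII bytes, one at a time
  while (i < len) and (Ord(src[i]) < $80) do
  begin
    dest[i] := WideChar(Ord(src[i]));
    Inc(i);
  end;
  Result := i;
end;

var
  s: AnsiString;
  buf: array[0..63] of WideChar;
  n: Integer;
begin
  s := 'Hello, world';
  n := WidenAsciiPrefix(PAnsiChar(s), Length(s), @buf[0]);
  buf[n] := #0; // terminate the widened text
  Writeln(n);   // number of bytes handled by the ASCII path
end.
```

A real decoder would continue from the returned offset with full multi-byte UTF-8 handling, then switch back to the fast path when it meets ASCII again.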

Note that within mORMot itself we do not use string, but our own RawUTF8 type.
