#1 2010-11-29 13:23:00

moggco
Member
Registered: 2010-09-27
Posts: 9

About the file format

All the source files are WinAnsi-encoded. When Delphi XE opens some of them, such as SynSelfTests.pas, it cannot display them correctly, and many tests fail because of illegible characters.
So I have to open the source file in Vim, copy and paste it into XE, then save it as UTF-8. After fixing some hash codes and adding some string encoding calls, all tests pass.

sample:

procedure TTestSQLite3Engine.DirectAccess;
procedure InsertData(n: integer);
var i: integer;
    s, ins: RawUTF8;
    R: TSQLRequest;
begin
  // this code is a lot faster than sqlite3 itself, even if it use Utf8 encoding:
  // -> we test the engine speed, not the test routines speed :)
 ins := 'INSERT INTO People (FirstName,LastName,Data,YearOfBirth,YearOfDeath) VALUES (''';
 for i := 1 to n do begin
   str(i,s);
   // we put some accents in order to test UTF-8 encoding
   R.Prepare(Demo.DB,ins+'Salvador'+s+''', ''Dali'', ?, 1904, 1989);');
   R.Bind(1,PAnsiChar(WinAnsiToUtf8('aéàç')),length(WinAnsiToUtf8('aéàç')){4}); // Bind Blob
procedure TTestSQLite3Engine._TSQLRestClientDB;
...

Check(Data=WinAnsiToUtf8('aéàç'));

......

Check((DataS.Size=7{4}) and (PCardinal(DataS.Memory)^=$C3A9C361{E7E0E961}));

My question is: is there a simpler and safer way to deal with this?

Last edited by moggco (2010-11-29 13:25:41)

Offline

#2 2010-11-29 16:46:36

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,655
Website

Re: About the file format

Good point; these characters are only used for testing purposes.
I could try to change them into #... decimal constants, in order to be UTF-8 ready.

But you can change the encoding of the file, as far as I remember, by a right click on the file in the editor, then selecting WinAnsi/1252 code page.
Then no modification of the file itself is needed. And the IDE should remember this setting.

Offline

#3 2010-12-01 13:37:04

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,655
Website

Re: About the file format

I've uploaded a version in pure 7 bit ASCII code.

All WinAnsi characters are encoded as #232 const or such.

All source files should now load with no encoding problem.

Offline

#4 2010-12-07 08:15:47

moggco
Member
Registered: 2010-09-27
Posts: 9

Re: About the file format

ab wrote:

I've uploaded a version in pure 7 bit ASCII code.

All WinAnsi characters are encoded as #232 const or such.

All source files should now load with no encoding problem.

That is not a good solution on my machine.
Yes, the Delphi IDE now loads the source with no encoding problem, but the constant is treated as a wide string, which is bad.
For example:

s1: string;

s1 := ''', ''Morse'', ''a'#233#224#231''', 1791, 1872);';

Your code page is 1252, but my locale is 2052 and my code page is 936, so Delphi treats #233#224 as a single wide char and #231 as $003F, that is '?', because it is an illegal character in the MBCS encoding of my Windows. The test cannot pass.

The damn code conversion. I don't know how to deal with it.
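
To make the effect concrete, here is a minimal sketch (not part of the framework) that decodes the three bytes behind #233#224#231 once as code page 1252 and once as code page 936, then prints the resulting UTF-16 code units. It assumes Delphi 2009 or later on Windows with code page 936 installed; DecodeAs and DumpCodeUnits are hypothetical helper names, and TEncoding is the RTL class from SysUtils.

```pascal
program CodePageBytes;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: decode raw bytes using an explicit Windows code page.
function DecodeAs(const Bytes: TBytes; CodePage: Integer): string;
var
  Enc: TEncoding;
begin
  Enc := TEncoding.GetEncoding(CodePage);
  try
    Result := Enc.GetString(Bytes);
  finally
    Enc.Free; // encodings returned by GetEncoding are owned by the caller
  end;
end;

// Hypothetical helper: print each UTF-16 code unit of S as four hex digits.
procedure DumpCodeUnits(const Caption, S: string);
var
  i: Integer;
begin
  Write(Caption, ':');
  for i := 1 to Length(S) do
    Write(' ', IntToHex(Ord(S[i]), 4));
  Writeln;
end;

var
  Raw: TBytes;
begin
  // The three bytes that #233#224#231 stand for in a WinAnsi (1252) string
  SetLength(Raw, 3);
  Raw[0] := $E9;
  Raw[1] := $E0;
  Raw[2] := $E7;
  DumpCodeUnits('as 1252', DecodeAs(Raw, 1252));
  DumpCodeUnits('as 936', DecodeAs(Raw, 936));
end.
```

Under 1252 the three bytes are é, à and ç. Under 936 the first two bytes form one double-byte character, and what the stray third byte becomes depends on the Windows conversion.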

Last edited by moggco (2010-12-07 09:45:22)

Offline

#5 2010-12-07 14:15:24

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,655
Website

Re: About the file format

Which version of Delphi are you using?

Perhaps the string constant must be forced to WinAnsiString. Does WinAnsiString('....'#233#224'...') work? Or perhaps const aTest1: WinAnsiString = '....'#233#224'...'; ?

Offline

#6 2010-12-09 03:53:13

moggco
Member
Registered: 2010-09-27
Posts: 9

Re: About the file format

ab wrote:

Which version of Delphi are you using?

Perhaps the string constant must be forced to WinAnsiString. Does WinAnsiString('....'#233#224'...') work? Or perhaps const aTest1: WinAnsiString = '....'#233#224'...'; ?

Delphi XE.

Using

const
    aTest1: WinAnsiString = ''', ''Morse'', ''a'#233#234#231''', 1791, 1872);';

gives the same result as above.

Maybe there are two solutions:
1. Save all source files encoded as UTF-8 rather than 7-bit ASCII.
2. Use wide chars for Delphi 2009~2011, that is, replace #233 with #$00E9.

Offline

#7 2010-12-09 04:46:06

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,655
Website

Re: About the file format

moggco wrote:

1. Save all source files encoded as UTF-8 rather than 7-bit ASCII.
2. Use wide chars for Delphi 2009~2011, that is, replace #233 with #$00E9.

I think the first solution will be incompatible with Delphi versions prior to Delphi 2007.
For this reason, 1. is not an option. As for 2., I'll try #$00E9 under Delphi 6/7/2007. But I don't get why it's different from #233.
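
A possible explanation for the difference, offered as a sketch: as far as the Delphi 2009+ compiler documentation describes it, with the default {$HIGHCHARUNICODE OFF} a decimal #233 or two-digit #$E9 literal is parsed as an AnsiChar in the system code page, while a four-digit #$00E9 literal is always parsed as a WideChar. The program below is an illustration, not the framework's actual fix. It switches the directive on where it exists and compares both spellings; the program and variable names are made up.

```pascal
program HighCharDemo;

{$APPTYPE CONSOLE}

// UNICODE is defined by Delphi 2009 and later; older compilers do not know
// the directive, so it is guarded. Assumption: with it ON, #233 is read as
// U+00E9 instead of a byte in the system ANSI code page.
{$IFDEF UNICODE}
  {$HIGHCHARUNICODE ON}
{$ENDIF}

uses
  SysUtils;

var
  FromDecimal, FromHex4: string;
begin
  // Both are meant to denote 'a' followed by U+00E9, U+00E0 and U+00E7
  FromDecimal := 'a'#233#224#231;
  FromHex4 := 'a'#$00E9#$00E0#$00E7;

  // On Delphi 2009+ the two strings should compare equal whatever the
  // system code page is; on older compilers the result is not meaningful.
  if FromDecimal = FromHex4 then
    Writeln('same')
  else
    Writeln('different');
end.
```

Removing the directive and compiling on a machine whose system code page is 936 would be one way to check whether the decimal spelling is what breaks the test.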

Offline
