#1 2011-09-20 12:31:08

proto
Member
From: Russia, Kostroma
Registered: 2011-09-12
Posts: 31

How to search for a UTF-8 string in a grid?

When I write:

procedure TfrmMain.btnSearchClick(Sender: TObject);
var
  r: integer;
begin
  r := Table.SearchValue(edtSearch.Text, 1, btnSearch.Tag, dgDataRecord);
  if r <> 0  then dgDataRecord.Row := r;
end;

If I search for English letters or numbers it works fine, but Russian letters are not found.
In the source code I see const aUpperValue: RawUTF8; but Search: PAnsiChar;. Why is it not a UTF-8 search?

Last edited by proto (2011-09-20 12:31:20)


#2 2011-09-20 17:21:41

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,240

Re: How to search for a UTF-8 string in a grid?

You should pass RawUTF8 content, so:

  r := Table.SearchValue(StringToUTF8(edtSearch.Text), 1, btnSearch.Tag, dgDataRecord);

The current implementation won't handle uppercase Russian letters, only plain Latin 'A'..'Z' letters.
So a search in Russian won't work, even in case-sensitive mode, and of course any soundex search (when the value starts with %) won't work with such characters either.

I guess I would have to add a custom search function instead of FindUTF8(), which expects West-European languages only (but handles UTF-8 as expected for those character sets).
What about adding a new parameter to call the much slower, but accurate, Unicode text comparison? That would mean a whole rewrite of the FindUTF8() function.

Note also that the Client instance is expected to be a TSQLRest class, not a TSQLRecord.
There was no check for this in the code, which could lead to a GPF. It's now fixed.


#3 2011-09-20 18:50:55

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,240

Re: How to search for a UTF-8 string in a grid?

That's it, I've added a new UnicodeComparison boolean parameter to TSQLTable.SearchValue to properly handle non-WinAnsi (code page 1252) characters.

See http://synopse.info/fossil/info/de605b4011

It may solve your issue with this method.
This will be much slower on a huge grid, but should do the job.
Don't forget to use StringToUTF8(edtSearch.Text) to provide UTF-8 encoded text.
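
To illustrate why the conversion still matters on a Unicode Delphi: a string there holds UTF-16 code units, while the search works on UTF-8 bytes. Here is a minimal sketch using only the RTL UTF8Encode() as a stand-in (this assumes StringToUTF8() performs a comparable UTF-16 to UTF-8 conversion; the program name is made up for the example):

```pascal
program Utf8LengthDemo; // hypothetical name, for illustration only
{$APPTYPE CONSOLE}

var
  U: UnicodeString;  // UTF-16, like string in Delphi 2009 and later
  B: RawByteString;  // receives the UTF-8 encoded bytes
begin
  // Cyrillic small letters a, be, ve (U+0430..U+0432), built from code
  // points so the example does not depend on the source file encoding
  U := WideChar($0430);
  U := U + WideChar($0431) + WideChar($0432);

  // RTL conversion, used here instead of the framework's StringToUTF8()
  B := UTF8Encode(U);

  Writeln(Length(U)); // 3 UTF-16 code units
  Writeln(Length(B)); // 6 UTF-8 bytes: each of these letters takes two
end.
```

The same three letters are three code units in one encoding and six bytes in the other, so a Unicode string and its UTF-8 form cannot be compared byte for byte without converting first.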


#4 2011-09-21 05:31:28

proto
Member
From: Russia, Kostroma
Registered: 2011-09-12
Posts: 31

Re: How to search for a UTF-8 string in a grid?

I tried UnicodeComparison, but the search doesn't work.
0 results whether I search for Russian or English letters (((

I solved the problem by doing two searches:

Rec := TSQLDataRecord.Create(Database, 'Field like "%' + edtSearch.Text + '%"'); // first, search on Field
r := Table.SearchValue(IntToStr(Rec.ID), 1, 0, nil); // second, search for that ID in the table
if r <> 0 then dgDataRecord.Row := r;

ab wrote:

Don't forget to use StringToUTF8(edtSearch.Text) to provide an UTF-8 encoded text.

I use Delphi 2010, where all strings are Unicode.

Last edited by proto (2011-09-21 05:33:50)


#5 2011-09-21 06:24:53

noobies
Member
Registered: 2011-09-13
Posts: 139

Re: How to search for a UTF-8 string in a grid?

I also tried to search for Russian letters:

Rec := TSQLUser.Create(globalClient, 'FirstName like "%%%"', [UpperCase(Search.Text)]);

or

Rec := TSQLUser.Create(globalClient, 'FirstName like "%%%"', [UpperCaseU(Search.Text)]);

Both give 0 results.

But if I use the standard AnsiUpperCase function:

function AnsiUpperCase(const S: string): string;
{$IFDEF MSWINDOWS}
var
  Len: Integer;
begin
  Len := Length(S);
  SetString(Result, PChar(S), Len);
  if Len > 0 then 
    CharUpperBuff(PChar(Result), Len);
end;
{$ENDIF MSWINDOWS}
{$IFDEF POSIX}
begin
  Result := WideUpperCase(S);
end;
{$ENDIF POSIX}

it works fine with Unicode strings:

Rec := TSQLUser.Create(globalClient, 'FirstName like "%%%"', [AnsiUpperCase(Search.Text)]);

Last edited by noobies (2011-09-21 06:26:16)


#6 2011-09-21 07:29:04

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,240

Re: How to search for a UTF-8 string in a grid?

Yes, this is not a bug, but exactly what the documentation states:

/// fast conversion of the supplied text into uppercase
// - this will only convert 'a'..'z' into 'A'..'Z' (no NormToUpper use), and
// will therefore by correct with true UTF-8 content
function UpperCase(const S: RawUTF8): RawUTF8;


/// fast conversion of the supplied text into 8 bit uppercase
// - this will not only convert 'a'..'z' into 'A'..'Z', but also accentuated
// latin characters ('e' acute into 'E' e.g.), using NormToUpper[] array
// - it will convert decode the supplied UTF-8 content to handle more than
// 7 bit of ascii characters 
function UpperCaseU(const S: RawUTF8): RawUTF8;

So both functions only work with Latin characters:
- UpperCase() with only 'a'..'z';
- UpperCaseU() with 'a'..'z' and accented WinAnsi characters (like 'à' or 'é').
They will not handle Russian characters.
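
To make the first case concrete, here is a rough sketch of an 'a'..'z'-only conversion applied to UTF-8 bytes. AsciiUpper() is a hypothetical helper written for this example, not the framework's UpperCase() code. In UTF-8 every byte of a Cyrillic letter is above $7F, so the loop never touches it:

```pascal
program AsciiUpperDemo; // hypothetical name, for illustration only
{$APPTYPE CONSOLE}

const
  // UTF-8 bytes of Cyrillic small letters a, be, ve (U+0430..U+0432)
  CyrBytes: array[0..5] of Byte = ($D0, $B0, $D0, $B1, $D0, $B2);

// Hypothetical helper: upper-cases only the bytes 'a'..'z' and leaves
// every other byte, including all UTF-8 multi-byte sequences, untouched.
function AsciiUpper(const S: RawByteString): RawByteString;
var
  i: Integer;
begin
  Result := S;
  for i := 1 to Length(Result) do
    if Result[i] in ['a'..'z'] then
      Result[i] := AnsiChar(Ord(Result[i]) - 32);
end;

var
  Latin, Cyrillic: RawByteString;
begin
  Latin := 'abc';
  SetString(Cyrillic, PAnsiChar(@CyrBytes[0]), Length(CyrBytes));

  // Latin text is converted as expected
  Writeln(AsciiUpper(Latin) = 'ABC');
  // Cyrillic text comes back byte for byte identical: nothing was upper-cased
  Writeln(AsciiUpper(Cyrillic) = Cyrillic);
end.
```

Both comparisons hold, which is why a LIKE pattern prepared this way only matches rows whose Russian text already has the same case as the search text.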

I've just added UpperCaseUnicode() and LowerCaseUnicode() functions, which handle RawUTF8 content directly and may be a bit faster than AnsiUpperCase for this purpose.
See http://synopse.info/fossil/info/d802de9fa0
