#1 2021-01-21 19:21:18

BenTen
Member
Registered: 2021-01-19
Posts: 11

Case-insensitive search Cyrillic in SQLite3

Hi.

I always use the default codepage Win-1251 for my projects.
In the Example 02, if field "Name" uses the Latin alphabet, then the search is made case-insensitive, and if the Cyrillic alphabet, then the search is only case-sensitive.
Help me please. How can I ensure that the Cyrillic alphabet is also compared case-insensitively?
Thanks.

Example 02 - Embedded SQLite3 ORM

procedure TForm1.FindButtonClick(Sender: TObject);
var
  Rec: TSQLSampleRecord;
  SearchStr: RawUTF8;
begin
  SearchString := StringToUTF8(NameEdit.Text);
  Rec := TSQLSampleRecord.Create(Database,'Name=?',[SearchString]);
  try
    if Rec.ID=0 then
      QuestionMemo.Lines.Add('Not found')
    else
    begin
      Created.Text := Rec.CreationString;
      QuestionMemo.Lines.Add(UTF8ToString(Rec.Question));
    end;
  finally
    Rec.Free;
  end;
end;

Offline

#2 2021-01-22 01:03:34

Vitaly
Member
From: UAE
Registered: 2017-01-31
Posts: 168
Website

Re: Case-insensitive search Cyrillic in SQLite3

As a quick workaround, I would suggest trying to change the collation to WIN32NOCASE for the search field in your DB.

Also, this might be helpful:
https://synopse.info/files/html/Synopse … #TITLE_155
https://synopse.info/forum/viewtopic.php?id=4342

Offline

#3 2021-01-22 10:32:15

BenTen
Member
Registered: 2021-01-19
Posts: 11

Re: Case-insensitive search Cyrillic in SQLite3

Many thanks for the help.
Setting WIN32NOCASE mode solved my issue.
IMPORTANT!
It is necessary to set the Collation columns mode ONLY before creating a new database

procedure TForm1.FormCreate(Sender: TObject);
begin
  Model := CreateSampleModel;
  // It is necessary to set the Collation columns mode ONLY before creating a new database
  Model.SetCustomCollationForAll(sftUTF8Text, 'WIN32NOCASE');
  Database := TSQLRestServerDB.Create(Model,ChangeFileExt(ExeVersion.ProgramFileName,'.db3'));
  TSQLRestServerDB(Database).CreateMissingTables;
end;

Offline

#4 2021-01-22 11:14:54

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,220
Website

Re: Case-insensitive search Cyrillic in SQLite3

Just a warning: even if WIN32NOCASE works, there is no such thing as an universal Unicode collation.
Please read http://sqlite.1065341.n5.nabble.com/Uni … 24628.html

So if you move a DB from one system to another, depending on the local, you may break the indexes...
It won't happen for Win1251 for sure, but there is no universal comparison system, without knowing the exact locale: depending on the actual language, some Unicode codepoint may not compare the same!

I may add a Win1251 explicit collation... since there is already a Win1252 dedicated version...

Offline

#5 2021-01-22 14:01:00

Vitaly
Member
From: UAE
Registered: 2017-01-31
Posts: 168
Website

Re: Case-insensitive search Cyrillic in SQLite3

ab wrote:

I may add a Win1251 explicit collation... since there is already a Win1252 dedicated version...

Personally, I'm not interested in win1251 since I'm not using it in the development, although I see questions/troubles concerning it on the forum from time to time.
Just curious, what will be the disadvantage of adding win1251 collation for other non-cyrillic developers? smile

Offline

#6 2021-01-22 17:28:40

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,220
Website

Re: Case-insensitive search Cyrillic in SQLite3

I guess we may make an internal efficient case-insensitive comparer for most simple code pages, working directly at UTF-16 level.

Offline

#7 2021-01-22 18:09:45

BenTen
Member
Registered: 2021-01-19
Posts: 11

Re: Case-insensitive search Cyrillic in SQLite3

Arnaud wrote:

I may add a Win1251 explicit collation... since there is already a Win1252 dedicated version...

Thank you, Arnaud. It will be very good.
In addition, I do not plan to move the database between systems with different locales.
I just started learning mORMot framework yesterday and I plan to transfer one my simple project implemented on Datasnap to mORMot.
If all goes well, I will use mORMot for new projects.

Offline

#8 2021-01-23 04:55:28

edwinsn
Member
Registered: 2010-07-02
Posts: 1,215

Re: Case-insensitive search Cyrillic in SQLite3

ab wrote:

So if you move a DB from one system to another, depending on the local, you may break the indexes...

@ab, what's the workaround to handle it after moving across locals? Delete and rebuild the indexes?


Delphi XE4 Pro on Windows 7 64bit.
Lazarus trunk built with fpcupdelux on Windows with cross-compile for Linux 64bit.

Offline

#9 2021-01-23 10:50:04

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,220
Website

Re: Case-insensitive search Cyrillic in SQLite3

Yes, you would need to rebuild the indexes to be sure the values would match.

This is why a hard-coded pure mORMot Unicode comparison could make sense.
It would be cross-platform and consistent over all systems, as SQlite3 should.
I will include it into mormot.core.unicode.pas from mORMot 2.

Offline

#10 2021-01-23 13:13:41

edwinsn
Member
Registered: 2010-07-02
Posts: 1,215

Re: Case-insensitive search Cyrillic in SQLite3

ab wrote:

This is why a hard-coded pure mORMot Unicode comparison could make sense.
It would be cross-platform and consistent over all systems, as SQlite3 should.
I will include it into mormot.core.unicode.pas from mORMot 2.

Yes! When the db is used as a document file format, the user might copy the file across different locals.
And yes, that hard-coded collation must be **unicode-ready**.
Thanks for your effort!


Delphi XE4 Pro on Windows 7 64bit.
Lazarus trunk built with fpcupdelux on Windows with cross-compile for Linux 64bit.

Offline

#11 2021-01-23 18:48:00

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,220
Website

Re: Case-insensitive search Cyrillic in SQLite3

I have implement in mORMot 2:

1) OS independent Unicode 10.0 process - https://github.com/synopse/mORMot2/comm … e98639793e

2) New SQlite3 UNICODECASE collation supporting full Unicode - https://github.com/synopse/mORMot2/comm … 4a1af602fb

3) At ORM level, just inherit from TOrmUnicodeNoCase class to use this new collation for RawUtf8 properties - also for our in-memory engine.

The Unicode tables hardcoded are only 20KB for a very efficient branchless comparison. smile

(no plan to implement in mORMot 1 yet, since SynCommons.pas is already too big)

Offline

#12 2021-01-24 06:07:16

edwinsn
Member
Registered: 2010-07-02
Posts: 1,215

Re: Case-insensitive search Cyrillic in SQLite3

Wow! Amazing work as usual. How can you implement it so quickly!

Although it's a pity no mORMot 1 support...


Delphi XE4 Pro on Windows 7 64bit.
Lazarus trunk built with fpcupdelux on Windows with cross-compile for Linux 64bit.

Offline

#13 2021-01-26 16:36:38

Leslie7
Member
Registered: 2015-06-25
Posts: 248

Re: Case-insensitive search Cyrillic in SQLite3

Great news!
Not to be greedy, but the question comes kind of naturally: why v10, why not the latest v13?

Offline

#14 2021-01-26 17:36:55

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,220
Website

Re: Case-insensitive search Cyrillic in SQLite3

Because I started from Bero's tables, which has Unicode 10.0 support.

To be fair, I doubt Unicode 13.0 has much more to offer in practice: some EmoJi, some historical graphics...
If there is a really need, we may add them.
The main problem won't be our framework, but to find the fonts able to display those charsets, and - even more - the people able to read and write them! big_smile

Offline

#15 2021-01-26 18:17:07

Leslie7
Member
Registered: 2015-06-25
Posts: 248

Re: Case-insensitive search Cyrillic in SQLite3

I suspected something like this. smile Anyhow, excellent addition to the framework. smile smile

The way you care for mORMot and it's users is amazing. Your love for your work is kind of contagious. big_smile

Last edited by Leslie7 (2021-01-26 18:20:47)

Offline

Board footer

Powered by FluxBB