#1 2016-08-08 13:28:39

array81
Member
From: Italy
Registered: 2010-07-23
Posts: 411

Search engine

In my application I use mORMot Framework to store some record with text fields.
Now I want add a smart search about these text, something like google for webpage: I have some words or phrases and I need sort by best result (based on relevance), do you know some library about this? It would be fine also if it is does not integrate with mORMot Framework (I can extract data with mORMot Framework then use library to order the result).
I'm looking for something open source or free to use with delphi.

Offline

#2 2016-08-08 13:42:31

mpv
Member
From: Ukraine
Registered: 2012-03-24
Posts: 1,534
Website

Re: Search engine

You already have everything you need in mORMot smile  See FTS mORMot documentation

Offline

#3 2016-08-08 22:53:22

array81
Member
From: Italy
Registered: 2010-07-23
Posts: 411

Re: Search engine

mpv wrote:

You already have everything you need in mORMot smile  See FTS mORMot documentation

This sound good. Only I thing is not clear. I should duplicate my text in regular TSQLRecord table and in associated TSQLRecordFTS3 or store it only in TSQLRecordFTS3. What is the good approach?

Offline

#4 2016-08-09 07:22:52

mpv
Member
From: Ukraine
Registered: 2012-03-24
Posts: 1,534
Website

Re: Search engine

In my application I have several TSQLRecord classes (for example Persons, Firms, Documents), but all of them duplicate a data what describe an entity to the one TSQLRecordFTS3, because I need to search all of them at once. But this is my use-case.

In any case regular query to FTS table always perform a full scan, so if you need both regular and FTS queries, it is a good idea to separate a data table and FTS table.. See 1.4. Simple FTS Queries sqlite docs

Last edited by mpv (2016-08-09 07:28:32)

Offline

#5 2016-08-09 07:42:12

array81
Member
From: Italy
Registered: 2010-07-23
Posts: 411

Re: Search engine

The Stemming working well also with no-english language? In documentation I can read "is the way the english text is parsed for creating the word index from raw text".

Offline

#6 2016-08-09 08:49:53

mpv
Member
From: Ukraine
Registered: 2012-03-24
Posts: 1,534
Website

Re: Search engine

I'm use a custom tokenizer for cyrilic(Russian/Ukrainian). Don't know about other languages

Offline

#7 2016-08-09 10:23:39

array81
Member
From: Italy
Registered: 2010-07-23
Posts: 411

Re: Search engine

Can you tell more about custom tokenizer?

Is it possible create custom tokenizer for any language? How?

Offline

#8 2016-08-09 10:44:24

mpv
Member
From: Ukraine
Registered: 2012-03-24
Posts: 1,534
Website

Re: Search engine

Writing of a custom tokenizer is a HUGE work. It require linguistics knowledge and so on. It most case build-in porter is enough. See more details here https://www.sqlite.org/fts3.html#tokenizer

Last edited by mpv (2016-08-09 10:44:37)

Offline

#9 2016-08-10 09:51:41

edwinsn
Member
Registered: 2010-07-02
Posts: 1,215

Re: Search engine

I own this lib but haven't used it out side of DISQLITE3: http://www.wikitaxi.org/delphi/products/stemmer/index

But I guess you  can integrate it with mORMot.


Delphi XE4 Pro on Windows 7 64bit.
Lazarus trunk built with fpcupdelux on Windows with cross-compile for Linux 64bit.

Offline

#10 2017-02-12 14:34:24

dolfin
Member
Registered: 2015-09-11
Posts: 22

Re: Search engine

I am developing application with fts feature using mormot framework. The key problem is that I need stemmer for russian and ukrainian languages.
What is the proper way to load custom stemmer dll for sqlite using mormot framework?

Offline

#11 2017-02-12 14:49:24

mpv
Member
From: Ukraine
Registered: 2012-03-24
Posts: 1,534
Website

Re: Search engine

You need to allow custom tokenizer by call to EnableCustomTokenizer, and when execute a statement SELECT fts3_tokenizer during just after connecting to database

DB.EnableCustomTokenizer;

procedure TMyConnection.RegisterExternalTokenizer(const aTokenizer: RawUTF8;
  const AImplementation: sqlite3_tokenizer_module_ptr);
var
  FStmt: TSQLDBStatement;
begin
  FStmt := NewStatement();
  try
    FStmt.Prepare('SELECT fts3_tokenizer(?, ?)', false);
    FStmt.BindTextU(1, aTokenizer);
    FStmt.BindBlob(2, @AImplementation, sizeof(Fsqlite3_tokenizer_module_ptr));
    FStmt.ExecutePrepared;
  finally
    FStmt.Free;
  end;
end;

Offline

#12 2017-02-12 17:59:39

dolfin
Member
Registered: 2015-09-11
Posts: 22

Re: Search engine

Thanks for clarification! But I still have no progress with my task. I found a stemmer for russian and ukrainian languages (https://github.com/ermakovpetr/stemka), but I don't know how to make it work with mormot and sqlite. Could you please give an advice regarding this problem?

Offline

#13 2017-02-13 16:15:22

mpv
Member
From: Ukraine
Registered: 2012-03-24
Posts: 1,534
Website

Re: Search engine

You need to manually implement all functions, required by sqlite3_tokenizer_module structure. Inside you can use a stemka, but this required a LOT of code. I recommend you to find another alternative, for example - elastic search as a external service

Offline

#14 2017-04-01 13:30:12

array81
Member
From: Italy
Registered: 2010-07-23
Posts: 411

Re: Search engine

I have a problem with TSQLRecordFTS3.
I need change my TSQLResources ans so my TSQLFTSResources definition (add and change field) but I get an error about "virtual table alter".

Is there a way to change my fields using TSQLRecordFTS3?

Offline

#15 2017-04-02 10:24:24

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,182
Website

Re: Search engine

A SQLite3 virtual table cannot be altered, by design.
See https://sqlite.org/fts3.html#summary

As with all virtual table types, it is not possible to create indices or triggers attached to FTS tables. Nor is it possible to use the ALTER TABLE command to add extra columns to FTS tables (although it is possible to use ALTER TABLE to rename an FTS table).

So you may have to recreate the table, if you change its content.

Offline

#16 2022-01-07 03:50:04

htits2008
Member
Registered: 2015-03-25
Posts: 30

Re: Search engine

mpv wrote:

You need to allow custom tokenizer by call to EnableCustomTokenizer, and when execute a statement SELECT fts3_tokenizer during just after connecting to database

Hi,mpv!
Can you share a Complete example ?

Offline

#17 2022-01-07 08:07:40

edwinsn
Member
Registered: 2010-07-02
Posts: 1,215

Re: Search engine

Speaking of SQLite FTS5, what's missing is the `offsets` function, which was provided by FTS3/4...


Delphi XE4 Pro on Windows 7 64bit.
Lazarus trunk built with fpcupdelux on Windows with cross-compile for Linux 64bit.

Offline

#18 2022-01-07 16:50:00

mpv
Member
From: Ukraine
Registered: 2012-03-24
Posts: 1,534
Website

Re: Search engine

htits2008 wrote:

Hi,mpv!
Can you share a Complete example ?

Please, see real life project sources here

Offline

#19 2022-01-10 08:22:45

htits2008
Member
Registered: 2015-03-25
Posts: 30

Re: Search engine

mpv wrote:
htits2008 wrote:

Hi,mpv!
Can you share a Complete example ?

Please, see real life project sources here

Thank's mpv!
I think I find a node.js by freepascal.  you are nice!

Offline

Board footer

Powered by FluxBB