SELECT fts3_tokenizer('mytokenizer', @tokenizerFunction)
See [1d09ea3191]
]]>Including FTS3 virtual tables was not so simple.
One big issue is that I mapped the rowid to the ID property for normal tables.
And it's not possible to have this ID property for a FTS3 virtual table.
So I had to use RowID as property name for all SQL statements... therefore some deep changes where needed in the SQL generation code of the framework.
And some other parts of the code, which are checking the field names before sending them to the database engine, needed to be rewritten too.
Including the TSQLRawFTS3 feature, as I posted above, could be a lot more work than I expected first.
]]>Descending TSQLRecordFTS3 from TSQLRecord doesn't quite seem right. Shouldn't a parent object be distilled from them to make them siblings, otherwise a lot of functions would be overridden (not sure if this is an issue)?
We need all inherited fields from TSQLRecord, and the fact that it inherits from it make it easy register any TSQLRecordFTS3 as other tables.
And no function is overriden, if you look at the type definition: only a new one - about optimizing FTS3 table - is added.
So it looks like a good solution, at least from the code point of view, to me.
IMHO it's the less demanding option in terms of number of lines.
Would it make sense is to have an 'umbrella' object for tables that use both freetext and non-freetext fields?
What do you call an "umbrella" object? An interface? Looking back at the Architecture of the framework, I don't think so.
The TSQLRawFTS3 new type definition seems coherent from the user point of view. Just some UTF-8 field, with extended search capabilities when used with the "MATCH" operator in the where clause.
Descending TSQLRecordFTS3 from TSQLRecord doesn't quite seem right. Shouldn't a parent object be distilled from them to make them siblings, otherwise a lot of functions would be overridden (not sure if this is an issue)?
With the queries from my experience there normally needs to be some intervention in the 'n-tier' to make the search interface simple enough for the end user, so as you say it's not an issue.
Would it make sense is to have an 'umbrella' object for tables that use both freetext and non-freetext fields?
]]>But in all cases, it's a need to have a TSQLRecordFTS3 table type, in order to implement a native access to the FTS3 virtual table in a RESTful manner.
For example, if there is no other field types than 'fts text' (as you stated above), then it's easy to use a TSQLRecordFTS3 instead of a TSQLRecord.
What I can do is that I can add a new field type, just like TSQLRawBlob.
type
TSQLRawFTS3 = type TSQLRawUTF8;
This field kind won't be retrieved by default (as BLOB is not).
It makes sense because this field content is intended to contains a lot of text, which could use a lot of bandwidth in an ORM approach (we populate all field values from server).
Then you'll use new RetrieveFTS3() and UpdateFTS3() methods to access to this field from its ID.
From the client side, they will silently call RetrieveBlob() and UpdateBlob() methods, via a RESTful standard extended URL, like the one used for BLOB fields: http://server/ModelName/TableName/FTS3FieldName
From the server side, the low-level Blob functions won't be used, but the corresponding FTS3 table will be used instead.
For the queries, we will able to access to the fts3 fields only, or to not-fts3 fields only, on the same request. That is, it is impossible, without writing a full SQL parser in Delphi, to mix fts3 fields and no-fts3 fields in the "where" clause. But it's not an issue, if it's properly documented.
One issue is that I can't use such an hidden TSQLRecordFTS3 table to handle the storage, because the retrieval rely on fields defined via RTTI, which is not the case here.
So there'll be an hidden dynamic array containing the fts3 table names, according to the field type definition.
What do you think about this implementation?
]]>Re the new fts type do you think it could be worth extending TSQLRecord instead of descending from it? What I'm thinking is that there could be a new field type of 'fts text'. When this field type is used in a TSQLRecord child definition behind the scenes two tables would be created and both kept in step. Or if no other field types are used then a single fts3 table would be created. I guess this would be harder to implement in the framework but could make using it simpler.
TSQLRecord do you think
]]>It will close ALL statements. And that's not a good idea!
If you've nested statements - which can occur if you've prepared a statement, like we did for Retrieve() - you'll close ALL statements...
So your code will break the framework.
I'm working of having FTS3 handled natively in the framework.
Here is the interface of the new type:
type
{{ a base record, corresponding to a FTS3 table
- FTS3 table is an SQLite virtual table that allows users to perform
full-text searches on a set of documents. The most common (and effective)
way to describe full-text searches is "what Google, Yahoo and Altavista do
with documents placed on the World Wide Web". Users input a term, or
series of terms, perhaps connected by a binary operator or grouped together
into a phrase, and the full-text query system finds the set of documents
that best matches those terms considering the operators and groupings the
user has specified. See @http://sqlite.org/fts3.html
- any record which inherits from this class must have only sftUTF8Text
(RawUTF8) fields
- this record has its fID: integer property which may be published
as DocID, to be consistent with SQLite3 praxis, and reflect that it
points to an ID of another associated TSQLRecord
- a good approach is to store your data in a regular TSQLRecord table, then
store your text content in a separated FTS3 table, associated to this
TSQLRecord table via its ID/DocID
- the ID/DocID property can be set when the record is added, to retrieve any
associated TSQLRecord (note that for a TSQLRecord record,
the ID property can't be set at adding, but is calculated by the engine)
- static tables don't handle TSQLRecordFTS3 classes
- by default, the FTS3 engine ignore all characters >= #80, but handle
low-level case insentivity (i.e. 'A'..'Z') so you must keep your
request with the same range for upper case
- by default, the "simple" tokenizer is used, but you can inherits from
TSQLRecordFTS3Potter class if you want a better English matching, using
the Porter Stemming algorithm - see @http://sqlite.org/fts3.html#tokenizer
- in order to make FTS3 queries, use regular queries, with the MATCH operator:
! var IDs: TIntegerDynArray;
! if OneFieldValues(TSQLMyFTS3Table,'ID','text MATCH "linu*"',IDs) then
! // you've all matching IDs in IDs[] }
TSQLRecordFTS3 = class(TSQLRecord)
public
{{ optimize the FTS3 virtual table
- this causes FTS3 to merge all existing index b-trees into a single large
b-tree containing the entire index. This can be an expensive operation,
but may speed up future queries. See @http://sqlite.org/fts3.html#section_1_2
- this method must be called server-side
- returns TRUE on success }
class function OptimizeFTS3Index(Server: TSQLRestServer): boolean;
end;
/// this base class will create a FTS3 table using the Porter Stemming algorithm
// - see @http://sqlite.org/fts3.html#tokenizer
TSQLRecordFTS3Potter = class(TSQLRecordFTS3);
I think this could help you using FTS3.
]]>procedure TSQLRequest.Close;
begin
if Request=0 then exit;
repeat
sqlite3_finalize(fRequest);
fRequest := sqlite3_next_stmt(RequestDB,0);
if fRequest = 0 then break;
until not (sqlite3_finalize(fRequest) in [SQLITE_OK,SQLITE_ABORT]);
end;
I would be great if a Free text searching option could be incorporated into your ORM framework.
Indeed, but since I didn't use FTS3 myself, I didn't put dedicated methods yet.
I'll try to add some methods for using FTS3, to avoid writing SQL statements.
Also seem to have found a problem with sample03 (pipes) - The server part can't find unit2.pas
These files were indeed missing the in the "03 - NamedPipe Client-Server" sample.
I've uploaded to http://synopse.info/fossil
And they will be integrated to the next release (1.9).
SQL := FormatUTF8('INSERT INTO fts3index(title, notes) VALUES(''%'', ''%'');',[NameEdit.Text,QuestionMemo.Text]);
About your problem.
I took a look at the source code:
destructor TSQLDataBase.Destroy;
var S: TSQLHandle;
begin
if DB<>0 then
try
Rollback; // any unfinished transaction is rollbacked
finally
{ Applications should finalize all prepared statements and close all BLOB handles
associated with the sqlite3 object prior to attempting to close the object }
repeat
S := sqlite3_next_stmt(DB,0); // 0: get first prepared statement for DB
if S=0 then
break;
// if code was correctly protected with try/finally, as in
// TSQLDataBase.Execute() and TSQLRequest.Execute(), we should never go here
assert(false,FileName); // debug purpose, never trigered on production
until not (sqlite3_finalize(S) in [SQLITE_OK,SQLITE_ABORT]);
// close the database object
sqlite3_close(DB);
fDB := 0;
end;
DeleteCriticalSection(fLock);
fCache.Free;
inherited;
end;
As it states, there is a statement which has not been closed.
Since it shouldn't appear if you use the framework right, I've an aggressive Assert() here.
My ExecuteAll methods just looks fine:
procedure TSQLRequest.ExecuteAll;
begin
if RequestDB=0 then
raise ESQLException.Create(0,SQLITE_CANTOPEN);
try
repeat
repeat
until Step<>SQLITE_ROW; // all steps of this statement
until PrepareNext=SQLITE_DONE; // all statements
finally
Close; // always release statement
end;
end;
And the ExecuteJSON() calls this code:
procedure TSQLRequest.Execute(aDB: TSQLHandle; const aSQL: RawUTF8; JSON: TStream;
Expand: boolean=false);
...
try
// prepare the SQL request
if aSQL<>'' then // if not already prepared, reset and bound by caller
Prepare(aDB,aSQL); // will raise an ESQLException on error
...
finally
if aSQL<>'' then
Close; // always release statement (if not prepared and closed by caller)
W.Free;
end;
end;
So I guess both methods are implemented correctly.
If you can debug your code in step-by-step,
There could be some not closed statement when using FTS3. Error should be in the SQLite3 engine itself.
I think you can just comment the following line in SQLite3.pas:
// assert(false,FileName); // debug purpose, never trigered on production
It's safe in all cases.
]]>Adding the following before Application.run; works fine:
TSQLRestServerDB(Form1.Database).DB.GetTableNames(TableNamesAtCreation);
if FindRawUTF8(TableNamesAtCreation, 'fts3index') = -1 then
TSQLRestServerDB(Form1.Database).EngineExecuteAll(FormatUTF8(
'CREATE VIRTUAL TABLE fts3index USING fts3(%, %);',['title', 'notes']));
The problem is if I change the add and find buttons onclick event as below I get an assertion failure at line1935 of SQLite3.pas when closing the app. Can you suggest what I'm doing wrong?
procedure TForm1.AddButtonClick(Sender: TObject);
var
SQL: rawUTF8;
begin
SQL := FormatUTF8('INSERT INTO fts3index(title, notes) VALUES(''%'', ''%'');',
[StringToUTF8(NameEdit.Text), StringToUTF8(QuestionMemo.Text)]);
TSQLRestServerDB(Database).EngineExecuteAll(SQL);
end;
procedure TForm1.FindButtonClick(Sender: TObject);
begin
questionmemo.Text := TSQLRestServerDB(Database).DB.ExecuteJSON(
'SELECT * FROM fts3index WHERE fts3index MATCH '''+NAMEEDIT.Text+''';');
end;
I would be great if a Free text searching option could be incorporated into your ORM framework.
Also seem to have found a problem with sample03 (pipes) - The server part can't find unit2.pas
Thanks
]]>