#1 2011-02-28 19:31:53

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,660
Website

BATCH sequences for adding/updating/deleting records

I've added some new TSQLRestClientURI methods to implement BATCH sequences to speed up database modifications: after a call to BatchStart, database modification statements are added to the sequence via BatchAdd/BatchUpdate/BatchDelete, then all statments are sent as once to the remote server via BatchSend - this is MUCH faster than individual calls to Add/Update/Delete in case of a slow remote connection (typically HTTP over Internet).

Normal Add/Update/Delete method calls require a back and forth flow to then from the remote server.
In case of a remote connexion via the Internet (or a slow network), you could have some 100 ms of latency: it's just the "ping" timing, i.e. the time spent for your IP packet to go to the server, then back to you.
If you are making a number of such calls (e.g. add 1000 records), you'll have 100*1000 ms = 100 s = 1:40 min just because of this network latency!

The BATCH sequence allows you to regroup those statements into just ONE remote call.
Internally, it builds a JSON stream, then post this stream at once to the server.
Then the server answer at once, after having performed the modifications.

Here are the new methods in TSQLRestClientURI:

(...)
    {{ begin a BATCH sequence to speed up huge database change
     - each call to normal Add/Update/Delete methods will create a Server request,
       therefore can be slow (e.g. if the remote server has bad ping timing)
     - start a BATCH sequence using this method, then call BatchAdd() BatchUpdate()
       or BatchDelete() methods to make some changes to the database
     - when BatchSend will be called, all the sequence transactions will be sent
       as one to the remote server, i.e. in one URI request
     - if BatchAbort is called instead, all pending BatchAdd/Update/Delete
       transactions will be aborted, i.e. ignored
     - expect one TSQLRecordClass as parameter, which will be used for the whole
       sequence (you can't mix classes in the same BATCH sequence)
     - return TRUE on sucess, FALSE if aTable is incorrect or a previous BATCH
       sequence was already initiated
     - should normally be used inside a Transaction block: there is no automated
       TransactionBegin..Commit/RollBack generated in the BATCH sequence }
    function BatchStart(aTable: TSQLRecordClass): boolean;
    /// create a new member in current BATCH sequence
    // - work in BATCH mode: nothing is sent to the server until BatchSend call
    // - returns the corresponding index in the current BATCH sequence, -1 on error
    // - if SendData is true, content of Value is sent to the server as JSON
    // - if Value is TSQLRecordFTS3, Value.ID is stored to the virtual table
    // - Value class MUST match the TSQLRecordClass used at BatchTransactionBegin
    // - BLOB fields are NEVER transmitted here, even if ForceBlobTransfert=TRUE
    function BatchAdd(Value: TSQLRecord; SendData: boolean): integer;
    /// update a member in current BATCH sequence
    // - work in BATCH mode: nothing is sent to the server until BatchSend call
    // - returns the corresponding index in the current BATCH sequence, -1 on error
    // - Value class MUST match the TSQLRecordClass used at BatchTransactionBegin
    // - BLOB fields are NEVER transmitted here, even if ForceBlobTransfert=TRUE
    function BatchUpdate(Value: TSQLRecord): integer;
    /// delete a member in current BATCH sequence
    // - work in BATCH mode: nothing is sent to the server until BatchSend call
    // - returns the corresponding index in the current BATCH sequence, -1 on error
    // - deleted record class is the TSQLRecordClass used at BatchTransactionBegin
    function BatchDelete(ID: integer): integer;
    /// retrieve the current number of pending transactions in the BATCH sequence
    // - every call to BatchAdd/Update/Delete methods increases this count
    function BatchCount: integer;
    {{ execute a BATCH sequence started by BatchStart method
     - send all pending BatchAdd/Update/Delete statements to the remote server
     - URI is 'ModelRoot/TableName/0' with POST method
     - will return the URI Status value, i.e. 200 OK on success
     - a dynamic array of integers will be created in Results,
       containing all ROWDID created for each BatchAdd call, 200 for all
       successfull BatchUpdate/BatchDelete, or 0 on error
      - any error during server-side process MUST be checked inside Results[]
       (the main URI Status is 200 if no communication error, not if all
       statements in the BATCH sequence was successfull }
    function BatchSend(var Results: TIntegerDynArray): integer;
    {{ abort a BATCH sequence started by BatchStart method
     - in short, nothing is sent to the remote server, and current BATCH sequence
       is closed }
    procedure BatchAbort;
(...)

Since the statements are performed at once, you can't receive the result (e.g. the ID of the added row) on the same time as you append the request to the BATCH sequence.
So you'll have to wait for BatchSend method to retrieve all results, at once, in a dynamic array of integer.

As stated in the above comment, it's also a good idea to use a Transaction for the whole process.

Here is a typical use:

// start the transaction
ClientDist.TransactionBegin(TSQLRecordPeople);
try
  // start the BATCH sequence
  Check(ClientDist.BatchStart(TSQLRecordPeople));
  // delete some elements
  for i := 0 to n-1 do
    Check(ClientDist.BatchDelete(IntArray[i])=i);
  // update some elements
  nupd := 0;
  for i := 0 to aStatic.Count-1 do
  if i and 7<>0 then begin // not yet deleted in BATCH mode
    Check(ClientDist.Retrieve(aStatic.ID[i],V));
    V.YearOfBirth := 1800+nupd;
    Check(ClientDist.BatchUpdate(V)=nupd+n);
    inc(nupd);
  end;
  // add some elements
  V.LastName := 'New';
  for i := 0 to 1000 do begin
    V.FirstName := RandomUTF8(10);
    V.YearOfBirth := i+1000;
    Check(ClientDist.BatchAdd(V,true)=n+nupd+i);
  end;
  // send the BATCH sequences to the server
  Check(ClientDist.BatchSend(Results)=200);
  // now Results[] contains the results of every BATCH statement...
  Check(Length(Results)=n+nupd+1001);
  // Results[0] to Results[n-1] should be 200 = deletion OK
  // Results[n] to Results[n+nupd-1] should be 200 = update OK
  // Results[n+nupd] to Results[high(Results)] are the IDs of each added record
  for i := 0 to n-1 do
    Check(not ClientDist.Retrieve(IntArray[i],V),'BatchDelete');
    for i := 0 to high(Results) do
      if i<nupd+n then
        Check(Results[i]=200) else begin
        Check(Results[i]>0);
        ndx := aStatic.IDToIndex(Results[i]);
        Check(ndx>=0);
        with TSQLRecordPeople(aStatic.Items[ndx]) do begin
          Check(LastName='New','BatchAdd');
          Check(YearOfBirth=1000+i-nupd-n);
         end;
       end;
  // in case of success, apply the Transaction
  ClientDist.Commit;
except
  // In case of error, Rollback the Transaction
  ClientDist.RollBack;
end;

Here is a typical JSON stream sent to the server:

{"People":["DELETE":2,"DELETE":13,"DELETE":24,   
   (...)  all DELETE actions
  ,"DELETE":11010,
  "PUT":{"RowID":3,"FirstName":"Sergei1","LastName":"Rachmaninoff","YearOfBirth":1800,"YearOfDeath":1943},
  "PUT":{"RowID":4,"FirstName":"Alexandre1","LastName":"Dumas","YearOfBirth":1801,"YearOfDeath":1870},
   (...)  all PUT = update actions
  "PUT":{"RowID":11012,"FirstName":"Leonard","LastName":"da Vinçi","YearOfBirth":9025,"YearOfDeath":1519},
  "POST":{"FirstName":"‚@•Å"H†m£ g","LastName":"New","YearOfBirth":1000,"YearOfDeath":1519},
  "POST":{"FirstName":"@…,KA½à#¶f","LastName":"New","YearOfBirth":1001,"YearOfDeath":1519},
   (...)  all POST = add actions
  "POST":{"FirstName":"+tqCXW3Â\"","LastName":"New","YearOfBirth":2000,"YearOfDeath":1519}
  ]}

Here is a typical JSON stream receiver from the server, on success:

[200,200,...]

All the JSON generation (client-side) and parsing (server-side) is very optimized and very fast.
All timing is now spent into the database engine itself.

Available in the source code repository.
See http://synopse.info/fossil/info/9571795732424598a

Offline

#2 2011-03-27 10:21:44

edwinsn
Member
Registered: 2010-07-02
Posts: 1,218

Re: BATCH sequences for adding/updating/deleting records

This is just GREAT!


Delphi XE4 Pro on Windows 7 64bit.
Lazarus trunk built with fpcupdelux on Windows with cross-compile for Linux 64bit.

Offline

#3 2011-12-14 14:43:32

array81
Member
From: Italy
Registered: 2010-07-23
Posts: 411

Re: BATCH sequences for adding/updating/deleting records

So there is not way to use this functions with blob fields? Do I need use "classic" way?

Offline

#4 2011-12-14 15:01:42

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,660
Website

Re: BATCH sequences for adding/updating/deleting records

For BLOBs, you'll have to use the "classic" way.

Batch sequences uses JSON serialization, so BLOBs will be serialized as Base64, which is not so optimal.

Offline

#5 2011-12-14 15:19:17

array81
Member
From: Italy
Registered: 2010-07-23
Posts: 411

Re: BATCH sequences for adding/updating/deleting records

OK, thanks

Offline

#6 2012-04-18 15:10:44

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,660
Website

Re: BATCH sequences for adding/updating/deleting records

Now TSQLRestClientURI BATCH sequences methods will allow to mix several TSQLRecord in its sequence of BatchAdd/BatchUpdate/BatchDelete calls - if initiated as BatchStart(nil).

See http://synopse.info/fossil/info/6c97b80ed9

Offline

Board footer

Powered by FluxBB