#1 2016-09-30 08:11:57

hnb
Member
Registered: 2015-06-15
Posts: 291

Master/Slave replications - tips :)

I'd like to start new topic, maybe it will help someone. The discussion starts here:

http://synopse.info/forum/viewtopic.php … 890#p21890

My application keeps almost full functionality when offline (except printing wink - for that the server is needed). I made my own master/slave replication which is fully functional in offline mode (each slave can add or modify records even while offline). Each table on the client side (let's call the normal table the "main table") has one or more special "cache" tables (each "cache" table has exactly the same columns - technically each "cache" table is a descendant of the main table). I have my own extensions for mORMot for that. On the server side each cache table is simply converted into the original table (and each record from a "cache" table is converted and sent back to the client as a record for the "main table"; the process is monitored/protected, so even if I lose the connection at a critical moment of the sync, everything is tracked properly thanks to a unique GUID for each action). The core code for the improved master/slave replication (rejected by ab tongue):

uses
  mORMot, SynCommons, SysUtils;

type
  TSQLRestBatchHelper = class helper for TSQLRestBatch
  public
    function Add(Value: TSQLRecord; SendData: boolean; ForceRecordClass: TSQLRecordClass; ForceID: boolean=false;
      const CustomFields: TSQLFieldBits=[]; DoNotAutoComputeFields: boolean=false): integer; overload;
  end;

  { TSQLRestHelper }

  TSQLRestHelper = class helper for TSQLRest
  public
    function BatchSend(Batch: TSQLRestBatch; const Data: RawUTF8; var Results: TIDDynArray): integer; overload;
    function BatchSend(Batch: TSQLRestBatch; const Data: RawUTF8): integer; overload;
    function Update(Value: TSQLRecord; ForceRecordClass: TSQLRecordClass;
      const CustomFields: TSQLFieldBits=[]; DoNotAutoComputeFields: boolean=false): boolean; overload;
  end;

implementation

{ TSQLRestHelper }

function TSQLRestHelper.BatchSend(Batch: TSQLRestBatch; const Data: RawUTF8;
  var Results: TIDDynArray): integer;
var
  LData: RawUTF8;
begin
  LData := Copy(Data, 1, Length(Data)); // private copy: EngineBatchSend() expects data it may modify
  try
    if Batch <> nil then
      result := EngineBatchSend(Batch.Table,LData,Results,Batch.Count)
    else
      result := EngineBatchSend(nil,LData,Results,0)
  except
    on Exception do // e.g. from TSQLRestServer.EngineBatchSend()
      result := HTTP_SERVERERROR;
  end;
end;

function TSQLRestHelper.BatchSend(Batch: TSQLRestBatch; const Data: RawUTF8
  ): integer;
var
  Results: TIDDynArray;
begin
  Result := BatchSend(Batch,Data,Results);
end;

function TSQLRestHelper.Update(Value: TSQLRecord;
  ForceRecordClass: TSQLRecordClass; const CustomFields: TSQLFieldBits;
  DoNotAutoComputeFields: boolean): boolean;
var
  DefaultClass: TSQLRecordClass;
begin
  Result := False;

  if (self=nil) or (Value=nil) or (ForceRecordClass=nil) then
    exit;

  if not PSQLRecordClass(Value)^.InheritsFrom(ForceRecordClass) then
    exit;

  DefaultClass := PSQLRecordClass(Value)^;
  PSQLRecordClass(Value)^ := ForceRecordClass;
  result := Self.Update(Value, CustomFields, DoNotAutoComputeFields);
  PSQLRecordClass(Value)^ := DefaultClass;
end;

{ TSQLRestBatchHelper }

function TSQLRestBatchHelper.Add(Value: TSQLRecord; SendData: boolean;
  ForceRecordClass: TSQLRecordClass; ForceID: boolean;
  const CustomFields: TSQLFieldBits; DoNotAutoComputeFields: boolean): integer;
var
  DefaultClass: TSQLRecordClass;
begin
  result := -1;
  if (self=nil) or (Value=nil) or (fBatch=nil) or (ForceRecordClass=nil) then
    exit;

  if not PSQLRecordClass(Value)^.InheritsFrom(ForceRecordClass) then
    exit;

  DefaultClass := PSQLRecordClass(Value)^;
  PSQLRecordClass(Value)^ := ForceRecordClass;
  result := Self.Add(Value, SendData, ForceID, CustomFields, DoNotAutoComputeFields);
  PSQLRecordClass(Value)^ := DefaultClass;
end;

The most important parameter is "ForceRecordClass" (to convert a cache table into the main table). With a little work my mechanism can be automated for any ORM master/slave replication. Behind the scenes, batching is used extensively. I also have an additional batch parser to extract records from a batch, like "PUT@table_name,{...}". The topic is complicated but the end effect is amazing. The client/slave can be offline even for a very long time (a few days?) and everything is synchronized very well, thanks to the "cache" tables idea. In addition it should be possible to manage policies for conflicts (for now I have one rule: "first client wins").

The key code to start any work is presented above. smile
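For illustration only, parsing a batch entry like "PUT@table_name,{...}" might look like this (a Python sketch, not the author's code; the real mORMot batch format has more verbs and options, so treat the regex as an assumption):

```python
# Illustrative sketch: split a batch entry such as 'PUT@table_name,{...}'
# into its verb, table name and JSON body.
import json
import re

# assumed entry shape: VERB@table[,{json body}] - a simplification
_BATCH = re.compile(r'^(POST|PUT|DELETE)@(\w+)(?:,(\{.*\}))?$', re.S)

def parse_batch_entry(entry):
    """Return (verb, table, decoded body or None) for one batch entry."""
    m = _BATCH.match(entry)
    if not m:
        raise ValueError("not a batch entry: %r" % entry)
    verb, table, body = m.groups()
    return verb, table, json.loads(body) if body else None
```
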


best regards,
Maciej Izak

Offline

#2 2016-10-04 12:21:57

AOG
Member
Registered: 2014-02-24
Posts: 490

Re: Master/Slave replications - tips :)

Hi Maciej,

I would be very very interested in these offline features and auto-sync.

In my case, a lot of data-acquisition systems are running with mORMot. They store data locally and must sync to a remote server.
Any intelligence to handle this (without me reinventing the wheel) would be a very welcome addition.

And for Android, it is nearly a natural must-have feature.

The core code for improved master/slave replication (rejected by ab)

Is this a redundant feature, already available? Or is there another reason? Any hint would be welcome.

Offline

#3 2016-10-05 03:53:54

edwinsn
Member
Registered: 2010-07-02
Posts: 1,215

Re: Master/Slave replications - tips :)

Thanks for sharing! I'm also very interested in this feature, and I'm wondering why @ab rejected it.

Such an offline feature is especially useful for a Point of Sale system, where the store front end is connected to the server, but we have to ensure the POS keeps working when it's disconnected from the server.

Just my 2 cents.


Delphi XE4 Pro on Windows 7 64bit.
Lazarus trunk built with fpcupdelux on Windows with cross-compile for Linux 64bit.

Offline

#4 2016-10-05 06:50:31

hnb
Member
Registered: 2015-06-15
Posts: 291

Re: Master/Slave replications - tips :)

Comment from ab: http://synopse.info/forum/viewtopic.php … 877#p17877

so my "two way" master/slave replication with offline functionality is developed outside the mORMot sources tongue. Anyway, the code attached above is the key to start any serious work wink


best regards,
Maciej Izak

Offline

#5 2016-10-05 10:10:41

AOG
Member
Registered: 2014-02-24
Posts: 490

Re: Master/Slave replications - tips :)

Anyway, thanks for sharing your code Maciej !

Offline

#6 2016-10-05 11:26:35

oz
Member
Registered: 2015-09-02
Posts: 95

Re: Master/Slave replications - tips :)

After dealing with replication/synchronization scenarios for 20+ years now, there is one piece of advice I can offer anyone from my experience:
This kind of functionality is very, very domain specific and depends a hell of a lot on your use case. If you have to support such offline read/write services in your application, then don't start implementing this functionality in your DB or ORM layer. Do it in your business or application layers. Don't think in tables, think in objects.
Replication at the DB/ORM layer is painful in my experience. Of course you can do it, but sooner or later you will face nasty little problems which will require dirty code hacking, polluting any domain model.
Furthermore, mORMot is not a very good candidate for such offline replication services at the DB/ORM layer because of its numerical ID design, imho. But from my point of view this is not a limitation, because:
1. It's not a good idea to start building sync/replication services at the DB/ORM layer.
2. mORMot is absolutely great in its DDD capabilities, and that's the right place to start building sync services.

I see the possibility of creating some kind of generic object-based sync/replication services for various scenarios with the help of mORMot's DDD toolbox, but that's a huge and difficult task.

@hnb:
I have some questions about your solution. Your model is 1 Master, N Slaves, right? If so, how do you deal with ID conflicts? Do you use some kind of Master/Slave ID converter, fixed-size ID ranges (e.g. Slave 1: 0-1M, Slave 2: 1M-2M, ...), or some kind of custom ID handling based on GUIDs or other non-numerical field types?

Offline

#7 2016-10-05 19:23:05

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,205
Website

Re: Master/Slave replications - tips :)

Do not forget about TSynUniqueIdentifierGenerator, which allows creating increasing numerical IDs for a whole system, with each node having its own IDs.
It helps to sync and replicate several uncoupled DBs into one bigger one.
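For context, such a generator composes each ID from a timestamp, a per-node identifier and a counter, so IDs keep increasing system-wide while two nodes can never collide. A generic, snowflake-style sketch of that idea (Python, purely illustrative - the field widths here are NOT mORMot's actual bit layout):

```python
# Snowflake-style increasing IDs: high bits = timestamp, then a node
# identifier, then a rolling counter. Field widths are illustrative only.
import itertools
import time

def make_id_generator(node_id, node_bits=16, counter_bits=15):
    """Return a callable producing increasing numerical unique IDs."""
    assert 0 <= node_id < (1 << node_bits)
    counter = itertools.count()
    def next_id():
        c = next(counter) & ((1 << counter_bits) - 1)  # wraps; real code guards this
        ts = int(time.time())
        return (ts << (node_bits + counter_bits)) | (node_id << counter_bits) | c
    return next_id
```

Because the timestamp occupies the high bits, IDs generated anywhere keep increasing over time, and the node field keeps two nodes from colliding - the property useful for merging several uncoupled DBs.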

Offline

#8 2016-10-06 11:39:18

hnb
Member
Registered: 2015-06-15
Posts: 291

Re: Master/Slave replications - tips :)

@ab: TSynUniqueIdentifierGenerator might be good for new projects, but this solution is not perfect (for example, it is impossible to use TSynUniqueIdentifierGenerator for large existing systems).

@oz: I have 1 Master, N Slaves for the described scenario (but in the background I have a much more complicated architecture with N Masters, which is managed outside mORMot by other tools). The idea for "1 Master - N Slaves" is simple and can be improved by the ORM system. What we have:

* Android clients (with a local SQLite3 DB)
* 1 mORMot server
* 1 PostgreSQL DB
* an existing big project developed for more than 10 years with 500+ tables (so I can't start from scratch - structures like TSynUniqueIdentifierGenerator are useless for me).

The first step is model:

  TSQLmaster = class(TSQLRecord)
    // ...
  end;
  TCACHE_master = class(TSQLmaster) // could be generated by ORM
  protected 
    fiid: RawUTF8;
    fcreation_time: TDateTime;
  published  
    property iid: RawUTF8 index 38 read fiid write fiid;
    property creation_time: TDateTime read fcreation_time write fcreation_time;
  end; 

  TSQLdetail = class(TSQLRecord)
    // ...
  end;
  TCACHE_detail = class(TSQLdetail) // could be generated by ORM
  protected 
    fiid: RawUTF8;
    fcreation_time: TDateTime;
  published  
    property iid: RawUTF8 index 38 read fiid write fiid;
    property creation_time: TDateTime read fcreation_time write fcreation_time;
  end;
  TCACHE_CACHE_detail = class(TCACHE_detail); // could be generated by ORM

In the presented model we have 5 tables: master, CACHE_master, detail, CACHE_detail, CACHE_CACHE_detail.
How is it handled by the client? For the client, insertion of new records into the master/detail tables is forbidden. We have a simple schema:

* an insert of a new detail record for a correlated master record means a hidden insertion into the CACHE_detail table
* an insert of a new master record means a hidden insertion into the CACHE_master table
* an insert of a new detail record for a correlated CACHE_master record means a hidden insertion into the CACHE_CACHE_detail table

Each insert above also means assigning iid and creation_time:

  newrecord.iid := GUIDToRawUTF8(RandomGUID);
  newrecord.creation_time := DBServerDateTime;  

Thanks to this solution I can receive new data from the server without problems, and I don't need to send records from the CACHE_* tables at the same time (but that scenario is also possible smile).
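The routing rules and the iid/creation_time stamping above can be sketched language-neutrally (Python here, purely illustrative - table names and helpers mirror the post, nothing here is the author's code):

```python
# Illustrative sketch: route offline inserts to the proper CACHE_* table
# and stamp each new record with its replication identity.
import uuid
from datetime import datetime, timezone

def route_insert(record_table, parent_table=None):
    """Return the hidden cache table an offline insert is redirected to."""
    if record_table == "master":
        return "CACHE_master"
    if record_table == "detail":
        # a detail row may hang off a synced master or an unsynced cached one
        if parent_table == "CACHE_master":
            return "CACHE_CACHE_detail"
        return "CACHE_detail"
    raise ValueError("direct insertion into %s is forbidden offline" % record_table)

def stamp(record):
    """Assign the replication identity, mirroring iid/creation_time."""
    record["iid"] = str(uuid.uuid4())
    record["creation_time"] = datetime.now(timezone.utc).isoformat()
    return record
```
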

When the client would like to send data (or a few parts of it), I have a special record for that purpose:

  TSYNC_CACHE = class(TSQLRecord)  
  protected 
    fiid: RawUTF8;
    fcreation_time: TDateTime;
    { ... 
      structures (for example dynamic arrays) to handle CACHE_* tables and batching 
    }
  published  
    property iid: RawUTF8 index 38 read fiid write fiid;
    property creation_time: TDateTime read fcreation_time write fcreation_time;
    { ...
      structures (for example dynamic arrays) to handle CACHE_* tables and batching
    }
  end;

On the server side I have a local SQLite3 file DB to track all iid values, which is the key to proper operation; in some cases mORMot can send the same data twice (or even more times), or even corrupted data, but that is a different topic. Anyway, TSYNC_CACHE and each item inside it needs an iid. When the server has done all the insertions described in TSYNC_CACHE (including insertions into the iid table), it sends TSYNC_CACHE back to the client, but with the new correct IDs and optionally with new values for some columns of each record inside TSYNC_CACHE.

Also very important for proper IDs is the right configuration for the regular tables on the server side:

  procedure AddEngineComputeId(AClass: TSQLRecordClass);
  var
    LRestStorage: TSQLRestStorageExternal;
  begin
    LRestStorage := TSQLRestStorageExternal.Instance(AClass, AServer);
    LRestStorage.EngineAddForcedID := 0;
    // OnEngineAddComputeID calls nextval on a PostgreSQL sequence
    LRestStorage.OnEngineAddComputeID := aRestServer.OnEngineAddComputeID;
  end;

  procedure AddEngineIgnoreId(AClass: TSQLRecordClass); // for simple inserts
  var
    LRestStorage: TSQLRestStorageExternal;
  begin
    LRestStorage := TSQLRestStorageExternal.Instance(AClass, AServer);
    LRestStorage.EngineAddForcedID := 1;
  end;

Thanks to mORMot and metaclasses, manipulation between the CACHE_* and original tables on the client/server side is easy. Everything described above can be handled at the ORM level in a very flexible way with many policies. The topic is complex and it is very hard to describe all its aspects in one post. Thanks to the CACHE_* tables, batching and the iid table (and mORMot of course) I have implemented full CRUD + master/slave replication + online/offline modes smile


best regards,
Maciej Izak

Offline

#9 2016-10-10 21:31:12

warleyalex
Member
From: Sete Lagoas-MG, Brasil
Registered: 2013-01-20
Posts: 250

Re: Master/Slave replications - tips :)

I just had a closer look at this post.

It seems there are some interesting concepts about sync that I think could be applied here.

#001. Each client (the mobile app) must decide when to sync with the server, e.g. when a user clicks "DO_SYNC Refresh". The app will sync by calling a method DO_SYNC(). This method will call syncFromServer() first and then syncToServer().  The app will always get new data FROM the server first and then send new data TO the server.

#002. The syncFromServer() method on the client will ask the server for all data changes since SERVERCOUNTER_LASTSYNC. The server will answer this request by fetching all objects with object.counter_lastupdate > SERVERCOUNTER_LASTSYNC and sending them to the client. The server will also send its current counter to the client.
The client will store this new value in SERVERCOUNTER_LASTSYNC (because this is the server counter for which the
client has received all data changes).

Example Sync objects of type "booking" FROM server TO client (DOWNLOAD):
GET http://127.0.0.1/root/booking/list/12345

12345 = client.SERVERCOUNTER_LASTSYNC (counter value on server for start of sync)

JSON returned FROM server TO client:

{
	"objects": [{
			"guid": "65E5B3EB-72B3-4768-B86D-B0D07DF81C3E",
			"pk": "2014-05-10",
			"name": "apples",
			"value": 9,
			"timestamp_created": "2014-03-20 13:15:00",
			"timestamp_lastupdate": "2014-03-20 13:15:00"
		}, {
			"guid": "77310491-2B68-4EF2-AD8A-5DA75D0BB168",
			"pk": "2014-06-12",
			"name": "oranges",
			"value": 3,
			"timestamp_created": "2014-03-20 12:18:08",
			"timestamp_lastupdate": "2014-03-20 15:17:23"
		}, {
			"guid": "93F58F2C-8837-4CE3-A3DB-CA0AD01632B7",
			"pk": "2014-07-18",
			"name": "cherries",
			"value": 6,
			"timestamp_created": "2014-03-20 13:16:32",
			"timestamp_lastupdate": "2014-03-20 15:17:58"
		}
	],
	"result": {
		"statuscode": "OK",
		"SERVERCOUNTER": 453
	}
}

#003. The syncToServer() method on the client will send all its data changes since COUNTER_LASTSYNC TO the server.
The client will fetch its objects with object.counter_lastupdate > COUNTER_LASTSYNC and send them to the server.
When the server confirms it has received the data (result = ok), the client will update its COUNTER_LASTSYNC with
the current value of its counter.

Example Sync objects of type "booking" FROM client TO server (UPLOAD):

POST http://127.0.0.1/root/booking/update

JSON in POST FROM client TO server: 

{
	"objects": [{
			"guid": "65E5B3EB-72B3-4768-B86D-B0D07DF81C3E",
			"pk": "2014-05-11",
			"name": "apples",
			"value": 9,
			"timestamp_created": "2014-03-20 13:15:00",
			"timestamp_lastupdate": "2014-03-20 13:15:00"
		}, {
			"guid": "77310491-2B68-4EF2-AD8A-5DA75D0BB168",
			"pk": "2014-06-12",
			"name": "oranges",
			"value": 7,
			"timestamp_created": "2014-03-20 12:18:08",
			"timestamp_lastupdate": "2014-03-20 15:23:25"
		}, {
			"guid": "93F58F2C-8837-4CE3-A3DB-CA0AD01632B7",
			"pk": "2014-07-18",
			"name": "cherries",
			"value": 9,
			"timestamp_created": "2014-03-20 13:16:32",
			"timestamp_lastupdate": "2014-03-20 15:19:43"
		}
	]
}

JSON returned FROM server TO client : 
{
	"result": {
		"statuscode": "OK",
		"SERVERCOUNTER": 456
	}
}

The idea is that EACH DEVICE (client) will keep track of its sync status using these 2 parameters:
• COUNTER_LASTSYNC: value of the local counter of the client on the last sync TO the server (upload)
• SERVERCOUNTER_LASTSYNC: value of the counter on the server, on the last sync FROM the server (download)
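The counter bookkeeping from #001-#002 can be sketched as follows (Python, purely illustrative; the in-memory Server stands in for the REST endpoints, and only the download direction is shown):

```python
# Illustrative sketch of counter-based sync: the server numbers every write
# with a global counter, and a client downloads only what changed since its
# last recorded watermark (SERVERCOUNTER_LASTSYNC).

class Server:
    def __init__(self):
        self.counter = 0    # global change counter
        self.objects = {}   # guid -> (counter_lastupdate, payload)

    def write(self, guid, payload):
        self.counter += 1
        self.objects[guid] = (self.counter, payload)

    def changes_since(self, since):
        """Objects changed after 'since', plus the current SERVERCOUNTER."""
        objs = [(g, p) for g, (c, p) in self.objects.items() if c > since]
        return objs, self.counter

class Client:
    def __init__(self, server):
        self.server = server
        self.servercounter_lastsync = 0   # download watermark
        self.data = {}

    def sync_from_server(self):
        objs, current = self.server.changes_since(self.servercounter_lastsync)
        for guid, payload in objs:
            self.data[guid] = payload
        self.servercounter_lastsync = current   # all changes up to here received
```
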

I wish I could implement the Sync using this approach.

Offline

#10 2016-10-11 12:20:18

oz
Member
Registered: 2015-09-02
Posts: 95

Re: Master/Slave replications - tips :)

warleyalex wrote:

I just had a closer look at this post

It seems there is some interesting concepts about sync that I think could be applied to.
...

I just had a quick look at the article...
This looks like an interesting starting point for building fully offline-aware master/slave replication. The basic concept is similar to what we have been doing over here (in a legacy Java application) for years.
The "difficult" part would be implementing the ID clash logic regarding foreign keys for tables having any kind of relations (FK constraints).

I'd recommend NOT adding "row_guid", "row_timestamp" and "sentToServerOK" columns to every single table participating in replication, but introducing ONE single new table which would in fact act as a changelog:

SyncChangeLog
  ID 
  TableName/TableIndex
  RowID
  RowTimestamp
  ActionFlag (insert/update/delete)
  SentToServerOk (only used on slave-side)

Every single ORM write operation (insert/update/delete) results in adding a new row to this table (if there's no entry with the same TableName/TableIndex and RowID already pending). I think it should be possible to use/extend mORMot's audit-trail features for this.

The great benefit of using such a single "changelog" is that it requires only one single SELECT statement to get/send all ORM modifications from/to the master. Furthermore, there is no need to care about referential integrity in the INSERT statement order, because sorting by TSQLRecordSyncChangelog.ID (autoinc) will always result in the right SQL statement order. This means that the sync mechanism is decoupled from the DB model, which makes it kind of generic. This whole thing could also easily be extended to handle synchronization of non-DB objects (pictures, uploads, whatever...).
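A minimal sketch of that single-changelog idea (Python, purely illustrative - names mirror the proposed SyncChangeLog columns, nothing here is mORMot code):

```python
# Illustrative sketch: every ORM write appends one changelog row, and
# replaying pending rows in autoincrement-ID order preserves referential
# integrity across all tables without per-table queries.
import itertools

_next_id = itertools.count(1)
changelog = []   # each entry mirrors the proposed SyncChangeLog columns

def log_write(table, row_id, action):
    """Record an insert/update/delete unless the row is already pending."""
    for entry in changelog:
        if entry["table"] == table and entry["row_id"] == row_id \
                and not entry["sent_ok"]:
            return entry   # row already queued for the next sync
    entry = {"id": next(_next_id), "table": table, "row_id": row_id,
             "action": action, "sent_ok": False}
    changelog.append(entry)
    return entry

def pending_in_order():
    """One pass replaces per-table queries: replay in changelog ID order."""
    return sorted((e for e in changelog if not e["sent_ok"]),
                  key=lambda e: e["id"])
```
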

Just my two cents for discussion...

Offline
