1. I am wondering if the slave databases are always in a consistent state. Is it possible that a previously committed transaction on the master, affecting several tables, is only partially replicated at the moment a query from a client reaches the slave?
2. How does locking work while replicating? Does it require a longer write lock when there is a lot of new data, which may starve readers?
Last edited by Leslie7 (2015-07-04 16:57:16)
Offline
1. Replication follows ORM-level calls, so regular SQL transactions are not tracked (e.g. a rollback on the master won't be rolled back on the slave).
The idea is not to use regular transactions, but a TSQLRestBatch, with its own automatic transaction feature, for performance (see the sketch below).
2. Replication is done at ORM level, but in an asynchronous way, so is not impacted by the ORM locking features.
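Here is a minimal sketch of such a batched write on the master, assuming the mORMot 1.18 TSQLRestBatch API and a purely hypothetical TSQLInvoice record class; the third Create parameter asks the server to wrap the queued rows into automatic transactions:

```pascal
uses
  SynCommons,
  mORMot;

type
  // hypothetical record class, for illustration only
  TSQLInvoice = class(TSQLRecord)
  private
    fCustomer: RawUTF8;
    fAmount: Currency;
  published
    property Customer: RawUTF8 read fCustomer write fCustomer;
    property Amount: Currency read fAmount write fAmount;
  end;

procedure WriteInvoicesOnMaster(MasterRest: TSQLRest);
var
  Batch: TSQLRestBatch;
  Invoice: TSQLInvoice;
  Results: TIDDynArray;
  i: integer;
begin
  // third parameter = AutomaticTransactionPerRow: the server wraps the
  // queued inserts into transactions of at most 1000 rows each
  Batch := TSQLRestBatch.Create(MasterRest, TSQLInvoice, 1000);
  try
    Invoice := TSQLInvoice.Create;
    try
      for i := 1 to 10000 do
      begin
        Invoice.Customer := FormatUTF8('customer %', [i]);
        Invoice.Amount := i;
        Batch.Add(Invoice, {SendData=}true); // queued locally, not sent yet
      end;
    finally
      Invoice.Free;
    end;
    // one single round trip: all inserts are sent and committed together
    MasterRest.BatchSend(Batch, Results);
  finally
    Batch.Free;
  end;
end;
```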
Offline
I am reading the docs about this now. But the original questions remain:
1. Does the replication, as currently implemented, guarantee that the slaves are in a consistent state at any given moment when their data can be accessed?
2. If yes, are there any potential performance issues to be aware of, e.g. longer write locks?
These are important questions when designing the architecture.
Offline
Each slave would be in a consistent state for each table, if you use TSQLRestBatch for writing all data on the master, within a transaction.
But in some borderline cases, several tables may theoretically not be in a consistent state on the slave.
Potential issues may appear when slaves have a lot of data to synch (e.g. at first connection), since it may block write operations.
It uses chunks for this initial synch process, so you could customize the number of rows to retrieve in each chunk.
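For instance, the synchronization on a slave could be bounded like this; a minimal sketch, assuming mORMot 1.18's RecordVersionSynchronizeSlave method and a hypothetical TSQLInvoice class with its published TRecordVersion field:

```pascal
uses
  SynCommons,
  mORMot;

type
  // hypothetical record class: the published TRecordVersion field is what
  // makes the table eligible for master/slave replication
  TSQLInvoice = class(TSQLRecord)
  private
    fCustomer: RawUTF8;
    fVersion: TRecordVersion;
  published
    property Customer: RawUTF8 read fCustomer write fCustomer;
    property Version: TRecordVersion read fVersion write fVersion;
  end;

procedure SynchronizeInvoices(SlaveServer: TSQLRestServer; MasterClient: TSQLRest);
begin
  // pull the pending changes from the master in chunks of at most 500 rows,
  // so that each write operation on the slave stays short
  SlaveServer.RecordVersionSynchronizeSlave(TSQLInvoice, MasterClient, 500);
end;
```

Lowering the chunk size shortens each write lock on the slave, at the price of more round trips to the master.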
Offline
This is what I suspected and I think this is a serious issue.
Maybe there is a safer and more balanced way to go about this: instead of versioning individual records, make it possible to identify the transactions (or units of work, whatever makes the changes atomic). Every record changed by a transaction receives the same Transaction_ID, which makes incremental replication possible.
Replication could then proceed transaction by transaction, or, for better performance, each replication step could select only as many regular transactions as can be read and written within a reasonable time. The whole replication may take longer, but it ensures atomic behavior during replication and responsiveness for both master and slaves.
It also seems reasonable to combine this with audit logging, since the same history could serve both purposes. It is much more efficient to compute the size and content of the replication steps in advance, while the change log is being created, so that every step is readily available when the replication actually runs.
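A minimal sketch of what such a change log could look like; the names are purely hypothetical, this is not an existing mORMot feature:

```pascal
uses
  SynCommons,
  mORMot;

type
  // hypothetical change-log table: one row per modified record
  TSQLChangeLog = class(TSQLRecord)
  private
    fTransactionID: Int64; // same value for every record changed by one unit of work
    fTableName: RawUTF8;   // which ORM table was modified
    fRecordID: TID;        // which record was modified
    fJSONValues: RawUTF8;  // serialized new content (empty for a deletion)
  published
    property TransactionID: Int64 read fTransactionID write fTransactionID;
    property TableName: RawUTF8 read fTableName write fTableName;
    property RecordID: TID read fRecordID write fRecordID;
    property JSONValues: RawUTF8 read fJSONValues write fJSONValues;
  end;
```

A replication step would then select all change-log rows with a TransactionID greater than the last replicated one, stop at a transaction boundary once its row budget is reached, and apply them on the slave within a single batch, so the slave is always left at a transaction-consistent point.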
Offline
As far as I know, the Raft protocol (https://raft.github.io/) is currently a common mechanism for doing replication/distribution.
Two open source projects using the Raft spec:
dqlite: https://github.com/CanonicalLtd/dqlite
rqlite: https://github.com/rqlite/rqlite
An interactive, graphical illustration of the Raft consensus protocol:
http://thesecretlivesofdata.com/raft/
Last edited by edwinsn (2018-04-09 15:13:53)
Offline
Yes, Raft is complete, but it was never meant for performance...
On a local network, it reaches less than 100 insertions per second...
This is clearly something different from our Master/Slave replication.
I've found the following implementation interesting: https://github.com/postgrespro/raft
And please read my post above again: I wrote that "It uses chunks for this initial synch process", so the write locks are minimal.
Offline
@ab, I didn't know about the performance part. I posted these links because I was searching for something on the forum and found this post, so I thought I'd post them here. As a matter of fact, I didn't study the Raft details, I just heard something about it.
Offline
Raft is a very good protocol, and worth investigating indeed.
If we were to implement such decentralized replication, we should follow the Raft specification, since it properly handles most of the border cases.
Thanks for the input, anyway.
Offline