SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

mpv · 2020-03-18 17:41:36

Waiting for review in @287

Can be used as direct replacement of SynDBZeos to access a Postgres database for FPC users. Not so powerful as Zeos but do all things we need in mORMot.

My idea is to participate in Techempower benchmarks with mORMot2. We need as-fast-as-possible Postgres driver for this

@ab - have we some regression test for Posgres DB? (I do not found it)

Last edited by mpv (2020-03-18 17:42:00)

ab · 2020-03-19 09:42:22

Tests are in sample 15.
See https://github.com/synopse/mORMot/tree/ … erformance

I will merge and make some minor changes to the unit.
I guess it could be easy to make a Delphi/FPC compatible unit, if you provide some testing.

mpv · 2020-03-19 18:31:26

Please, merge small fixes #289 . After this my regression tests is OK (~15000 different queries in mult-ithread mode)

ab · 2020-03-19 19:06:07

I have made https://github.com/synopse/mORMot/commi … cdda2d685b and https://github.com/synopse/mORMot/commi … 34d834d902 instead.

I also fixed some potential problems with escaping (e.g. backslashs).

What about the performance?
Especially in comparison to SynDBZeos?

mpv · 2020-03-19 19:54:30

I'll check your commits.
On my laptop for this test

QUERIES: array [0..3] of RawUTF8 = (
    'DROP SCHEMA IF EXISTS aaa CASCADE; DROP SCHEMA IF EXISTS bbb CASCADE;', // no result
    'select * from uba_subject us, uba_subject us1 where us.ID > ?', // 12 columns; 300 rows
    'select * from uba_user where ID > ?', // 22 columnus of different types; 5 rows
    'select id, randomNUmber from World where id = ?' // 2 columns; 1 row 
  );

multy-thread loop:
---------
query := conn.NewStatementPrepared(QUERIES[i], i <> 0 {resultExpected});
if (query = nil) then
  raise ESQLDBPostgres.CreateUTF8('Prepare call error: %', [conn.LastErrorMessage]);
if i <> 0 then
  query.Bind(1, ID);
query.ExecutePrepared;
if i <> 0 then
   resJSON := query.FetchAllAsJSON(true);
query := nil;

Speed is about +10% and -10% memory compared to SynDBZeos. But this is hardware limitation (pgbensh in select mode shows 37500 q/s). On normal hardware difference should be a lot more

Warm up....
Start test with 4thread x10000 iteration x4 queries
	Thread 1 finised with 7761 q/s
	Thread 4 finised with 7759 q/s
	Thread 2 finised with 7754 q/s
	Thread 3 finised with 7752 q/s
Finish 4 thread Zeos. Total speed is 31003 q/s
	Thread 1 finised with 8756 q/s
	Thread 3 finised with 8748 q/s
	Thread 4 finised with 8745 q/s
	Thread 2 finised with 8739 q/s
Finish 4 thread SynDBPG. Total speed is 34956 q/s

I still do not have access to a good hardware for testing (for this case and for prev. sockets optimizations) - hope next week got it

ab · 2020-03-19 20:12:42

Very nice!

mpv · 2020-03-19 20:22:47

All my tests passed with latest commits.
Now I need to fix a logging ( this commit broke Oracle logging for me; New SynDBPosgres logging is also differ from other SynDB* and brokes some of my workflows).
I carefully do it tomorrow and give a pull request (I tried to explain what I need in post but too many letters required - I better show you a code )

ab · 2020-03-20 14:56:56

The Oracle commit was needed because there was some regression introducing some time ago.
The client-side statements have been disabled on Oracle in profit to server-side statements. It is much more stable, and performant.
I don't know when, but sometimes this modification was reverted and the repository was not in synch with my own version.

I have made another refactoring to SynDBPostgres.pas.
Please check https://synopse.info/fossil/info/02d4c92dfa
It includes a minimal libpq library wrapper, and a new TSQLDBPostgresStatement.ColumnsToJSON which would benefit of JSON resultsets from SELECT statements.
As a result, it should be even faster, and now compiles with Delphi.

mpv · 2020-03-20 15:23:25

About Oracle - we disable client side statement in our overrides for Oracle many years ago. Problems with Oracle is in logging output (i am preparing a patch)

About TSQLDBPostgresStatement.ColumnsToJSON - you read my mind - just about to do the same.

About namespacing a library functions (PQgetisnull -> PQ.GetIsNull) I not share your opinion. In case we use the same names as in original library (as in original FPC Postgres unit) it's much easy to search for documentation/examples for functions. But it's up to you.

I'll test your changes...

BTW - I got access to a VM with dedicated(not shared with other VMs on this host) 12 Xeon cores (to emulate 3VM x 4Core: one for app server one for DB and one for load generator). I need some time to setup all and will provide a results for both socket and PG improvements ASAP

Last edited by mpv (2020-03-20 15:25:57)

mpv · 2020-03-20 16:22:00

Having a dedicated TSQLDBPostgresConnection.PrepareCached function is a good idea. We may need a MRU prepared statement cache in the future (in Oracle it's implemented inside OCI and transparent for us, but it exists) - I not sure yet - need to test current implementation on real production to understand. PrepareCached will be a good place to implement it

danielkuettner · 2020-03-20 16:59:56

I've thought you write a pg driver without libpg and not a wrapper?

Márcio Baroni · 2020-03-20 18:08:41

Running tests using Synopse mORMot framework 1.18.5847, compiled with Delphi XE5 32 bit, against SQLite 3.31.0, at 2020-03-20 15:02:01.

Insertion speed

			Direct	Batch	Trans	Batch Trans
ZEOS PostgreSQL		4459	105540	7334	106571
Postgres PostgreSQL	3938	50373	11503	59945


Read speed

			By one	All Virtual	All Direct
ZEOS PostgreSQL		6660	169629		292431
Postgres PostgreSQL	12767	229042		594883

i5-4440S CPU @ 2.80GHz 4 core 8GB RAM 240GB SSD

Last edited by Márcio Baroni (2020-03-20 18:12:28)

mpv · 2020-03-20 18:40:46

danielkuettner wrote:

I've thought you write a pg driver without libpg and not a wrapper?

libpg is optimized during many years of development by many contributors. It's open source. Interface is very (VERY) clear (just compare to Oracle nightmare). It's give a full control over all postgres protocol features with a perfect speed.

ab · 2020-03-20 19:28:00

Please check my latest commits about logging: https://github.com/synopse/mORMot/commi … 721f31b4fb

I don't understand why insertion is slower with SynDBPostgres than with SynDBZeos...
Worth investigating...
But SELECT/reading being almost twice faster than SynDBZeos is a huge benefit!
I guess TSQLDBPostgresStatement.ColumnsToJSON helped a lot.

mpv · 2020-03-20 19:58:07

@ab - you are so quick. My problem with logging what sometimes it too verbose and sometimes different SynDB classes produce different logs. I review your latest commits.

My regression fails for current TSQLDBPostgresStatement.ColumnsToJSON (if commented - all is OK). For a first view because of Dates and Doubles. I will debug tomorrow - too tired today.

Insertion may be slow because Batch mode do not starts transaction explicitly? If "start transaction" is not sent Postgres will commit after each statement.
Also postgres MAY can sent notices (if we do not call PQ.SetNoticeProcessor such notices will go to stdout what is very slow on windows).

ab · 2020-03-20 22:41:31

@mpv Please try https://synopse.info/fossil/info/26b0a8b1a9391c84

About logging, now maximum log length should be consistent with all units.
Use property TSQLDBConnectionProperties.LoggedSQLMaxSize to set the max number of bytes for sllSQL log lines.
Now by default it is 2048 (2KB) but you can reduce or increase it if needed for a given connection.

danielkuettner · 2020-03-21 08:37:28

We always using batch for inserting in our project and Michael has optimized it for Zeos a year ago. We debugged libpq and have implemented the binary protocol. Thats very fast and efficient. I think also the read performance of Zeos couldn't be topped in case of using native Zeos (IZConnection, IZPreparedStatement etc.)
You use libpq and Zeos does. If there is a speed improvement than it's not from avoiding Zeos.

mpv · 2020-03-21 10:34:58

danielkuettner wrote:

We debugged libpq and have implemented the binary protocol

There is almost no difference between binary and text protocol, the only thing is fetching a BLOBs - in case of text protocol we need to decode blob content wile fetch data. (IMHO storing BLOB in DB is not scalable at all)
For int64 for example the size of text representation of 123456 number is 7 (6 +#0) bytes, while in binary protocol 8 bytes. Time of converting PChar -> int64 is very small (near the same as htonl we need for binary protocol)

mpv · 2020-03-21 10:53:36

@ab - we should re-think once more how SynDB level cache works. I finally found the reason why I set *Properties.UseCache to false in all my descendants (for a years I don't understand why production deployments not works with this option enabled)

The problem:
- we execute a statement -> statement holds a cursor -> we put statement into cache with opened cursor -> while we do not execute a statement again we do not call a Stmt.Reset, so cursor remains opened. Many opened cursors kills a database and app server memory (for example Postgres cursor holds all rows if not in singe row mode).

The only possible solution I see is to call Stmt.Reset; manually after fetch all rows in my application code.

I'm at a loss... May be you have some other ideas?

ab · 2020-03-21 12:10:58

@mpv
Could you please check https://synopse.info/fossil/info/435c551d32 ?

It should have made much faster PostgreSQL BLOB process.
And libpg>=8.3 is not required any more, since PQunescapeBytea() is replaced by our faster HexToBin/OctToBin decoding (with no memory allocation).

About UseCache=false, I need to think about it...
And why not calling Stmt.Reset when all rows have been fetched?

mpv · 2020-03-21 13:02:27

Now when I understand a problem I can call Reset after fetching But problem is hidden so deep what can cause a troubles for other developers.

I found solution for Batch insertion in Postgres (I don't use it because I need a full control over each insert).
Instead of construct a query in MultipleValuesInsert like this:

insert into mytable(a, b) values ($1, $2), ($3, %4), ....

It overflows Postgres prepared statements pool because count of element can be 1, 2, 3, ....

we can use array binding as such:

insert into mytable(a, b) select unnest(?::NUMERIC[]) as a , unnest(?::text[]) as b

and bind $1 and $2 as we already did for arrays. May be even unlimited length of array is supported, at last I don't found a limits.

This code is works (an demonstrate BUG in ColumnsToJSON - see "a":2!!!! - there is a garbage where (non-printable symbol) - I can' fount yet - @ab - please look on it

 
  cPool2 := TSQLDBPostgresConnectionProperties.Create('', 'postgresql://localhost:5432/postgres', 'user', 'pwd');
  conn := cPool2.ThreadSafeConnection;
  query := conn.NewStatementPrepared('select unnest(?::NUMERIC[]) as a , unnest(?::text[]) as b', true);
  if (query = nil) then
    raise ESQLDBPostgres.CreateUTF8('Prepare call error: %', [conn.LastErrorMessage]);
  query.BindArray(1, [1, 2, 3]);
  query.BindArray(2, ['str1', 'str2', 'str3']);
  query.ExecutePrepared;
   writeln(query.FetchAllAsJSON(true)); //[{"a":1,"b":"str1"},{"a":2!!!!,"b":"str2"},{"a":3,"b":"str3"}]

Insertion also works

  query := conn.NewStatementPrepared('insert into test_insert(id, code) select unnest(?::NUMERIC[]) as a , unnest(?::text[]) as b', false);
  query.BindArray(1, [1, 2, 3]);
  query.BindArray(2, ['str1', 'str2', 'str3']);
  query.ExecutePrepared;

Last edited by mpv (2020-03-21 13:08:47)

mpv · 2020-03-21 13:22:51

oooo!!! Postgre is SUPER. Insertion of 100 000 elements in batch - 180ms:

var
  a: array[0..100000] of int64;
  b: array[0..100000] of RawUTF8;

  tmTotal.Start;
  query := conn.NewStatementPrepared('insert into benchmarkdbuse.test_insert(id, code) select unnest(?::NUMERIC[]) as a , unnest(?::text[]) as b', false);
  for i := 0 to 100000 do begin
    a[i] := i;
    b[i] := IntToString(i);
  end;
  query.BindArray(1, a);
  query.BindArray(2, b);
  query.ExecutePrepared;
  writeln(tmTotal.stop);

$ ./postgres_direct
181.32ms

elect count(*) from test_insert
230014

Last edited by mpv (2020-03-21 13:23:47)

ab · 2020-03-21 13:27:22

danielkuettner wrote:

We debugged libpq and have implemented the binary protocol.

I did not find where it has been implemented in the sources.
ZPlainPostgreSqlDriver.pas uses libpq

danielkuettner wrote:

I think also the read performance of Zeos couldn't be topped in case of using native Zeos (IZConnection, IZPreparedStatement etc.)

Perhaps Marcio benchmark https://synopse.info/forum/viewtopic.ph … 442#p31442 didn't use mORMot enhanced methods like TZAbstractPostgreSQLResultSet.ColumnsToJSON: was USE_SYNCOMMONS conditional set?

danielkuettner wrote:

You use libpq and Zeos does. If there is a speed improvement than it's not from avoiding Zeos.

I have seen perhaps several ways to improve ZDBC driver: e.g. don't call StrLen() before calling TJsonWriter.AddJsonEscape() or when PQgetlength() should be more efficient.

mpv · 2020-03-21 13:27:24

Crazy things - insertion of ONE MILLION elements = 2sec. In's faster when Oracle bulk loader

danielkuettner · 2020-03-21 13:52:28

@ab I don't know if Michael has committed it to 7.3.
-> I've asked him, he has

Last edited by danielkuettner (2020-03-21 14:03:41)

ab · 2020-03-21 14:23:20

@daniel
I didn't see in which unit.

@mpv
Please try https://synopse.info/fossil/info/a5bd32116a
It should fix some problems.

danielkuettner · 2020-03-21 14:47:34

I'm not so deep in the Zeos framework,
but a quick search at ZbcPostgresSqlStatemant.pas give me ExecuteDMLBatchWithUnnestVarlenaArrays.

mpv · 2020-03-21 14:52:35

Yes, now everything is OK with JSON except Date: postgres returns date in non-strict ISO8601 format (use space instead of T between date and time part). Some parsed can't convert it back (SpiderMonkey 52 can't) or do it wrong (incorrect time zones parsing etc) - we have too many errors in past with it;
Can you, please add a case for ftDate:

ftDate: begin
  l := PQ.GetLength(fRes, fCurrentRow, col);
  if (L > 10) and (PAnsiChar(P)+10)^ = ' ') then // strict ISO 8601 with T
    (PAnsiChar(P)+10)^ := 'T';
  WR.Add('"');
  WR.AddNoJSONEscape(P);
  WR.Add('"');
end;

After this all my regressions is OK

Last edited by mpv (2020-03-21 14:52:56)

ab · 2020-03-21 14:59:41

I forgot about non mORMot decoders like SM. Sorry.
Please check https://synopse.info/fossil/info/cab8be4f76

Did you test the blob process too?

ab · 2020-03-21 15:03:02

@daniel
Thanks for the feedback!

ExecuteDMLBatchWithUnnestVarlenaArrays() is not about the binary protocol and bypassing libpq.

It is about bulk INSERT using arrays via:

insert into .... select unnest(....)

Just like mpv was writing about in this thread.

I will try to add this feature to SynDBPostgress too.

ab · 2020-03-21 15:24:27

@mpv
Do you know if PostgreSQL supports:

delete from .... select unnest(....)

update .... set .... select unnest(....)

in addition to

insert into .... select unnest(....)

mpv · 2020-03-21 15:41:04

For delete we can use the same syntax as for select

delete from table where id=ANY(?)

Please, merge this - #291; It allow me to remove many lines of code in my app: because I need to do it for each class

I investigate update

mpv · 2020-03-21 16:04:53

Yes! update is possible also

update test_insert as t set
    code = v.code,
    code2 = v.code2
from (
  SELECT
    unnest(?::NUMERIC[]),
    unnest(?::text[]),
    unnest(?::text[])
 ) as v(id, code, code2)
where t.id = v.id;

And bind 3 array = array of ID's and 2 array for values

danielkuettner · 2020-03-21 16:06:37

ab wrote:

ExecuteDMLBatchWithUnnestVarlenaArrays() is not about the binary protocol and bypassing libpq.

Misunderstanding.
If you use the binary protocol of pg and you want to use array bindings then you need ExecuteDMLBatchWithUnnestVarlenaArrays (and it was rarely documented).

Bypassing libpq is just a aim and has nothing to do with this

ab · 2020-03-21 16:27:06

@daniel
Fair enough!

@mpv
I have merged your request.

Could you check https://github.com/synopse/mORMot/tree/ … erformance ?
It should generate such requests for our ORM:

insert into benchmarkdbuse.test_insert(id, code) values (unnest(?::numeric[]),unnest(?::text[]))

It is not exactly your SQL (there is no SELECT involved) but it should work as it is used by Zeos.
I didn't test it (I don't have any PostgreSQL installed) but it could increase a lot the performance of our ORM.
Could you please tell me if this 'insert' syntax works on your side?
If you can, could you run the Sample https://github.com/synopse/mORMot/tree/ … erformance ?

mpv · 2020-03-21 16:28:33

@danielkuettner - I spoke with Zeos maintainer may be a year ago (and even give small fixes to Zeos). I know how Postgres binary protocol implemented in Zeos (it's broke my application in some very specific places, so I currently on old zeos version).
Believe me - binary protocol give us nothing in terms of performance or memory usage. Just a PITAs with OIDs/htonl/doubles etc.

The main Zeos (Postgres) performance bottleneck (we fix here in SynDBZeos) is:
- too many interfaces (I mean delphi interfaces)
- too generic SQL statement parser (it perfect, can parse almost any SQL syntax but the cost is complexity)
- too many strings (big problem for heavy muti-thread)

All of this is goes from generic Zeos nature. I spend several days in Zeos codes to found a ways for optimization and decide to wrote a SynDBPostgres from scratch, because too big regression in Zeos if I touch something where.

Last edited by mpv (2020-03-21 16:48:36)

mpv · 2020-03-21 16:36:43

ab wrote:

If you can, could you run the Sample https://github.com/synopse/mORMot/tree/ … erformance ?

I try to adopt it for Linux just now. Please, change the default port to 8881 in mORMot2 - ports under 1000 (I mean 881) on Linux is reservet for well known services and require root

Last edited by mpv (2020-03-21 16:37:07)

ab · 2020-03-21 17:00:42

Yes, most samples were Windows/Delphi only...

I have implemented bulk updates and bulk inserts over SynDBPosgres for our ORM:
https://synopse.info/fossil/info/9fe77d6f21

Not tested yet!

mpv · 2020-03-21 17:23:45

On my linux laptop (w/o last commit)

Insertion speed

 	         Direct	Batch	Trans	Batch Trans
ZEOS PostgreSQL	8255	110521	9097	261137
Syn PostgreSQL	9814	119697	7400	223443

Read speed

 	       By one	All Virtual	All Direct
ZEOS PostgreSQL	600889	7739938	        11389521
Syn PostgreSQL	612294	12048192	21459227

mpv · 2020-03-21 17:30:52

Latest commits

Running tests using Synopse mORMot framework 1.18.5865, compiled with Free Pascal 3.2 64 bit, against SQLite 3.31.0, at 2020-03-21 19:28:09.

Insertion speed
    	      Direct	Batch	Trans	Batch Trans
ZEOS PostgreSQL	8201	279939	9351	172491
Syn PostgreSQL	9134	320102	11243	124872

Read speed
 	      By one	All Virtual	All Direct
ZEOS PostgreSQL	385802	5787037	 8000000
Syn PostgreSQL	493534	10822510 17064846

mpv · 2020-03-21 17:32:12

No - it's totally impossible to do perf. test on laptop. Trottling kills any measurements. But see the difference between Windows test and Linux. Linux almost x3 faster on insertion and x50 on selects (Windows port of Postgres is for testing only). BTW I run Potgres using official docker container..

Last edited by mpv (2020-03-21 17:50:59)

ab · 2020-03-21 17:57:02

You are right. No absolute numbers!
But what we could compare is Zeos vs SynDBPostgres.

Reading is always faster: by a huge amount - up to twice faster!
21459227 is 21,459,227 rows per seconds? what?????

Writing is almost the same, slower only within 'Batch Trans'.
I will try to setup a PG DB and see if everything is as expected.

mpv · 2020-03-21 18:08:11

There is indeed something wrong in test suit. I increase iteration count x20 tims

if (Mode=smFull) and not aUseTransactions then
          Stat.fNumberOfElements := 10000 else // SQLite3 is dead slow without transactions
          Stat.fNumberOfElements := 100000;

and numbers for select become absolutely unreal (insertion is on the same level)

Postgres PostgreSQL	600312	182149362	314465408

I use this docker image: https://hub.docker.com/_/postgres On Linux its takes 5 minutes to setup

Last edited by mpv (2020-03-21 18:09:06)

ab · 2020-03-21 18:24:22

I guess there is something wrong with the timing on Linux, and how the performance is computed.

EgonHugeist · 2020-03-23 07:40:19

Nice job Pavel.

mpv wrote:

Believe me - binary protocol give us nothing in terms of performance or memory usage. Just a PITAs with OIDs/htonl/doubles etc.

Well this might be true for you example. Nowaday those numbers do grow quickly. The more the pg date(4Byte)/time(8Byte)/timestamps(BByte)/doubles/numerics are samller than using strings..

mpv wrote:

@danielkuettner - I spoke with Zeos maintainer may be a year ago (and even give small fixes to Zeos). I know how Postgres binary protocol implemented in Zeos (it's broke my application in some very specific places, so I currently on old zeos version).

that is true but the binary wire protocol is optional except for parameter bindings. You mean this https://sourceforge.net/p/zeoslib/tickets/336/ propably.

mpv wrote:

The main Zeos (Postgres) performance bottleneck (we fix here in SynDBZeos) is:
- too many interfaces (I mean delphi interfaces)

The interfaces indeed had been a problem where we used virtual methods. The compiler did produce duplicate offset calcultations for thre vmt's. Inbetween zeos does no longer use the virtual methods any more in most relevant places. So the zeos access using interface has the same performance like using objects. Thus this argument does not count for me.

mpv wrote:

The main Zeos (Postgres) performance bottleneck (we fix here in SynDBZeos) is:
- too generic SQL statement parser (it perfect, can parse almost any SQL syntax but the cost is complexity)

This parser is also no longer slow on 7.3. I've added pchar markers instead. No memallocs are done except to hold the markers in a list. The parser works perfectly to replace the question-marks for the oracle and postgres. Looking to ab's approch it's a nice nouvice state. Zeos is full of examples where his technick would fail propably.

mpv wrote:

The main Zeos (Postgres) performance bottleneck (we fix here in SynDBZeos) is:
- too many strings (big problem for heavy muti-thread)

Now you make me curious. To many strings? While you using the string result and binding techniks? What do you mean?

mpv wrote:

Believe me - binary protocol give us nothing in terms of performance or memory usage. Just a PITAs with OIDs/htonl/doubles etc.

Yesterday i found the reason for the performance penalty. I've already commited the changes. So on my side SynDBZeos is slightly faster.

The more SynDBZeos is outdated. I send AB several mails with fixes but all of them has been ignored last year.
Relevant changes of the my maintained SynDBZeos:
-StartTransaction(including nesteds) so the AutoCommit mode is not required to use any more.
-full outparm support for SP's without using the IZCallableStatement
-better ColumnsToJSON with several JSON serialization option (manly for the date values)
-correctly determine batch ababilities on connect so the define is obsolete.

@AB is https://synopse.info/fossil/info/9fe77d6f21 realy necessary? No SQL standard any more? Is that syntax supported by UniDac/FireDac? That broke all zeos compatibilty.

Last edited by EgonHugeist (2020-03-23 07:55:08)

mpv · 2020-03-23 08:39:35

Michael, Zeos if perfect - we use it for many years. You did a huge work for a last years. The Zeos performance is also very good for 99,9% of applications.

As I wrote before SynDBPostgres in not so powerful as Zeos, but it's much simple, therefore easy to optimize (IMHO). For me the main goal of SynDBZeos is heavy multi-thread apps - like this ~500 000 query per second

For such apps every implementation detail make sense.... And we can help each other (like with this otimization your apply to Zeos).

Now you make me curious. To many strings? While you using the string result and binding techniks? What do you mean?

Sorry for confusing - I meant "too many string concatenations / assigning". Again - I not sure about latest Zeos state.

the binary wire protocol is optional except for parameter bindings. You mean this https://sourceforge.net/p/zeoslib/tickets/336/ propably

Yes. And I understand why you do not apply this patch.

P.S.
The worst of making patch to Zeos for me is a subversion. Git itself and GitLab \ GitHub is much more powerfull when svn + sotreforge. I use this Zeos git mirror for example because I need to add all my project dependencies as a git sub-modules to have a full control over versions.
May be the same with SynDBZeos patches? I mean review of patch is mush easy when it's done using github pull request

ab · 2020-03-23 09:38:15

About the whole point of SynDBPostgres.
I agree with mpv approach. It was never a replacement to SynDBZeos, but an alternative.
There always will be benefits for alternatives. I like very much https://github.com/github/putty/blob/master/sshzlib.c approach.

A lot of our users tend to rely on mORMot for long-term projects.
Having alternatives is confusing at first (they are several ways of accessing the same DB), but it tends to benefit for the long term.
We would like to continue SynDBZeos support as the primary DB layer for mORMot. It is much better than ODBC or others.
But as mORMot is not maintained by me only today (mpv and others make tremendous contributions), some of our users told us that depending on Zeos only may be a risk.
So we always provided alternatives.

Zeos/ZDBC is a great library for sure. What Michael did in terms of performances and features is awesome. He took a slowly supported and bloated library into something huge but fast. The best DB layer for Delphi/FPC for sure.
But ZDBC is a big and complex beast, mainly due to having support a lot of native DB types, compatibility with old code, and all kind of metadata and features we don't need from mORMot ORM.
Adding ColumnsToJSON() support in ZDBC was the nicest addition I have never seen in any DB library - even more since it fulfilled our ORM needs, and integrated our SynCommons unit!

Zeos with mORMot is not obvious to setup, since it is a 3rd party.
PostgreSQL is our business-class RDBMS of choice, so being able to access and try it with no dependency is a benefit.
But there is nothing stopping the user to try SynDBZeos and use it if he/she wants to. On the contrary. If SynDBZeos is faster/better/shiner: just use it!

We also switched to FPC/Linux as main platform for server-side, and we can easier optimize SynDBPostgres code on this target than the huge code base of ZDBC, which is more Delphi-centric for good reasons.
For instance, the FPC heap is much slower than FastMM, so any additional allocation hurts. ZDBC tends to allocate more than directly access the libpq library - even if Michael did a great job reducing memory consumption. Switching from FPC heap to other libraries like glibc, Intel TBB or jemalloc is not a viable choice, since it induces unstability in the execution: the slightest memory problem just abort the executable - check https://github.com/synopse/mORMot/blob/ … ligned.pas
Now, we rather benchmark and tune for FPC and Linux - not Delphi and Windows.

In terms of performance and stability, bypassing ZDBC sometimes help.
For instance, for our ORM, our SQlite3 direct layer is faster (by a huge margin), and will always be faster than using ZDBC. It not only by-passes ZDBC, but also our SynDB layer.

About binary/text with libpq:
- text is much easier to implement and maintain - this is one of the reasons why SynDBPostgres is so small and was quick to develop;
- binary support is complex and slows down things when used in ColumnToJSON() - letting PG directly write its text values into JSON is easier and faster; and our ORM uses mainly ColumnToJSON() - so ZDbcPostgreSqlResultSet version helps but is not mandatory.

EgonHugeist wrote:

Looking to ab's approch it's a nice nouvice state. Zeos is full of examples where his technick would fail propably.

Could you please elaborate? I don't understand what you mean.

EgonHugeist wrote:

The more SynDBZeos is outdated. I send AB several mails with fixes but all of them has been ignored last year.

I just found one mail lost in my mailbox, from Thu, 7 Mar 2019. Tempus fugit!
At that time, I forgot to look into it. Sorry.
A github pull request is always better to manage and integrate.
Do you have anything new / more up-to-date?

EgonHugeist wrote:

is https://synopse.info/fossil/info/9fe77d6f21 realy necessary?

It is in the ORM layer, and only enabled for SynDBPostgres, not for ZDBC in the current SynDBZeos state.
In short, it generates some SQL as ExecuteDMLBatchWithUnnestVarlenaArrays() does, and we also added bulk UPDATE and bulk DELETE support for our ORM.
I guess that your modifications which changed the Batch abilities detection in SynDBZeos are not compatible with this new behavior of the ORM. Detecting the batch abilities in SynDBZeos.pas broke the expectations of mORMotDB.pas.
With the current official SynDBZeos unit, this 9fe77d6f21 patch didn't break anything.

EgonHugeist wrote:

Yesterday i found the reason for the performance penalty. I've already committed the changes. So on my side SynDBZeos is slightly faster.

It is clear than alternatives, like SynDBPostgres, are a good thing (tm).
Alternatives challenge us, and help to make better code.

ab · 2020-03-23 11:43:57

@Michael
I have included your last year patch with https://synopse.info/fossil/info/6e922057bc
It should also let SynDBProgres custom SQL from the ORM used only for this unit, and don't mess with SynDBZeos array binding.

EgonHugeist · 2020-03-26 07:19:54

@AB,

hi thanks. Good start. I'm not so familar with Git... Oldscool svn user.
I know about the forum rules so please consider: https://www.file-upload.net/download-13 … patch.html
this patch includes:
- real transaction handling for zeos
- adds possibility for parameters if they are added to the databasename (starting with '?') see TSQLDBZEOSConnectionProperties.URI
- changes TSQLDBZEOSStatement.ReleaseRows. IIUC is purpose to free memory in background. That's supported since 7.2 already
- fixes possible wrong MatchPatterns for MetaInformations.

ab wrote:

About the whole point of SynDBPostgres.
I agree with mpv approach. It was never a replacement to SynDBZeos, but an alternative.
There always will be benefits for alternatives. I like very much https://github.com/github/putty/blob/master/sshzlib.c approach.

I do understand yours POV. It's not that i'm against that. It just feels like wasting valuable manpower. From my POV it's not task of your project to embedd all server access technics. Usually this is a neverending story. The servers change behaviors, they add funtionality so it's a permantent maintaing job which will never end.

ab wrote:

A lot of our users tend to rely on mORMot for long-term projects.
Having alternatives is confusing at first (they are several ways of accessing the same DB), but it tends to benefit for the long term.
We would like to continue SynDBZeos support as the primary DB layer for mORMot. It is much better than ODBC or others.
But as mORMot is not maintained by me only today (mpv and others make tremendous contributions), some of our users told us that depending on Zeos only may be a risk.
So we always provided alternatives.

That's why i am so sensitive . Those guys are not uptodate what happend in background enbetween. They should open there mind and give zeos a fair chance.

ab wrote:

Zeos/ZDBC is a great library for sure. What Michael did in terms of performances and features is awesome. He took a slowly supported and bloated library into something huge but fast. The best DB layer for Delphi/FPC for sure.
But ZDBC is a big and complex beast, mainly due to having support a lot of native DB types, compatibility with old code, and all kind of metadata and features we don't need from mORMot ORM.
Adding ColumnsToJSON() support in ZDBC was the nicest addition I have never seen in any DB library - even more since it fulfilled our ORM needs, and integrated our SynCommons unit!
Zeos with mORMot is not obvious to setup, since it is a 3rd party.
PostgreSQL is our business-class RDBMS of choice, so being able to access and try it with no dependency is a benefit.
But there is nothing stopping the user to try SynDBZeos and use it if he/she wants to. On the contrary. If SynDBZeos is faster/better/shiner: just use it!

thanks for the nice words, guys. We alse keep FPC support high.

ab wrote:

We also switched to FPC/Linux as main platform for server-side, and we can easier optimize SynDBPostgres code on this target than the huge code base of ZDBC, which is more Delphi-centric for good reasons.
For instance, the FPC heap is much slower than FastMM, so any additional allocation hurts. ZDBC tends to allocate more than directly access the libpq library - even if Michael did a great job reducing memory consumption. Switching from FPC heap to other libraries like glibc, Intel TBB or jemalloc is not a viable choice, since it induces unstability in the execution: the slightest memory problem just abort the executable - check https://github.com/synopse/mORMot/blob/ … ligned.pas
Now, we rather benchmark and tune for FPC and Linux - not Delphi and Windows.
In terms of performance and stability, bypassing ZDBC sometimes help.
For instance, for our ORM, our SQlite3 direct layer is faster (by a huge margin), and will always be faster than using ZDBC. It not only by-passes ZDBC, but also our SynDB layer.

Probably true. If you have applied the patch than you could add the commented code the the benchmark tests (see TSQLDBZEOSConnectionProperties.URI)
so you can compare your external SQLite with Zeos in all your modes and you have fair benchmarks.

ab wrote:

About binary/text with libpq:
- text is much easier to implement and maintain - this is one of the reasons why SynDBPostgres is so small and was quick to develop;
- binary support is complex and slows down things when used in ColumnToJSON() - letting PG directly write its text values into JSON is easier and faster; and our ORM uses mainly ColumnToJSON() - so ZDbcPostgreSqlResultSet version helps but is not mandatory.

you can simply turn ON/OFF the binary wire by Adding

'BinaryWireResultMode=true or false'

. See the stmt parameter section in SynDBCZeos.pas. Accordin the code size, yes i'm not happy about it too. It is huge because we keep all backward compatibility high. We have users how are using old MySQL3 or FB1 or PosgreSQL7. But as you pointed out there is no performance penalty..

ab wrote:

EgonHugeist wrote:
Looking to ab's approch it's a nice nouvice state. Zeos is full of examples where his technick would fail propably.
Could you please elaborate? I don't understand what you mean.

Zeos is gangled by obscure tests. There are users who are doing things i would never do but it's not forbitten and should be support. An example for an SQL string:

    '/*?? do we find these ?question-marks? as parameter ? */'+
    'select * from people?2 where p_id > ?'+LineEnding+
    'and p?red = ?'+LineEnding+
    '/* ? and those marks? Are they ignored too?'+LineEnding+
    '? Are they ignored on a multi-line comment as well?*/'+LineEnding+
    '-- ? and those marks? Are they ignored too? On a single line comment?'

All servers allow such *beep*. Test your questionmark replacement. There are exacly two params to replace..

I'll remove the old ColumnToJSON overload soon. So thx for lately paches and the test do run again. Even if current SynDBPostgres dos not find libpg on my system and i have no PG-instance runinng on the default port. There is work left, Pavel.

ab wrote:

Alternatives challenge us, and help to make better code

It's not everything for performance. But i'll and if i can help i'll do it. Hope you guys too. ZeosLib hab bad critics according performance. Most of them from the TDataSet users, but from you as well. And they where right in the past. Inbetween also the TDataSet layer did speed up very nice. Everything which is faster on dbc is logically faster for the TDataSets as well. I've no other compents to compare but i would start from the premisse ZeosDBO has a nice comparable performance inbetween(And that's for free) even if all other components are optimized for the DS approach only.

Another reason for the binary wire was to suppress some issues PG has with the strings. F.e.:

insert into double_values values('-2.2250738585072014E-308'),('-1.7976931348623157E+308'),('2.2250738585072014E-308')....

you can send/retrieve them. In the past we had the Extended type in our code. I broke with the type since CPU64 doesn't have the REAL10 any more. So we can not read back the value with the string protocol with our native StrToFloat using double precision. There is exacty one digit to much before the exponent.

Last edited by EgonHugeist (2020-03-26 08:37:09)

ab · 2020-03-26 09:00:20

Is BinaryWireResultMode true by default?
Should we enable it?

I have and use libpq.so.5 on my Ubuntu system.
Zeos is not able to find it. I have to specify the library name in the connection URI. See https://github.com/synopse/mORMot/blob/ … s.pas#L709
So it depends on the actual system it runs on...
On your system, you can specify libpq.so.4 I guess in the SynDBPostgresLibrary global variable.
I have enhanced TSQLDBPostgresLib to automatically try to load the .so.5 then the .so.4 - see https://synopse.info/fossil/info/d0d2158e9f

We won't support *beep* SQL with bulky comments in our unit.
We are not a one-to-one replacement for TDataSet / SQL statements, but support the SQL generated by our ORM, and dedicated clean SQL.
If there is too much SQL to rewrite, with a lot of *beep* comments in it - just skip SynDB and use Zeos.
I may add a way to remove /* */ comments in any kind of SQL... But the ? as part of the table/column name - no thanks!
The same about 'extended' 80-bit float support - we have a single floating point type, which is double. So we don't have to circumvent this problem.

mORMot Open Source

#1 2020-03-18 17:41:36

SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#2 2020-03-19 09:42:22

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#3 2020-03-19 18:31:26

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#4 2020-03-19 19:06:07

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#5 2020-03-19 19:54:30

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#6 2020-03-19 20:12:42

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#7 2020-03-19 20:22:47

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#8 2020-03-20 14:56:56

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#9 2020-03-20 15:23:25

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#10 2020-03-20 16:22:00

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#11 2020-03-20 16:59:56

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#12 2020-03-20 18:08:41

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#13 2020-03-20 18:40:46

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#14 2020-03-20 19:28:00

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#15 2020-03-20 19:58:07

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#16 2020-03-20 22:41:31

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#17 2020-03-21 08:37:28

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#18 2020-03-21 10:34:58

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#19 2020-03-21 10:53:36

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#20 2020-03-21 12:10:58

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#21 2020-03-21 13:02:27

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#22 2020-03-21 13:22:51

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#23 2020-03-21 13:27:22

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#24 2020-03-21 13:27:24

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#25 2020-03-21 13:52:28

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#26 2020-03-21 14:23:20

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#27 2020-03-21 14:47:34

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#28 2020-03-21 14:52:35

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#29 2020-03-21 14:59:41

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#30 2020-03-21 15:03:02

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#31 2020-03-21 15:24:27

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#32 2020-03-21 15:41:04

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#33 2020-03-21 16:04:53

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#34 2020-03-21 16:06:37

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#35 2020-03-21 16:27:06

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#36 2020-03-21 16:28:33

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#37 2020-03-21 16:36:43

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#38 2020-03-21 17:00:42

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#39 2020-03-21 17:23:45

Re: SynDBPostgres - implementation of TSQLDB* for Postgres using libpg

#40 2020-03-21 17:30:52