#1 2021-03-16 10:01:21

flydev
Member
From: France
Registered: 2020-11-27
Posts: 50
Website

Server crash / NewRun event

Hello all,

Two hours ago I suffered a "crash" and I am trying to find the culprit (me for sure smile) as I am quite new to this framework.

Note: It is possible that timeout are misconfigured and that I misunderstand the flow of the session expiration/client re-connection even if everything can run a week without any error or interruption


The setup is the following:
- Clients and server are built on mORMot2
- a database server (MariaDB 10) on a Windows server
- the (mORMot) server is an instance of TSQLHttpServer and hosted on the same computer as the database server
- the (mORMot) clients (~83 in the wild) are instance of TRestHttpClient and run on Windows 7 embedded

----------------

Server definition and timeout (maybe wrong and I should set it to a greather value) :

constructor TSagasServer.Create(Props: TSQLDBZEOSConnectionProperties);
var
  Conn: TSQLDBConnection;
begin
  fProps := Props;
  fProps.ConnectionTimeOutMinutes := 1;
  [...]
end;

Client definition and timeout (3000 ms)

fClient := TRestHttpClient.Create(AnsiString(SERVER_URL), PORT_NAME, fModel, false, '', '', 3000, 3000, 3000);

----------------

Suddenly, all the clients (~83) disconnected at the same time (by timeout ? from what I see in the client logs) :

client side wrote:

    > 16/03/2021 08:55:00.496    Exception     EWinHttp {Message:"winhttp.dll error 2EE2 (timeout)"} [] at 6af783  stack trace API (TZOracleConnection.PingServer)

And on the server side, 15 seconds before (at 08:54:46) :

server side wrote:

    > 3/16/2021 08:54:45.944    Enter     mormot.rest.sqlite3.TRestServerDB(02b56e10).URI PUT root/Capteurs/10?session_signature=258902d3016d2b0916971c8c in=464 B
    > 3/16/2021 08:54:45.944    User auth        mormot.rest.server.TRestServerRoutingRest(0308cba0) kingspark/629736147 1.2.3.4
    > 3/16/2021 08:54:45.944    SQL     mormot.db.sql.zeos.TSqlDBZEOSStatement(02b76530) ExecutePrepared 1.42ms wr=1 update Capteurs set NumeroConcessionnaire=0,NumeroParking=15,NumeroBorne=1,Etat_Monetique=0,Etat_CB=0,Etat_Imp_Sortie=0,Etat_Imp_Sortie_Papier=0,Etat_Imp_Sortie_Papier_Low=0,Etat_Laser_Entree=1,Etat_Laser_Sortie=0,Etat_Imp_Entree=1,Etat_Imp_Entree_Papier=0,Etat_Imp_Entree_Papier_Low=0,Barriere_Entree_Open=0,Barriere_Sortie_Open=0,Type_SIP=0,Etat_Connexion_SIP=0,WatchDog=0,WatchDog_UserStop=0,Update_time='2021-03-16 08:54:36',Site_Link=0 where ID=10
    > 3/16/2021 08:54:45.944    Server        kingspark 1.2.3.4 PUT root/Capteurs Write=200 out=0 B in 1.56ms
    > 3/16/2021 08:54:45.944    Leave     00.001.574
    > 3/16/2021 08:54:46.240    Enter     mormot.rest.sqlite3.TRestServerDB(02b56e10).URI POST root/UpdateService.Get?session_signature=258902850056589b9976698e in=46 B
    > 3/16/2021 08:54:46.240    User auth        mormot.rest.server.TRestServerRoutingRest(0308cba0) kingspark/629736069 1.2.3.4
    > 3/16/2021 08:54:46.240    Service call        mormot.rest.sqlite3.TRestServerDB(02b56e10) IUpdateService.Get[{numeroParking:15,numeroBorne:3,typeBorne:1}]
    > 3/16/2021 08:54:46.288    New run     ==================================================

server side 5 seconds later wrote:

3/16/2021 08:54:52.208    Exception     EHttpApiServer {LastError:1229,LastApi:"hSendHttpResponse"} [HttpSvr 61337 root THttpApiSvr] at 6c3aa7

About this last error exception, I read on this forum that it could be a timeout misconfiguration..

And the really strange part (to me) is the "New Run" event which poped multiple times each 100ms on the server side - I don't really know at this time what this mean.. please see the following pastebin log file :
   
    > https://pastebin.com/9ZL8ndxj


----------------

To get back running everything, I paniked and restarted the server manualy, the first time I was still getting the execption ERROR_CONNECTION_INVALID.


Any hint please ?   What I can/should add to this post so it will be more clear to you and easier to help ?

Thanks yo guys, have a nice day.

Offline

#2 2021-03-16 16:14:24

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,238
Website

Re: Server crash / NewRun event

IIRC sllNewRun is only called when the service is started/stopped.

So I guess several instances of the service was restarted at that time.
This may be the root (and external) cause to the trouble.

Offline

#3 2021-03-18 14:10:21

flydev
Member
From: France
Registered: 2020-11-27
Posts: 50
Website

Re: Server crash / NewRun event

Thanks for the hint @ab, I reproduced it, in fact, I was opening the log file with the LogViewer while the soft was running.

Offline

#4 2021-03-18 20:09:23

mpv
Member
From: Ukraine
Registered: 2012-03-24
Posts: 1,544
Website

Re: Server crash / NewRun event

Ah, this is a old problem (under Windows) - see this topic: https://synopse.info/forum/viewtopic.php?id=5625
@ab - do we really need a memory mapped files there (TSynLogFile=class(TMemoryMapText)) ?
In any case Log parser loads all file content into memory, so we can read it without memory mapped file I think, close a fd ASAP and this solves a problem....

Currently for Windows instances we always create a copy of a active log file before opening it in LogView to prevent such unexpected errors

Last edited by mpv (2021-03-18 20:14:37)

Offline

#5 2021-03-19 09:18:35

flydev
Member
From: France
Registered: 2020-11-27
Posts: 50
Website

Re: Server crash / NewRun event

mpv wrote:

Currently for Windows instances we always create a copy of a active log file before opening it in LogView to prevent such unexpected errors

Thanks you @mpv, that's what I ended doing smile

Last edited by flydev (2021-03-19 09:18:51)

Offline

Board footer

Powered by FluxBB