#1 2014-07-16 11:40:32

Bacchus
Member
Registered: 2014-07-16
Posts: 5

Bug in SynCrtSock: TWinHTTP.InternalReadData, string or bytearray

I am using the latest ("trunk") SynCrtSock in a Delphi 6 application (client requirement, don't ask..) to connect to a JBoss web service. Most of it works, but when I receive deflated data (i.e. when CompressDeflate is registered), decoding fails with error code -3, which seems to mean "invalid compressed data" (Z_DATA_ERROR in zlib).
Some research reveals that in this case the compressed data contains a null character (ASCII value 0) that terminates a string somewhere in the middle. It seems to me that it is incorrect to use the string datatype to hold possibly binary data. This should be some kind of byte array (as is returned by WinHttpReadData).
I have not yet confirmed that using a bytearray instead of string works but will report if it does. Since I'm working on it on my client's time, I will not be able to post the code for the fix because of possible copyright-infringement.

Last edited by Bacchus (2014-07-16 11:41:55)

#2 2014-07-16 12:52:40

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,655

Re: Bug in SynCrtSock: TWinHTTP.InternalReadData, string or bytearray

RawByteString just ignores any #0 in the middle of the string. It uses the string length stored as a prefix.
I guess your issue is somewhere else.
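
To illustrate the point about the length prefix, here is a minimal sketch. It is not taken from SynCrtSock, and it assumes AnsiString, which is what RawByteString maps to on pre-Unicode Delphi such as Delphi 6. It compares the stored length with what a null-terminated scan would see:

```pascal
program EmbeddedNull;

{$APPTYPE CONSOLE}

var
  s: AnsiString; // stand-in for RawByteString on pre-Unicode Delphi
  p: PAnsiChar;
  n: Integer;
begin
  s := 'abc' + #0 + 'def';  // 7 bytes, with a #0 in the middle
  Writeln(Length(s));       // 7: the length prefix covers the whole buffer

  // A C-style scan stops at the first #0 and only sees 'abc'
  p := PAnsiChar(s);
  n := 0;
  while p[n] <> #0 do
    Inc(n);
  Writeln(n);               // 3
end.
```

So a #0 only truncates the data if some code treats the buffer as a null-terminated PAnsiChar; anything that relies on Length() keeps all the bytes.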

We did not have problems with deflate here.

Do you have code to reproduce the issue?

#3 2014-07-16 13:45:57

Bacchus
Member
Registered: 2014-07-16
Posts: 5

Re: Bug in SynCrtSock: TWinHTTP.InternalReadData, string or bytearray

I can reproduce it with this:

program mORMot_Issue;

{$APPTYPE CONSOLE}

uses
  SysUtils, SynZip;

const
{
  Data as captured in Wireshark:
  0000   78 9c 35 cc 31 0a 80 30 0c 05 d0 dd 53 c4 1e 40  x.5.1..0....S..@
  0010   71 af 05 51 6f e0 56 1c 04 3f d2 c1 b4 24 51 f0  q..Qo.V..?...$Q.
  0020   f6 e2 e0 fe 78 5e a0 25 b3 82 6e 88 a6 cc bd eb  ....x^.%..n.....
  0030   1c a9 6d 76 69 ef 66 91 2c 2e f8 3a 8e d3 b0 0c  ..mvi.f.,..:....
  0040   71 16 4a 4a 00 53 62 83 30 48 1f 35 e0 24 7c 92  q.JJ.Sb.0H.5.$|.
  0050   72 39 60 82 1d dc ac 6b f0 ed 9f 87 ea 05 7a 0a  r9`....k......z.
  0060   24 aa                                            $.

  Decoded by Wireshark:
  <response version="1" status="Error"><![CDATA[Er is een interne systeem error opgetreden.]]></response>
}
  Data: RawByteString =
    #$78#$9c#$35#$cc#$31#$0a#$80#$30#$0c#$05#$d0#$dd#$53#$c4#$1e#$40 +
    #$71#$af#$05#$51#$6f#$e0#$56#$1c#$04#$3f#$d2#$c1#$b4#$24#$51#$f0 +
    #$f6#$e2#$e0#$fe#$78#$5e#$a0#$25#$b3#$82#$6e#$88#$a6#$cc#$bd#$eb +
    #$1c#$a9#$6d#$76#$69#$ef#$66#$91#$2c#$2e#$f8#$3a#$8e#$d3#$b0#$0c +
    #$71#$16#$4a#$4a#$00#$53#$62#$83#$30#$48#$1f#$35#$e0#$24#$7c#$92 +
    #$72#$39#$60#$82#$1d#$dc#$ac#$6b#$f0#$ed#$9f#$87#$ea#$05#$7a#$0a +
    #$24#$aa;

var
  EncodedData: RawByteString;
  DecodedData: RawByteString;
begin
  EncodedData := Data;
  DecodedData := CompressDeflate(EncodedData, False);
  Writeln(DecodedData);
  Readln;
end.

My code (which is not the code quoted here) basically does this:
fWinHttp := TWinHTTP.Create(hostname, 0, false);
fWinHttp.RegisterCompress(CompressDeflate);
and then a handful of Request() calls: a GET with a redirect, a second GET with a plain answer, a POST with a redirect, and then the GET from the redirect with a deflated response. The last response is the only one using deflate and thus the only (or, I'm afraid, "the first" :) ) one failing.

#4 2014-07-17 07:00:27

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,655

Re: Bug in SynCrtSock: TWinHTTP.InternalReadData, string or bytearray

Are you sure your buffer is real "deflate" content?

After some (long) investigation on our side, we found out the following.

When we use the regular ZLib library, as shipped with Delphi, we get the same error:

uses
  SysUtils,
  Classes,
  ZLib;

const
  MAX_WBITS   = 15; // 32K LZ77 window
  DEF_MEM_LEVEL = 8;

function Check(const aCode: Integer; const ValidCodes: array of Integer): integer;
var i: Integer;
begin
  if aCode=Z_MEM_ERROR then
    OutOfMemoryError;
  result := acode;
  for i := Low(ValidCodes) to High(ValidCodes) do
    if ValidCodes[i]=aCode then
      Exit;
  raise Exception.CreateFmt('Error %d during zip/deflate process',[aCode]);
end;

procedure CompressInternal(var Data: RawByteString; Compress: boolean; Bits: integer);
var strm: TZStreamRec;
    code, len: integer;
    tmp: RawByteString;
begin
  fillchar(strm,sizeof(strm),0);
  strm.next_in := pointer(Data);
  strm.avail_in := length(Data);
  if Compress then begin
    SetString(tmp,nil,strm.avail_in+256+strm.avail_in shr 3); // max mem required
    strm.next_out := pointer(tmp);
    strm.avail_out := length(tmp);
    // Bits=-MAX_WBITS encodes raw deflate, Bits=+MAX_WBITS encodes zlib format
    // Z_HUFFMAN_ONLY instead of Z_DEFAULT_STRATEGY is slowest and bad
    if deflateInit2_(strm, 1, Z_DEFLATED, Bits, DEF_MEM_LEVEL,
      Z_DEFAULT_STRATEGY, ZLIB_VERSION, sizeof(strm))>=0 then
    try
      Check(deflate(strm,Z_FINISH),[Z_STREAM_END]);
    finally
      deflateEnd(strm);
    end;
  end else begin
    len := (strm.avail_in*20)shr 3; // initial chunk size = comp. ratio of 60%
    SetString(tmp,nil,len);
    strm.next_out := pointer(tmp);
    strm.avail_out := len;
    if inflateInit2_(strm, bits, ZLIB_VERSION, sizeof(strm))>=0 then
    try                
      repeat
        code := Check(inflate(strm, Z_FINISH),[Z_OK,Z_STREAM_END,Z_BUF_ERROR]);
        if strm.avail_out=0 then begin
          // need to increase buffer by chunk
          SetLength(tmp,length(tmp)+len);
          strm.next_out := pointer(PAnsiChar(pointer(tmp))+length(tmp)-len);
          strm.avail_out := len;
        end;
      until code=Z_STREAM_END;
    finally
      inflateEnd(strm);
    end;
  end;
  SetString(Data,PAnsiChar(pointer(tmp)),strm.total_out);
end;

function CompressDeflate(var Data: RawByteString; Compress: boolean): RawByteString;
begin
  CompressInternal(Data,Compress,-MAX_WBITS);
  result := 'deflate';
end;

const
  Data: RawByteString =
    #$78#$9c#$35#$cc#$31#$0a#$80#$30#$0c#$05#$d0#$dd#$53#$c4#$1e#$40 +
    #$71#$af#$05#$51#$6f#$e0#$56#$1c#$04#$3f#$d2#$c1#$b4#$24#$51#$f0 +
    #$f6#$e2#$e0#$fe#$78#$5e#$a0#$25#$b3#$82#$6e#$88#$a6#$cc#$bd#$eb +
    #$1c#$a9#$6d#$76#$69#$ef#$66#$91#$2c#$2e#$f8#$3a#$8e#$d3#$b0#$0c +
    #$71#$16#$4a#$4a#$00#$53#$62#$83#$30#$48#$1f#$35#$e0#$24#$7c#$92 +
    #$72#$39#$60#$82#$1d#$dc#$ac#$6b#$f0#$ed#$9f#$87#$ea#$05#$7a#$0a +
    #$24#$aa;

procedure Test;
var s: RawByteString;
begin
  s := 'test 123456 ';
//  s := s+s;s := s+s;s := s+s;s := s+s;s := s+s;s := s+s;
  CompressDeflate(s,true);
  CompressDeflate(s,false);
  writeln(s);
  s := Data;
  CompressDeflate(s,false);
  writeln(s);
  readln;
end;

begin
  Test;
end.

Maybe the code in CompressInternal() is incorrect, but it works with browsers AFAIK.
If we do CompressDeflate(,true) + CompressDeflate(,false), it works, and browsers accept the content as deflate.

I discovered that your content is NOT deflate-encoded, but zlib-encoded.

If you use CompressZLib() instead of CompressDeflate(), it works as expected...
Wireshark is able to identify the content as zlib-encoded.

In fact, the org.jboss.netty.util.internal.jzlib code initializes the stream like this:

       return deflateInit(level, JZlib.MAX_WBITS);

So it creates ZLIB content, not raw DEFLATE.

To create true DEFLATE content, it should state:

       return deflateInit(level, -JZlib.MAX_WBITS);

See http://www.zlib.net/manual.html :

windowBits can also be -8..-15 for raw deflate. In this case, -windowBits determines the window size. deflate() will then generate raw deflate data with no zlib header or trailer, and will not compute an adler32 check value.
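
As an illustration of what that zlib header looks like, the following sketch decodes the first two bytes of the captured data (78 9c) according to the RFC 1950 header layout. It is only a breakdown of those two bytes, not part of SynZip:

```pascal
program ZLibHeaderFields;

{$APPTYPE CONSOLE}

const
  CMF = $78; // first byte of the captured data
  FLG = $9C; // second byte of the captured data
begin
  Writeln('CM     = ', CMF and $0F);              // 8: compression method "deflate"
  Writeln('CINFO  = ', CMF shr 4);                // 7: 32K window (2^(7+8))
  Writeln('FDICT  = ', (FLG shr 5) and 1);        // 0: no preset dictionary
  Writeln('FLEVEL = ', FLG shr 6);                // 2: default compression level
  Writeln('check  = ', (CMF * 256 + FLG) mod 31); // 0: header checksum is valid
end.
```

A raw deflate stream has no such two-byte header (and no adler32 trailer), which is why inflating this buffer with negative windowBits fails.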

Sounds like a bug in your version of JBoss, or a misconception on your side of deflate / zlib encodings.

#5 2014-07-17 08:14:12

Bacchus
Member
Registered: 2014-07-16
Posts: 5

Re: Bug in SynCrtSock: TWinHTTP.InternalReadData, string or bytearray

ab wrote:

Are you sure your buffer is real "deflate" content?

After some (long) investigation on our side, we found out the following.

When we use the regular ZLib library, as shipped with Delphi, we get the same error:
[...]

Sounds like a bug in your version of JBoss, or a misconception on your side of deflate / zlib encodings.

Thank you for your great effort in helping me and my apologies for accusing your software of having a bug. Since Synopse/WinHttp is the "new" part in the chain, replacing Indy, I incorrectly supposed the problem must be in the new part. I tried interpreting the data as gzip but must have made another mistake there.

Last edited by Bacchus (2014-07-17 08:14:49)

#6 2014-07-17 13:24:04

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,655

Re: Bug in SynCrtSock: TWinHTTP.InternalReadData, string or bytearray

If you enable gzip compression, there should not be any problem.

There is often confusion between deflate and zlib content encoding, on both clients (i.e. browsers are not consistent) and servers (like JBoss, as it appeared here).

With gzip content encoding, you should not have any problem.

#7 2014-07-18 11:54:05

Bacchus
Member
Registered: 2014-07-16
Posts: 5

Re: Bug in SynCrtSock: TWinHTTP.InternalReadData, string or bytearray

..unless the server incorrectly specifies "Content-Encoding: deflate" for the gzip-encoded data. It's not actually a JBoss issue btw, but some custom software I'll have to fight our Java dev about. :) SynCrtSock imho rightfully assumes the header is accurate about the datatype, but it clearly isn't.

#8 2014-07-18 13:06:55

Bacchus
Member
Registered: 2014-07-16
Posts: 5

Re: Bug in SynCrtSock: TWinHTTP.InternalReadData, string or bytearray

Still I have something that bugs me, though I'm not sure if it's a true error somewhere:
The server specifies a header "Content-Encoding: deflate" but sends zlib-data. That is: I can successfully decode it with Synopse's CompressZLib. In the final application I have been able to implement that by hacking "result := 'deflate';" in CompressZLib. On the other hand I cannot find a spec for HTTP-headers that speaks of a zlib-content-encoding. So how should I know if data is zlib- or deflate-encoded?

So far I have a working application but there must be a "proper" way to distinguish between those, without hacking into SynZip?
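
One possible approach, sketched here under assumptions, is to sniff the first two bytes for a plausible RFC 1950 header and choose the windowBits value accordingly. LooksLikeZLib and InflateWindowBits are hypothetical helper names, not SynZip functions, and the check is only a heuristic: a raw deflate stream could in rare cases start with bytes that pass it.

```pascal
program SniffDeflate;

{$APPTYPE CONSOLE}

const
  MAX_WBITS = 15; // 32K LZ77 window

// True if Data starts with a plausible zlib (RFC 1950) header
function LooksLikeZLib(const Data: AnsiString): Boolean;
var
  cmf, flg: Integer;
begin
  Result := False;
  if Length(Data) < 2 then
    Exit;
  cmf := Ord(Data[1]);
  flg := Ord(Data[2]);
  Result := ((cmf and $0F) = 8) and                 // CM = deflate
            ((cmf shr 4) <= 7) and                  // window size up to 32K
            ((((cmf shl 8) or flg) mod 31) = 0);    // header checksum
end;

// windowBits value one could pass to inflateInit2_
function InflateWindowBits(const Data: AnsiString): Integer;
begin
  if LooksLikeZLib(Data) then
    Result := MAX_WBITS    // zlib-wrapped stream
  else
    Result := -MAX_WBITS;  // assume raw deflate
end;

var
  s: AnsiString;
begin
  SetLength(s, 2);
  s[1] := AnsiChar($78);
  s[2] := AnsiChar($9C);
  Writeln(InflateWindowBits(s)); // 15: the captured data has a zlib header
  s[1] := AnsiChar($2B);
  Writeln(InflateWindowBits(s)); // -15: no zlib header, treated as raw deflate
end.
```

The result could then be used as the Bits argument of a routine like the CompressInternal() posted earlier in this thread, instead of hard-coding one format per Content-Encoding value.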

#9 2014-07-18 13:29:17

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,655

Re: Bug in SynCrtSock: TWinHTTP.InternalReadData, string or bytearray

I found this:

wikipedia wrote:

Another problem found while deploying HTTP compression on large scale is due to the deflate encoding definition: while HTTP 1.1 defines the deflate encoding as data compressed with deflate (RFC 1951) inside a zlib formatted stream (RFC 1950), Microsoft server and client products historically implemented it as a "raw" deflated stream,[16] making its deployment unreliable.[17][18] For this reason, some software, including the Apache HTTP Server, only implement gzip encoding.
Source: http://en.wikipedia.org/wiki/HTTP_compression

In fact, deflate/zlib are implemented differently depending on the browser...
Even Mark Adler states this in http://stackoverflow.com/a/9186091/458259

So IMHO you should just stick with gzip, if you need some cross-platform encoding success...

I've changed the default HTTP compression to use gzip instead of deflate/zlib for our mORMot HTTP client and server units...
See http://synopse.info/fossil/info/d15be450aae1

#10 2014-07-21 20:42:44

mpv
Member
From: Ukraine
Registered: 2012-03-24
Posts: 1,571

Re: Bug in SynCrtSock: TWinHTTP.InternalReadData, string or bytearray

For sure - I never see deflate in production. For example, nginx mod_deflate actually adds gzip compression, and deflate compression is totally disabled because of browser and proxy problems.

#11 2014-07-21 21:31:36

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,655

Re: Bug in SynCrtSock: TWinHTTP.InternalReadData, string or bytearray

Yes, gzip is our new default.
