#1 2020-01-21 11:47:23

jaclas
Member
Registered: 2014-09-12
Posts: 215

PasZip has error in ZLIB compression algo?

I wrote simple test procedure:


uses PasZip;

procedure Test;
var
  i: Integer;
  j: Integer;
  InBuffer : array of AnsiChar;
  OutBuffer : array of AnsiChar;
  OutDecompress : array of AnsiChar;
  OutSize: Integer;
begin
  // init data and structures
  SetLength(InBuffer, 1000);
  SetLength(OutBuffer, 1000);
  SetLength(OutDecompress, 1000);
  // fill InBuffer with the repeating pattern 'A'..'J': ('A', 'B', ..., 'J', 'A', 'B', ..., 'J', 'A', ... etc.)
  for i := 0 to 99 do
    for j := 0 to 9 do
      InBuffer[i * 10 + j] := AnsiChar(65 + j);

  //compress InBuffer => OutBuffer
  OutSize := CompressMem(@InBuffer[0], @OutBuffer[0], Length(InBuffer), 1000);  // OutSize after compress has value 23

  //decompress OutBuffer  => OutDecompress
  OutSize := UnCompressMem(@OutBuffer[0], @OutDecompress[0], OutSize, 1000)
end;

After the last line, OutSize has the value 1000 (which is correct), BUT the OutDecompress buffer is filled with zeroes, not with the original values ABCD...

What am I doing wrong?

#2 2020-01-22 14:11:53

Eugene Ilyin
Member
From: milky_way/orion_arm/sun/earth
Registered: 2016-03-27
Posts: 132

Re: PasZip has error in ZLIB compression algo?

@jaclas,

It seems the UnCompressMem routine in PasZip has an issue: it processes only the first 2 bytes of the compressed data and then exits, reporting success.

The solution is simple: use the SynZip unit, which is based on the zlib library, instead of the PasZip unit. It is more recent, better maintained, gives you control over the compression level, and lets you choose the compression method (Deflate, GZip or ZLib).

You don't need to change anything in your code (the parameter order and types are the same).

program DeflateTest;

{$APPTYPE CONSOLE}

uses
  SynZip,
//  PasZip, // Uncomment to check PasZip CompressMem
  SysUtils;

var
  Index, Size: Integer;
  Data, Compressed, Decompressed: array of AnsiChar;

begin
  SetLength(Data, 1000);
  SetLength(Compressed, 1000);
  SetLength(Decompressed, 1000);

  for Index := Low(Data) to High(Data) do
    Data[Index] := AnsiChar(Ord('A') + Index mod 10);

  Size := CompressMem(Data, Compressed, Length(Data), Length(Compressed));
  Writeln('Compressed size: ', Size);

  Size := UnCompressMem(Compressed, Decompressed, Size, Length(Decompressed));
  Writeln('Decompressed size: ', Size);
  Writeln('Equality: ', CompareMem(Data, Decompressed, Length(Data)));
  Readln;
end.

Don't set the compressed buffer size equal to the source data size.
You must add some extra bytes for cases where the compressed output requires more bytes than the source data (this can happen when you compress high-entropy random data or already-compressed data such as SynLZ archives, PNG, PDF, DOCX, etc.). Additionally, you have to reserve some extra bytes for the compression header overhead.

If you change AnsiChar(65 + j) to AnsiChar(Random(255)), you will see that compression fails because the compressed data requires at least 1005 bytes on average (set the compressed buffer to 2000, for example, and check the actual compressed size).
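
The random-data experiment described above can be tried with a small console program. This is only a sketch: it assumes SynZip is on the unit search path and that CompressMem accepts the same four arguments as in the example above. It uses Random(256) rather than Random(255) so that every byte value can occur, and it casts the arrays to Pointer explicitly.

```pascal
program RandomDataTest;

{$APPTYPE CONSOLE}

uses
  SynZip,
  SysUtils;

var
  Index, Size: Integer;
  Data, Compressed: array of AnsiChar;

begin
  Randomize;
  SetLength(Data, 1000);
  // Deliberately oversized output buffer (2000 bytes) so the result always fits
  SetLength(Compressed, 2000);

  // High-entropy input: random bytes instead of the repeating 'A'..'J' pattern
  for Index := Low(Data) to High(Data) do
    Data[Index] := AnsiChar(Random(256));

  Size := CompressMem(Pointer(Data), Pointer(Compressed),
    Length(Data), Length(Compressed));
  Writeln('Source size: ', Length(Data));
  Writeln('Compressed size: ', Size);
  Readln;
end.
```

Comparing the two printed sizes shows whether a 1000-byte output buffer would have been large enough for this input.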

The maximum memory that (GZip, ZLib or Deflate) compression may require is computed in CompressInternal, in the SynZip unit (SynZip.pas:5368):

procedure CompressInternal(var Data: ZipString; Compress, ZLib: boolean);
...
    DataLen: integer;
begin
...
  DataLen := length(Data);
...
    SetString(Data,nil,DataLen+256+DataLen shr 3); // max mem required

So a 12.5% overhead plus an additional 256 bytes for the header (that is, Size + Size shr 3 + 256) is a good heuristic for the maximum size of any compressed data (it also covers a possible later switch from Deflate to GZip):

begin
  SetLength(Data, 1000);
  for Index := Low(Data) to High(Data) do
    Data[Index] := AnsiChar(Ord('A') + Index mod 10);

  Size := Length(Data);
  SetLength(Compressed, Size + Size shr 3 + 256);
  Size := CompressMem(Data, Compressed, Length(Data), Length(Compressed));
  Writeln('Compressed size: ', Size);

  SetLength(Decompressed, Length(Data));
  Size := UnCompressMem(Compressed, Decompressed, Size, Length(Decompressed));
  Writeln('Decompressed size: ', Size);
  Writeln('Equality: ', CompareMem(Data, Decompressed, Length(Data)));
  Readln;
end.
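
The buffer-sizing rule could also be wrapped in a small helper so it is not repeated at every call site. The sketch below is illustrative only: MaxCompressedSize is a hypothetical name, not a SynZip function, and it simply applies the same Size + Size shr 3 + 256 formula.

```pascal
program BufferSizeSketch;

{$APPTYPE CONSOLE}

// Hypothetical helper (not part of SynZip): upper bound for the output
// buffer, using the DataLen + 256 + DataLen shr 3 rule from CompressInternal.
function MaxCompressedSize(SourceSize: Integer): Integer;
begin
  Result := SourceSize + SourceSize shr 3 + 256;
end;

begin
  // For a 1000-byte source: 1000 + 125 + 256 = 1381
  Writeln(MaxCompressedSize(1000));
end.
```

With such a helper, the allocation becomes SetLength(Compressed, MaxCompressedSize(Length(Data))).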

Last edited by Eugene Ilyin (2020-01-22 16:22:16)
