Hi, ab
I'm working on a project that extracts zipped files online, and I found two issues with TZipRead.
1. TZipRead demands too large a WorkingMem to extract files when the file info is stored in a data descriptor.
I must set WorkingMem to the full file size to make it work; even half of that fails.
TZipRead.Create(BufZip: PByteArray; Size: PtrInt; Offset: Int64);
  // ...
  if e^.localoffs >= Offset then
  begin
    // can unzip directly from existing memory buffer
    e^.local := @BufZip[Int64(e^.localoffs) - Offset];
    with e^.local^.fileInfo do
      if flags and FLAG_DATADESCRIPTOR <> 0 then
        // crc+sizes in "data descriptor" -> call RetrieveFileInfo()
        if (zcrc32 <> 0) or
           (zzipSize <> 0) or
           (zfullSize <> 0) then
          raise ESynZip.CreateUtf8('%.Create: data descriptor (MacOS) with ' +
            'sizes for % %', [self, e^.zipName, fFileName]);
  // ...
In the constructor, BufZip must contain the local file header of an entry (even the first one) to set up Entry.local; otherwise we must call RetrieveFileInfo to get the local header.
function TZipRead.RetrieveFileInfo(Index: integer;
  out Info: TFileInfoFull): boolean;
  // ...
  if e^.local = nil then
  begin
    local.DataSeek(fSource, e^.localoffs + fSourceOffset);
    if local.fileInfo.flags and FLAG_DATADESCRIPTOR <> 0 then
      raise ESynZip.CreateUtf8('%: increase WorkingMem for data descriptor ' +
        '(MacOS) support on % %', [self, e^.zipName, fFileName]);
    Info.localfileheadersize := local.Size;
  end
  else
  begin
    Info.localfileheadersize := e^.local^.Size;
    if e^.local^.fileInfo.flags and FLAG_DATADESCRIPTOR <> 0 then
      // ...
But RetrieveFileInfo() then raises an exception, because Entry.local is nil!
Maybe we should try to set up Entry.local first, since this step was skipped in the constructor?
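To make the flag check above concrete, here is a minimal standalone sketch in Python (not mORMot code) of reading the fixed 30-byte part of a ZIP local file header and testing general-purpose bit 3, the "data descriptor" flag. The function names parse_local_header and sizes_deferred are hypothetical, chosen only for this illustration.

```python
import struct

# Fixed 30-byte part of a ZIP local file header (little-endian):
# signature, version needed, flags, method, mod time, mod date,
# crc32, compressed size, uncompressed size, name length, extra length
LOCAL_HEADER = struct.Struct("<IHHHHHIIIHH")
LOCAL_SIG = 0x04034B50
FLAG_DATADESCRIPTOR = 0x0008  # general-purpose bit 3


def parse_local_header(buf, offset=0):
    """Hypothetical helper: decode one local file header found at `offset`."""
    (sig, _version, flags, method, _time, _date,
     crc, zip_size, full_size, name_len, extra_len) = LOCAL_HEADER.unpack_from(buf, offset)
    if sig != LOCAL_SIG:
        raise ValueError("not a local file header")
    start = offset + LOCAL_HEADER.size
    return {
        "name": bytes(buf[start:start + name_len]),
        "flags": flags,
        "method": method,
        "crc32": crc,
        "zip_size": zip_size,
        "full_size": full_size,
        "header_size": LOCAL_HEADER.size + name_len + extra_len,
        "has_data_descriptor": bool(flags & FLAG_DATADESCRIPTOR),
    }


def sizes_deferred(info):
    """True when crc and sizes must be read from elsewhere
    (the data descriptor or the central directory)."""
    return (info["has_data_descriptor"]
            and info["crc32"] == 0
            and info["zip_size"] == 0
            and info["full_size"] == 0)
```

When sizes_deferred returns True, the local header alone is not enough to know where the compressed data ends, which is why a reader needs either the central directory or the trailing descriptor.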
2. Sometimes the charset of the file names is not set correctly in zip files; in that case TZipRead.NameToIndex does not work well.
Can I specify a default encoding when I open a file, and have it used (e.g. UTF8) instead of OemToFileName when the ansi7 detection fails?
Best regards.
Last edited by uian2000 (2021-12-09 05:45:22)
Offline
1. This is because the ZIP was created on Mac, I guess.
There is a limitation about those files.
Can you propose a pull request?
2. We had a lot of discussion and some fixes about charset of filenames.
It appears that it is not very well defined, and sometimes some files do not follow the APPNOTE.
Check e.g. https://synopse.info/forum/viewtopic.php?id=6052
What do you propose?
Offline
1. I'll do some tests and make a PR if I can fix it.
2. For TZipRead only, I think adding a FaverEncode param to the constructor might be a good option.
Most of the time, a zip file is built with a single charset, so letting the user handle non-standard files does make sense.
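As a rough illustration of that proposal, here is a standalone Python sketch (not the mORMot API) of a name decoder with a caller-supplied fallback. The function decode_zip_name and its default_encoding parameter are hypothetical, and cp437 is only an assumed stand-in for the OEM code page.

```python
FLAG_UTF8 = 0x0800  # general-purpose bit 11: file name is stored as UTF-8


def decode_zip_name(raw, flags, default_encoding="cp437"):
    """Hypothetical helper: decode a raw ZIP entry name.

    Order of attempts:
    1. honour the UTF-8 flag when the archive sets it;
    2. pure 7-bit names need no guessing;
    3. otherwise use the encoding chosen by the caller;
    4. fall back to cp437 if that encoding rejects the bytes.
    """
    if flags & FLAG_UTF8:
        return raw.decode("utf-8", errors="replace")
    if all(b < 0x80 for b in raw):
        return raw.decode("ascii")
    try:
        return raw.decode(default_encoding)
    except UnicodeDecodeError:
        return raw.decode("cp437")
```

With such a parameter, an archive that stores UTF-8 names without setting the flag can still be searched by name, as long as the caller passes "utf-8" when opening it.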
Offline
I've made a PR: https://github.com/synopse/mORMot2/pull/69. Please see if it works.
Regards.
Offline
Please try https://github.com/synopse/mORMot2/comm … 2f64b38f1c
I tried to reduce the memory consumption.
The pull request reads the whole file into memory just to read the last few bytes, which may be resource-consuming.
Offline
I couldn't find a good size to reduce memory use myself; I'll try this one.
Thanks ab.
Offline
The issues you reported should be fixed by https://github.com/synopse/mORMot2/commit/316f3c01
Offline
Hi, ab.
Thanks for your fix; it's efficient and works for me.
I've dug up a new issue.
According to TZipRead.Create(Buf...), a directory is not counted as an Entry.
constructor TZipRead.Create(BufZip: PByteArray; Size: PtrInt; Offset: Int64);
  ...
  if P[-1] = fZipNamePathDelim then
  begin
    h := hnext;
    continue; // ignore void folder entry
  end;
  ...
But when we search for the data descriptor of a file that is followed by a directory, the result is the descriptor of that directory, not of the file.
[local file header n] (file n) <-- Entry[n].localoffs
[zipped file data n]
[data descriptor n]
[local file header n+1] (directory after target file)
[zipped file data n+1]
[data descriptor n+1]
[local file header n+2] (file under nearby directory) <-- Entry[n+1].localoffs
[zipped file data n+2]
[data descriptor n+2]
In this case RetrieveFileInfo will return false.
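One way to picture the problem is the following standalone Python sketch (not how mORMot necessarily fixes it). It assumes that the end of a file's data is found from the next local header in physical order, counting folder records too, and that the descriptor is the 12 bytes (16 with the optional signature) just before that boundary. The names descriptor_boundaries and read_descriptor are hypothetical.

```python
import struct

DESCRIPTOR_SIG = 0x08074B50  # optional data descriptor signature


def descriptor_boundaries(entries, central_dir_offset):
    """entries: (name, local_offset) for ALL central directory records,
    folders included. Returns, for each file, the offset where its
    data descriptor ends (the next local header, or the central directory)."""
    offsets = sorted(off for _, off in entries)
    ends = {}
    for name, off in entries:
        if name.endswith("/"):
            continue  # folders are not exposed as entries...
        later = [o for o in offsets if o > off]
        # ...but their offsets still delimit the previous file
        ends[name] = later[0] if later else central_dir_offset
    return ends


def read_descriptor(buf, end):
    """Return (crc32, zip_size, full_size) from the non-ZIP64 descriptor
    that ends at offset `end`."""
    if end >= 16:
        sig, crc, zip_size, full_size = struct.unpack_from("<IIII", buf, end - 16)
        if sig == DESCRIPTOR_SIG:
            return crc, zip_size, full_size
    return struct.unpack_from("<III", buf, end - 12)
```

If the folder at n+1 is dropped before the boundaries are computed, the boundary for file n becomes the header at n+2, and the bytes read just before it belong to the folder's descriptor instead.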
Regards
Last edited by uian2000 (2021-12-29 11:38:21)
Offline
I thought a directory has no data, so no descriptor.
Anyway, I have tried to fix MacOS / DataDescriptor ZIP with folders in https://github.com/synopse/mORMot2/commit/c019517d
Offline
I have tried this commit, and it truly works.
Thanks for your great work!
Last edited by uian2000 (2022-01-02 10:14:57)
Offline