Hello ab.
I need to develop a routine to read huge files (47GB).
The routine must process the file line by line.
I first wrote it using ReadLn, but it was very slow and, moreover, could not read special characters in a line.
So I am trying to write a routine like this:
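const
  BUFF_SIZE = $8000;
var
  datafile : array [0..BUFF_SIZE-1] of AnsiChar;
  myEOF    : Boolean;
  v_string,
  v_result : AnsiString;   // byte strings, so the asm can scan them with SCASB
  hFile    : THandle;
  dwread   : LongWord;
  I        : Integer;      // number of CR (#13) bytes found so far
begin
  if OpenDialog1.Execute then
  begin
    hFile := CreateFile(PChar(OpenDialog1.FileName),
                        GENERIC_READ,
                        FILE_SHARE_READ or FILE_SHARE_WRITE,
                        nil,
                        OPEN_EXISTING,
                        FILE_ATTRIBUTE_READONLY,
                        0);
    if hFile = INVALID_HANDLE_VALUE then
      Exit;
    SetFilePointer(hFile, 0, nil, FILE_BEGIN);
    I := 0;
    myEOF := False;
    while not myEOF do
    begin
      // A failed read is treated like end of file
      if not ReadFile(hFile, datafile, BUFF_SIZE, dwread, nil) then
        dwread := 0;
      if dwread <> BUFF_SIZE then
        myEOF := True;
      // Copy exactly dwread bytes; a plain assignment would stop at the first #0
      SetString(v_string, PAnsiChar(@datafile[0]), dwread);
      SetLength(v_result, Length(v_string));
      asm
        MOV   ECX, dwread
        MOV   EDX, v_string   // load the locals before touching the stack
        PUSH  EDI             // EDI must be preserved inside an asm statement
        MOV   EDI, EDX
        XOR   EDX, EDX        // EDX counts the CRs found in this chunk
        MOV   AL, 13
        CLD
      @again:
        JECXZ @done           // nothing left to scan
        REPNE SCASB
        JNE   @done           // chunk exhausted without another CR
        INC   EDX
        JMP   @again
      @done:
        POP   EDI
        ADD   I, EDX
      end;
    end;
    CloseHandle(hFile);
  end;
end;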
This code works well and is fast: it counts the number of lines in the file.
Now I need help changing the assembler routine. The idea is to separate the rows so that each can be
processed individually, like this:
asm
MOV EDI,v_result
MOV ESI,v_string
MOV EAX,13
MOV ECX,dwread
CLD
@again:
MOV AL, [ESI]
INC ESI
CMP AL,13
JL @process
OR ECX,ECX
JZ @done
MOV [EDI],AL
INC EDI
JMP @again
@process:
......
JMP @again
@done:
end;
This code is not working.
Can you help me?
Offline
I think that you need memory mapped streams.
See http://synopse.info/files/html/api-1.18 … REAMMAPPED
There is also a TMemoryMapText class, "much faster than TStringList.LoadFromFile()", in SynCommons.pas
Which is better, @ab?
Last edited by Junior/RO (2015-07-16 05:32:43)
Offline
Thanks for your reply.
The problem is that my file is 42GB. Calling "v_synfile := TSynMemoryStreamMapped.Create(OpenDialog1.FileName);" raises an error.
The function "function TMemoryMap.Map(aFile: THandle; aCustomSize: cardinal; aCustomOffset: Int64): boolean;" contains this check:
if (fFileSize <= 0) or (fFileSize > maxint) then
/// maxint = $7FFFFFFF = 1.999 GB (2GB would induce PtrInt errors)
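To make the arithmetic behind that check concrete, here is a minimal sketch. FitsInSingleMap is a hypothetical helper written for this illustration, not part of SynCommons.pas; it only mirrors the size test quoted above, and it assumes "42GB" means 42 * 1024^3 bytes.

```pascal
program MapLimit;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper: same condition as the quoted check, inverted.
function FitsInSingleMap(Size: Int64): Boolean;
begin
  Result := (Size > 0) and (Size <= MaxInt); // MaxInt = $7FFFFFFF = 2147483647
end;

const
  // Assumption: "42GB" taken as 42 * 1024^3 = 45097156608 bytes
  BigFileSize: Int64 = Int64(42) * 1024 * 1024 * 1024;

begin
  if FitsInSingleMap(BigFileSize) then
    WriteLn('can be mapped in one piece')
  else
    WriteLn('too large to map in one piece');
end.
```

The file is roughly 21 times larger than MaxInt, so the check rejects it before any mapping is attempted.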
Offline
Yes, we map the whole file in memory, so it won't work for your purpose.
At least for a 32 bit app.
Just read the file in memory chunks, then process them.
If you read the file from start to end, using memory mapped files won't help: plain TFileStream.Read() into a memory buffer would be more efficient.
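A minimal sketch of the chunked approach described above, in plain Pascal rather than assembler. It is only an illustration under stated assumptions: ForEachLine, TLineProc and CountLine are hypothetical names, lines are assumed to end with LF or CR LF, and each line is handed over as raw bytes in an AnsiString. A line that straddles two chunks is accumulated in Carry until its terminator arrives.

```pascal
program ChunkedLines;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

const
  CHUNK_SIZE = $8000; // same 32 KB buffer size as in the question

type
  // Hypothetical callback type: receives one complete line (raw bytes)
  TLineProc = procedure(const Line: AnsiString);

var
  LineCount: Int64 = 0;

// Example callback: only counts the lines
procedure CountLine(const Line: AnsiString);
begin
  Inc(LineCount);
end;

// Removes one trailing CR, so CR LF files yield clean lines
procedure StripCR(var S: AnsiString);
begin
  if (S <> '') and (S[Length(S)] = #13) then
    SetLength(S, Length(S) - 1);
end;

// Reads Stream chunk by chunk and calls OnLine once per line
procedure ForEachLine(Stream: TStream; OnLine: TLineProc);
var
  Buf: array[0..CHUNK_SIZE - 1] of AnsiChar;
  Carry, Piece: AnsiString;
  BytesRead, LineStart, P: Integer;
begin
  Carry := '';
  repeat
    BytesRead := Stream.Read(Buf, SizeOf(Buf));
    LineStart := 0;
    for P := 0 to BytesRead - 1 do
      if Buf[P] = #10 then
      begin
        // Bytes from LineStart up to (not including) the LF complete a line
        SetString(Piece, PAnsiChar(@Buf[LineStart]), P - LineStart);
        Carry := Carry + Piece;
        StripCR(Carry);
        OnLine(Carry);
        Carry := '';
        LineStart := P + 1;
      end;
    // Keep the unterminated tail of this chunk for the next round
    if LineStart < BytesRead then
    begin
      SetString(Piece, PAnsiChar(@Buf[LineStart]), BytesRead - LineStart);
      Carry := Carry + Piece;
    end;
  until BytesRead = 0;
  // Last line without a trailing line break
  if Carry <> '' then
  begin
    StripCR(Carry);
    OnLine(Carry);
  end;
end;

var
  FS: TFileStream;
begin
  if ParamCount < 1 then
  begin
    WriteLn('usage: ChunkedLines <file>');
    Halt(1);
  end;
  FS := TFileStream.Create(ParamStr(1), fmOpenRead or fmShareDenyNone);
  try
    ForEachLine(FS, CountLine);
  finally
    FS.Free;
  end;
  WriteLn('lines: ', LineCount);
end.
```

Only one chunk plus the current partial line is held in memory at any time, so the file size does not matter to a 32 bit process. Replacing CountLine with another procedure of type TLineProc is enough to process each row individually; the string concatenation per line is kept simple here and could be optimized if profiling shows it matters.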
Offline