#1 2015-07-16 00:14:48

Celso
Member
Registered: 2013-05-14
Posts: 55

Fast read a big file

Hello ab.

I need to develop a routine to read huge files (47 GB).
This routine must process the file line by line.
I made a version using ReadLn, but it was very slow and, moreover, could not read special characters on the line.

So I am trying to write a routine like this:

const
  BUFF_SIZE = $8000;

var
   datafile    : array[0..BUFF_SIZE-1] of AnsiChar;

   myEOF       : Boolean;

   v_string,
   v_result    : AnsiString;

   hFile       : THandle;

   dwread      : LongWord;

   I           : Integer;


if OpenDialog1.Execute then
begin
   hFile := CreateFile(PChar(OpenDialog1.FileName),
                       GENERIC_READ,
                       FILE_SHARE_READ or FILE_SHARE_WRITE,
                       nil,
                       OPEN_EXISTING,
                       FILE_ATTRIBUTE_READONLY,
                       0);
   if hFile = INVALID_HANDLE_VALUE then
      Exit;  // could not open the file
   SetFilePointer(hFile, 0, nil, FILE_BEGIN);

   myEOF := False;
   I := 0;  // line counter
   repeat
      ReadFile(hFile, datafile, BUFF_SIZE, dwread, nil);
      if dwread <> BUFF_SIZE then
         myEOF := True;  // short read = last chunk of the file

      // copy exactly dwread bytes: the buffer is not #0-terminated,
      // so a plain "v_string := datafile" would read past the data
      SetString(v_string, PAnsiChar(@datafile), dwread);
      SetLength(v_result, Length(v_string));  // output buffer, used later

      // count every #13 (CR) in the current chunk
      asm
         PUSH EDI            // inline asm must preserve EDI
         MOV EDI,v_string    // EDI -> first byte of the chunk
         MOV EAX,13          // AL = CR, the byte to scan for
         MOV ECX,dwread      // ECX = bytes left to scan
         CLD
         JECXZ @done         // empty chunk: nothing to do
      @again:
         REPNE SCASB         // scan until AL is found or ECX = 0
         JNE @done           // stopped without a match: chunk exhausted
         INC I               // found one more line
         OR ECX,ECX
         JNZ @again          // bytes left: keep scanning
      @done:
         POP EDI
      end;
   until myEOF;
   CloseHandle(hFile);
end;

This code works well and is fast: it counts the number of lines in the file.
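
For reference, the asm block above can also be written as a plain Pascal loop (an untested sketch using the same variables, plus a local index b: Integer):

// same counting without assembler
for b := 0 to Integer(dwread)-1 do
   if datafile[b] = #13 then
      Inc(I);   // one more CR = one more line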

Now I need help changing the assembler routine. The idea is to split out each line so it can be
processed individually, like this:

asm
   MOV EDI,v_result   // EDI -> output buffer for the current line
   MOV ESI,v_string   // ESI -> input chunk
   MOV EAX,13
   MOV ECX,dwread     // ECX = bytes left in the chunk
   CLD

   @again:
      MOV AL,[ESI]    // fetch the next input byte
      INC ESI
      CMP AL,13       // end of line (CR)?
      JL @process
      OR ECX,ECX      // chunk exhausted?
      JZ @done
      MOV [EDI],AL    // copy the byte into the current line
      INC EDI
      JMP @again

   @process:
    ......            // process the completed line here, then continue
    JMP @again

   @done:
end;

This code is not working.
Can you help me?


#2 2015-07-16 05:29:04

Junior/RO
Member
Registered: 2011-05-13
Posts: 207

Re: Fast read a big file

I think that you need memory mapped streams.

See http://synopse.info/files/html/api-1.18 … REAMMAPPED

There is also a TMemoryMapText class in SynCommons.pas, which is "much faster than TStringList.LoadFromFile()".
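
If I read the API right, usage would be something like this (an untested sketch: the file name and DoSomethingWith() are placeholders for your own code):

uses
   SynCommons;

var
   map: TMemoryMapText;
   i: integer;
begin
   map := TMemoryMapText.Create('huge.txt');  // maps the file, no full copy in RAM
   try
      for i := 0 to map.Count-1 do
         DoSomethingWith(map.Lines[i]);       // Lines[i] is a PUTF8Char to line i
   finally
      map.Free;
   end;
end;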

Which is better, @ab?

Last edited by Junior/RO (2015-07-16 05:32:43)


#3 2015-07-16 12:17:14

Celso
Member
Registered: 2013-05-14
Posts: 55

Re: Fast read a big file

Thanks for your reply.

But the problem is that my file is 42 GB. Calling "v_synfile := TSynMemoryStreamMapped.Create(OpenDialog1.FileName);" raises an error.

In the function "function TMemoryMap.Map(aFile: THandle; aCustomSize: cardinal; aCustomOffset: Int64): boolean;" there is:

   if (fFileSize <= 0) or (fFileSize > maxint) then
     // Maxint = $7FFFFFFF = 1.999 GB (2 GB would induce PtrInt errors)


#4 2015-07-16 15:19:50

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,272
Website

Re: Fast read a big file

Yes, we map the whole file into memory, so it won't work for your purpose, at least in a 32-bit app.

Just read the file in memory chunks, then process them.
If you read the file sequentially from start to end, memory-mapped files won't help: a plain TFileStream.Read() into a memory buffer would be more efficient.
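
Something like this, for instance (a rough, untested sketch: ProcessLine() is a placeholder for your own per-line routine, and lines are assumed to end with #13#10 or #10):

uses
   Classes, SysUtils;

procedure ProcessHugeFile(const aFileName: TFileName);
const
   BUFF_SIZE = $8000;
var
   F: TFileStream;
   buf: array[0..BUFF_SIZE-1] of AnsiChar;
   pending, line: AnsiString;   // carries a partial line between two chunks
   n, i, start: integer;
begin
   F := TFileStream.Create(aFileName, fmOpenRead or fmShareDenyNone);
   try
      pending := '';
      repeat
         n := F.Read(buf, BUFF_SIZE);
         start := 0;
         for i := 0 to n-1 do
            if buf[i] = #10 then              // LF terminates a line
            begin
               SetString(line, PAnsiChar(@buf[start]), i-start);
               line := pending + line;
               pending := '';
               if (line <> '') and (line[Length(line)] = #13) then
                  SetLength(line, Length(line)-1);   // trim the trailing CR
               ProcessLine(line);             // your own processing here
               start := i+1;
            end;
         if start < n then                    // keep the unfinished tail
         begin
            SetString(line, PAnsiChar(@buf[start]), n-start);
            pending := pending + line;
         end;
      until n < BUFF_SIZE;                    // short read = end of file
      if pending <> '' then
         ProcessLine(pending);                // last line without a line break
   finally
      F.Free;
   end;
end;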

