#1 2022-08-04 17:22:47

ma64
Member
Registered: 2011-01-27
Posts: 12

Small addition to get the location of an error in a JSON string

Hello Arnaud,

I was recently faced with the task of showing the first error in an invalid JSON string, and I came up with the following solution.

  1. In mormot.core.json I put a “var” in front of the parameter of the function TJsonGotoEndParser.GotoEnd() so that the index of the character that triggered the parser error gets returned:

    function TJsonGotoEndParser.GotoEnd(var P: PUtf8Char): PUtf8Char;
  2. Again in mormot.core.json I added an overloaded version of the function IsValidJsonBuffer() that returns the index of the erroneous JSON character in the third parameter:

    function IsValidJsonBuffer(P: PUtf8Char; strict: boolean; var ErrorPosition: uint64): boolean;
    var
      parser: TJsonGotoEndParser;
      pstart: putf8char;
    begin
      {%H-}parser.Init(strict, nil);
      pstart := P;
      result := parser.GotoEnd(P) <> nil;
      if not result then
        ErrorPosition := P - pstart;
    end;
  3. These small changes made it possible to write the following code that shows a small portion of the JSON surrounding the erroneous character as well as a marker pointing at that character:

    class function TJsonValidator.ValidateFile(const JsonFileName: string): boolean;
    var
      content: putf8char;
      mapping: THandle;
      fileSize: uint64;
      errPos, excStartPos, excEndPos: uint64;
      excerpt, marker: utf8string;
    begin
      // validate the file passed as parameter, not a global setting
      content := MapFile(JsonFileName, mapping, fileSize);
      try
        errPos := 0;
        result := IsValidJsonBuffer(pointer(content), true, errPos);
        WriteLn('');
        if result then
          WriteLn('File contains valid JSON.')
        else begin
          WriteLn('File contains invalid JSON:');
          // keep the error position inside the file (MapFile guarantees fileSize > 0)
          if errPos >= fileSize then
            errPos := fileSize - 1;
          // errPos is unsigned: errPos - 20 would wrap around for small values
          if errPos > 20 then
            excStartPos := errPos - 20
          else
            excStartPos := 0;
          // index of the last excerpt byte, clamped to the last byte of the file
          excEndPos := Min(errPos + 20, fileSize - 1);
          SetLength(excerpt, excEndPos - excStartPos + 1);
          WriteLn('');
          Move((content + excStartPos)^, excerpt[1], Length(excerpt));
          WriteLn(Utf8ToString(excerpt));
          marker := StringOfChar(ansichar(' '), Length(excerpt));
          marker[errPos - excStartPos + 1] := '^';
          WriteLn(Utf8ToString(marker));
        end;
      finally
        UnmapFile(mapping, content); // release the view and the mapping handle
      end;
    end;
    
    class function TJsonValidator.MapFile(const FileName: string; out Mapping: THandle; out Size: uint64): putf8char;
    var
      fileHandle: THandle;
      lFileSize: LARGE_INTEGER;
      content: putf8char;
    begin
      if not FileExists(FileName) then
        raise Exception.Create('File not found');
      fileHandle := FileOpen(FileName, fmOpenRead or fmShareDenyWrite);
      Win32Check(fileHandle <> INVALID_HANDLE_VALUE); // FileOpen reports failure with INVALID_HANDLE_VALUE, not 0
      try
        lFileSize.LowPart := GetFileSize(fileHandle, @lFileSize.HighPart);
        Size := puint64(@lFileSize)^;
        if Size = 0 then
          raise Exception.Create('File is empty');
        Mapping := CreateFileMapping(fileHandle, nil, PAGE_READONLY, 0, 0, nil);
        Win32Check(Mapping <> 0);
      finally
        FileClose(fileHandle);
      end;
      result := MapViewOfFile(Mapping, FILE_MAP_READ, 0, 0, 0);
      Win32Check(result <> nil);
    end;
    
    class procedure TJsonValidator.UnmapFile(Mapping: THandle; Buffer: putf8char);
    begin
      UnmapViewOfFile(Buffer);
      CloseHandle(Mapping);
    end;

Do you think this would be a helpful addition to mORMot?

#2 2022-08-04 18:28:50

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,661

Re: Small addition to get the location of an error in a JSON string

1) Making the parameter a "var P" noticeably reduces the performance of the function.
Since this function is used in several very important places in the framework, I can't make this modification.
Adding a LastReadPosition: PUtf8Char field to TJsonGotoEndParser may be a better option, filling it e.g. when jtComma is reached. It would give you an approximation of the error position, with no performance penalty.

2) Using a memory-mapped file is not a good idea here.
First of all, it is not good for performance. Reading the file in chunks is always faster because it causes fewer CPU context switches.
It also introduces a dependency on Windows (note that there is something cross-platform in mormot.core.os).
And since you use the version with P and no size, it will parse everything until a #0 is found, which may lie beyond the end of the mapped pages, so it would randomly trigger a GPF...

So I guess another way should be found for your issue.
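
To illustrate the LastReadPosition idea from point 1, here is a minimal sketch. It is not the mORMot parser: TToyParser and ErrorOffset are hypothetical names, and the toy grammar only accepts a flat array of unsigned integers. The point is only that P stays a by-value parameter while a field remembers the last comma reached, which gives an approximate error position.

```pascal
program LastReadPositionSketch;

{$ifdef FPC}{$mode delphi}{$endif}
{$ifdef MSWINDOWS}{$apptype console}{$endif}

type
  // hypothetical toy parser: accepts only a flat array of unsigned integers
  TToyParser = record
    LastReadPosition: PAnsiChar; // last position known to be valid
    function GotoEnd(P: PAnsiChar): PAnsiChar;
  end;

function TToyParser.GotoEnd(P: PAnsiChar): PAnsiChar;
begin
  result := nil; // nil means "invalid input", as in the thread above
  LastReadPosition := P;
  if P^ <> '[' then
    exit;
  inc(P);
  repeat
    if not (P^ in ['0'..'9']) then
      exit; // error: a digit was expected
    while P^ in ['0'..'9'] do
      inc(P);
    if P^ = ']' then
    begin
      result := P + 1; // success: points just after the closing bracket
      exit;
    end;
    if P^ <> ',' then
      exit; // error: a comma or a closing bracket was expected
    LastReadPosition := P; // cheap bookkeeping, done once per item
    inc(P);
  until false;
end;

// returns -1 if Json is accepted, else the offset of the last comma
// (or of the start of the text) reached before the error
function ErrorOffset(const Json: AnsiString): integer;
var
  parser: TToyParser;
  start: PAnsiChar;
begin
  result := -1;
  start := PAnsiChar(Json);
  if parser.GotoEnd(start) = nil then
    result := parser.LastReadPosition - start;
end;

begin
  WriteLn(ErrorOffset('[1,2,3]')); // -1
  WriteLn(ErrorOffset('[1,2,x]')); // 4
end.
```

In the second call the offending character is at offset 5, while the reported offset 4 is the preceding comma: an approximation, as described above, but enough to print an excerpt around the error.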

#3 2022-08-05 09:59:04

ma64
Member
Registered: 2011-01-27
Posts: 12

Re: Small addition to get the location of an error in a JSON string

Ok, thanks. I will try to find a better solution based on your helpful comments.
