#1 2011-06-03 09:18:41

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,182

Fast JSON parsing

When parsing (textual) content, two approaches are usually considered. In the XML world, you usually have to choose between:
- A DOM parser, which creates an in-memory tree structure of objects mapping the XML content;
- A SAX parser, which reads the XML content and calls pre-defined events for each XML content element.

In fact, DOM parsers internally use a SAX parser to read the XML content. Because of the overhead of creating objects and initializing their properties, DOM parsers are typically three to five times slower than SAX. But DOM parsers are much more powerful for handling the data: once it is mapped into native objects, code can reach any given node almost immediately, whereas SAX-based access has to read the whole XML content again.

Most JSON parsers available in Delphi use a DOM-like approach. For instance, the DBXJSON unit included since Delphi 2010 and the SuperObject library both create a class instance for each JSON node.

In a JSON-based Client-Server ORM like ours, profiling shows that a lot of time is spent in JSON parsing, on both the Client and the Server side. Therefore, we tried to optimize this part of the library.

To achieve the best speed, we use a mixed approach:
- All the necessary conversion (e.g. un-escaping text) is done in memory, from and within the JSON buffer, to avoid memory allocation;
- The parser returns pointers to the converted elements (just like the vtd-xml library).

In practice, here is how it is implemented:
- A private copy of the source JSON data is made internally (so that the Client-Side method used to retrieve this data can safely free all allocated memory);
- The source JSON data is parsed and replaced, in the same internal buffer, by its un-escaped UTF-8 text content (for example, strings are un-escaped and a #0 terminator is added at the end of each field value; numerical values remain text-encoded in place, and are extracted into Int64 or double only if needed);
- Since data is replaced in memory (JSON is a bit more verbose than plain UTF-8 text, so the decoded content always fits), no memory allocation is performed during parsing: the whole process is very fast, not noticeably slower than a SAX approach;
- This heavily profiled code (using pointers and hand-tuned loops) results in very fast parsing and conversion.
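
To make the in-place idea concrete, here is a minimal sketch of un-escaping one JSON string inside its own buffer. It is not the SynCommons code: UnescapeInPlace is a hypothetical helper written for this illustration, it uses plain AnsiString/PAnsiChar instead of RawUTF8/PUTF8Char, and it does not handle \uXXXX escapes. It only shows why no allocation is needed: the write pointer never gets ahead of the read pointer.

```pascal
program InPlaceUnescape;
{$ifdef FPC}{$mode delphi}{$else}{$apptype console}{$endif}

// Hypothetical helper (NOT the SynCommons implementation).
// Decodes the JSON string starting at P^ = '"' inside its own buffer and
// returns a pointer to the #0-terminated result.
// Next is set just after the closing quote, or to nil on malformed input.
// \uXXXX escapes are not handled in this sketch.
function UnescapeInPlace(P: PAnsiChar; out Next: PAnsiChar): PAnsiChar;
var
  Src, Dst: PAnsiChar;
begin
  Result := nil;
  Next := nil;
  if (P = nil) or (P^ <> '"') then
    Exit;
  Src := P + 1;   // read position
  Dst := Src;     // write position: never ahead of Src
  while Src^ <> '"' do
  begin
    if Src^ = #0 then
      Exit;       // unexpected end of buffer
    if Src^ = '\' then
    begin
      Inc(Src);
      case Src^ of
        #0:  Exit;
        'n': Dst^ := #10;
        'r': Dst^ := #13;
        't': Dst^ := #9;
        'b': Dst^ := #8;
        'f': Dst^ := #12;
      else
        Dst^ := Src^;   // covers \" \\ and \/
      end;
    end
    else
      Dst^ := Src^;
    Inc(Src);
    Inc(Dst);
  end;
  Dst^ := #0;           // terminate the value inside the buffer
  Next := Src + 1;
  Result := P + 1;
end;

var
  Json: AnsiString;
  Value, Next: PAnsiChar;
begin
  Json := '"say \"hi\"\tnow", 42';
  UniqueString(Json);   // private, writable copy of the source data
  Value := UnescapeInPlace(PAnsiChar(Json), Next);
  if Value <> nil then
    WriteLn(Value);
end.
```

The UniqueString call plays the role of the private copy mentioned in the first step: after it, the buffer can be overwritten safely.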

This parsing "magic" is done in the GetJSONField function, as defined in the SynCommons.pas unit:

/// decode a JSON field in an UTF-8 encoded buffer (used in TSQLTableJSON.Create)
// - this function decodes in the P^ buffer memory itself (no memory allocation
// or copy), for faster process - so take care that it's an unique string
// - PDest points to the next field to be decoded, or nil on any unexpected end
// - null is decoded as nil
// - '"strings"' are decoded as 'strings'
// - strings are JSON unescaped (and \u0123 is converted to UTF-8 chars)
// - any integer value is left as its ascii representation
// - wasString is set to true if the JSON value was a "string"
// - works for both field names or values (e.g. '"FieldName":' or 'Value,')
// - EndOfObject (if not nil) is set to the JSON value char (',' ':' or '}' e.g.)
function GetJSONField(P: PUTF8Char; out PDest: PUTF8Char;
  wasString: PBoolean=nil; EndOfObject: PUTF8Char=nil): PUTF8Char;

This function lets you iterate through the whole JSON buffer, retrieving values or property names, and checking the returned EndOfObject value to follow the JSON structure.
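
The iteration pattern can be sketched without the framework as follows. NextToken is a hypothetical, much simplified stand-in for GetJSONField (no escapes, no nested objects or arrays, AnsiChar-based types), written only to show how a caller walks name/value pairs and uses the returned delimiter to know when the object ends.

```pascal
program IterateFields;
{$ifdef FPC}{$mode delphi}{$else}{$apptype console}{$endif}

// Hypothetical stand-in for GetJSONField (NOT the real function):
// returns the next name or value as a #0-terminated pointer inside the
// buffer itself.
// - Next receives the position after the delimiter, or nil on error
// - EndChar receives the delimiter following the token (':' ',' or '}')
// - escapes, nested objects and arrays are not handled in this sketch
function NextToken(P: PAnsiChar; out Next: PAnsiChar;
  out EndChar: AnsiChar): PAnsiChar;
var
  TokEnd, Delim: PAnsiChar;
begin
  Result := nil;
  Next := nil;
  EndChar := #0;
  if P = nil then
    Exit;
  while P^ in [#1..' '] do
    Inc(P);                        // skip leading blanks
  if P^ = '"' then
  begin
    Inc(P);
    TokEnd := P;
    while not (TokEnd^ in [#0, '"']) do
      Inc(TokEnd);
    if TokEnd^ = #0 then
      Exit;                        // unterminated string
    Delim := TokEnd + 1;
  end
  else
  begin
    TokEnd := P;                   // number, true, false or null
    while not (TokEnd^ in [#0..' ', ':', ',', '}']) do
      Inc(TokEnd);
    Delim := TokEnd;
  end;
  while Delim^ in [#1..' '] do
    Inc(Delim);
  if not (Delim^ in [':', ',', '}']) then
    Exit;                          // unexpected end or character
  EndChar := Delim^;               // read the delimiter before overwriting
  TokEnd^ := #0;                   // terminate the token in place
  Next := Delim + 1;
  Result := P;
end;

var
  Json: AnsiString;
  P, Name, Value: PAnsiChar;
  EndChar: AnsiChar;
begin
  Json := '{"Customer_No": 1, "Name": "Cahaya Harapan"}';
  UniqueString(Json);              // the buffer is modified in place
  P := PAnsiChar(Json) + 1;        // skip the opening '{'
  repeat
    Name := NextToken(P, P, EndChar);
    if (Name = nil) or (EndChar <> ':') then
      Break;
    Value := NextToken(P, P, EndChar);
    if Value = nil then
      Break;
    WriteLn(Name, ' = ', Value);
  until EndChar = '}';
end.
```

With this input the loop prints one "name = value" line per field and stops when the delimiter is '}'. The number stays in its text form, as described above.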

This in-place parsing of textual content is one of the main reasons why we used UTF-8 (via RawUTF8) as the common string type in our framework, and not the generic string type, which would have introduced a memory allocation and a charset conversion.

#2 2011-06-03 09:21:46

ab

Re: Fast JSON parsing

You could refer to http://stackoverflow.com/questions/5304 … -in-delphi about JSON / XML parsing speed and concepts.



For instance, here is how JSON content is converted into SQL, as fast as possible:

function GetJSONObjectAsSQL(var P: PUTF8Char; const Fields: TRawUTF8DynArray;
  Update, InlinedParams: boolean): RawUTF8;
(...)
    // get "COL1"="VAL1" pairs, stopping at '}' or ']'
    FieldsCount := 0;
    repeat
      FU := GetJSONField(P,P);
      inc(Len,length(FU));
      if P=nil then break;
      Fields2[FieldsCount] := FU;
      Values[FieldsCount] := GetValue; // update EndOfObject
      inc(FieldsCount);
    until EndOfObject in [#0,'}',']'];
    Return(@Fields2,@Values,InlinedParams);
  (...)

The GetValue sub-function also relies on GetJSONField:

function GetValue: RawUTF8;
var wasString: boolean;
    res: PUTF8Char;
begin
  res := P;
  if (PInteger(res)^ and $DFDFDFDF=NULL_DF) and (res[4] in [#0,',','}',']'])  then
    /// GetJSONField('null') returns '' -> check here to make a diff with '""'
    result := 'null' else begin
    // any JSON string or number or 'false'/'true' in P:
    res := GetJSONField(res,P,@wasString,@EndOfObject);
    if wasString then
      if not InlinedParams and
         (PInteger(res)^ and $00ffffff=JSON_BASE64_MAGIC) then
        // \uFFF0base64encodedbinary -> 'X''hexaencodedbinary'''
        // if not inlined, it can be used directly in INSERT/UPDATE statements
        result := Base64MagicToBlob(res+3) else
        { escape SQL strings, cf. the official SQLite3 documentation }
        result := QuotedStr(pointer(res),'''') else
      result := res;
  end;
  Inc(Len,length(result));
end;


This code creates one string for each key and value in the Fields2[] and Values[] arrays, but only once, with its final content (even single-quote escaping and BLOB decoding from Base-64 are performed directly from the JSON buffer).
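
As a rough illustration of the quoting step only, the sketch below joins already-decoded names and values into an INSERT statement. BuildInsert is a hypothetical helper, not part of the framework; it uses the generic string type and the standard SysUtils.QuotedStr function, and it concatenates step by step instead of computing the total length first as the code above does.

```pascal
program QuoteSqlValue;
{$ifdef FPC}{$mode delphi}{$else}{$apptype console}{$endif}

uses
  SysUtils;

// Hypothetical helper for illustration only (NOT the framework code).
// String values are quoted with SysUtils.QuotedStr, which doubles any
// embedded single quote; other values keep their text form.
// Table and column names are assumed to be trusted here.
function BuildInsert(const Table: string;
  const Names, Values: array of string;
  const IsString: array of Boolean): string;
var
  i: Integer;
  Cols, Vals: string;
begin
  Cols := '';
  Vals := '';
  for i := 0 to High(Names) do
  begin
    if i > 0 then
    begin
      Cols := Cols + ',';
      Vals := Vals + ',';
    end;
    Cols := Cols + Names[i];
    if IsString[i] then
      Vals := Vals + QuotedStr(Values[i])
    else
      Vals := Vals + Values[i];
  end;
  Result := 'INSERT INTO ' + Table + ' (' + Cols + ') VALUES (' + Vals + ')';
end;

begin
  // prints: INSERT INTO Customers (Customer_No,Name) VALUES (1,'O''Brien')
  WriteLn(BuildInsert('Customers', ['Customer_No', 'Name'],
    ['1', 'O''Brien'], [False, True]));
end.
```

The IsString array plays the role of the wasString flag: only values that were JSON strings get quoted.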

#3 2015-07-17 15:06:03

Cahaya
Member
Registered: 2015-06-21
Posts: 36

Re: Fast JSON parsing

Hi AB,

How do I use GetJSONField? Is there an example? Is there a function to convert JSON into a DML SQL statement?
I don't want an ID field in my mORMot RESTful service.

For example, the browser sends JSON such as '{"Customers": [{"Customer_No":1,"Name":"Cahaya Harapan"}] }', and I want to turn it into DML, something like "INSERT INTO Customers VALUES (?,?)" or "UPDATE Customers SET Name = ? WHERE CUSTOMER_NO = ?"

Thank you very much.

#4 2015-07-17 15:31:49

ab

Re: Fast JSON parsing

This is the purpose of the TJSONObjectDecoder record, as defined in mORMot.pas.

See http://synopse.info/files/html/api-1.18 … ECTDECODER

For your purpose, just use the GetJSONObjectAsSQL() function.

#5 2015-07-17 16:42:57

Cahaya

Re: Fast JSON parsing

Hi AB,

Thank you for your fast response. By the way, is it possible to use mORMot RESTful without an ID field? Or is there an option for it?
If I'm not wrong, mORMot raises an error if a table in the database does not have an ID field.

Thank you.
