#1 2020-02-25 03:19:32

Eugene Ilyin
Member
From: milky_way/orion_arm/sun/earth
Registered: 2016-03-27
Posts: 132
Website

Variant from a JSON UTF-8 text as per RFC 8259, RFC 7159, RFC 7158

Hi,
From mORMot documentation:

With _Json() or _JsonFmt(), either a document or array variant instance will be initialized with data supplied as JSON.
The supplied JSON can be either in strict JSON syntax.

Right now mORMot partially supports JSON as defined by the obsolete RFC 4627 (created 13 years ago):

JSON-text = object / array

Are there any plans to support the current RFC 8259 (created 2 years ago) or RFC 7159 (created 6 years ago):

  JSON-text = ws value ws

  value = false / null / true / object / array / number / string

  false = %x66.61.6c.73.65   ; false

  null  = %x6e.75.6c.6c      ; null

  true  = %x74.72.75.65      ; true

The changes are not radical and extend the current implementation with full backward compatibility (except perhaps for 'null', which is currently parsed as '{}').

What is needed is for the following JSON checks to hold:

TVarData(_Json('null')).VType = varNull

TVarData(_Json('false')).VType = varBoolean
TVarData(_Json('true')).VType = varBoolean

VarIsStr(_Json('"text"')) = True

VarIsFloat(_Json('0.5')) = True // Currency by default (why not make all floats Double?)
VarIsFloat(_Json('-1E-10')) = True // Single precision
VarIsFloat(_Json('-1e-300')) = True // Double precision
VarIsFloat(_Json('-1E-010')) = True // Single, exponent with a leading zero
VarIsFloat(_Json('-1e-0300')) = True // Double, exponent with a leading zero

VarIsOrdinal(_Json('0')) = True // Integer
VarIsOrdinal(_Json('5000000000')) = True // Int64
VarIsOrdinal(_Json('-0')) = True // Integer
VarIsOrdinal(_Json('-5000000000')) = True // Int64

Browsers have no issues with:

console.log(JSON.parse('null'), JSON.parse('false'), JSON.parse('"text"'), JSON.parse('-1E-300'));
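
For reference, here is a minimal JavaScript sketch of the behaviour I would expect from an RFC 8259 parser: every top-level value, not only an object or an array, parses to a value of the matching kind. The helper name `kindOf` is made up for this illustration; only `JSON.parse` is a standard call.

```javascript
// Hypothetical helper: name the JSON kind of an already parsed value.
function kindOf(value) {
  if (value === null) return 'null';
  if (Array.isArray(value)) return 'array';
  return typeof value; // 'boolean', 'number', 'string' or 'object'
}

// Each sample is a complete JSON text under RFC 8259.
const samples = ['null', 'false', '"text"', '-1E-300', '0.5', '{}', '[]'];
for (const text of samples) {
  console.log(text, '->', kindOf(JSON.parse(text)));
}
```

The same table of inputs could serve as a checklist for any parser that claims to accept arbitrary JSON values.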

Btw, _Json('{"n": 1E3}') and _Json('{"n": -1e-0300}') are valid strict RFC 4627 JSON, but mORMot parses such valid numbers as strings:

  number = [ minus ] int [ frac ] [ exp ]
  decimal-point = %x2E       ; .
  digit1-9 = %x31-39         ; 1-9
  e = %x65 / %x45            ; e E
  exp = e [ minus / plus ] 1*DIGIT
  frac = decimal-point 1*DIGIT
  int = zero / ( digit1-9 *DIGIT )
  minus = %x2D               ; -
  plus = %x2B                ; +
  zero = %x30                ; 0
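
The grammar above can be transcribed almost literally into a regular expression. This is only a sketch for checking sample inputs, not how mORMot parses numbers; the names `JSON_NUMBER` and `isJsonNumber` are hypothetical.

```javascript
// number = [ minus ] int [ frac ] [ exp ], transcribed from the ABNF above:
//   int  = zero / ( digit1-9 *DIGIT )
//   frac = decimal-point 1*DIGIT
//   exp  = e [ minus / plus ] 1*DIGIT
const JSON_NUMBER = /^-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?$/;

// Hypothetical helper: true when the whole text is a valid JSON number.
function isJsonNumber(text) {
  return JSON_NUMBER.test(text);
}

// The first four are valid; the last four break the int, frac or sign rules.
for (const text of ['1E3', '-1e-0300', '-0', '0.5', '01', '.5', '1.', '+1']) {
  console.log(text, isJsonNumber(text));
}
```

Note that leading zeros are allowed in the exponent (`-1e-0300`) but not in the integer part (`01`).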

Last edited by Eugene Ilyin (2020-02-26 16:19:52)

#2 2020-02-25 03:39:21

Eugene Ilyin

Btw, from JSON.org:

json
  element

element
  ws value ws

value
  object
  array
  string
  number
  "true"
  "false"
  "null"

And they also support standard exponent notation for numbers, which I frequently get from different 3rd-party APIs:

Last edited by Eugene Ilyin (2020-02-25 17:29:17)

#3 2020-02-25 06:45:49

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,718
Website

1. There is a small confusion about which function to use.
The _Json() function is not for parsing any JSON item. It is to create a TDocVariant document, i.e. by definition it accepts only a JSON object or array as input.
See https://synopse.info/files/html/api-1.1 … html#_JSON
To parse any JSON item, use GetVariantFromJSON/GetVariantFromNotStringJSON/GetNumericVariantFromJSON

2. By default, we don't even accept "double" values, since they are likely to lose precision, so we prefer "currency" if possible.
There is a flag in various functions to accept the "double" values.
See for instance https://synopse.info/files/html/Synopse … l#TITL_194
If you want to keep the exact precision of JSON numbers, we offer TDecimal128 values - BTW we are the only Delphi/FPC JSON library handling this kind of precision-aware floats.
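
To illustrate point 2: a binary double cannot represent most decimal fractions exactly, whereas a fixed-point value with four decimals (Delphi's Currency type is a 64-bit integer scaled by 10,000) stores them exactly. The following is a minimal JavaScript sketch under that assumption, not mORMot code; `SCALE` and `toFixed4` are hypothetical names, and the helper only handles plain decimals without an exponent.

```javascript
const SCALE = 10000n; // four decimal places, as in Delphi's Currency

// Hypothetical helper: parse "123.45" into a BigInt count of 1/10000 units.
// Returns null when the text has an exponent or more than four decimals.
function toFixed4(text) {
  const m = /^(-?)(0|[1-9][0-9]*)(?:\.([0-9]{1,4}))?$/.exec(text);
  if (!m) return null;
  const frac = (m[3] || '').padEnd(4, '0'); // "1" -> "1000"
  const units = BigInt(m[2]) * SCALE + BigInt(frac);
  return m[1] === '-' ? -units : units;
}

// Binary doubles accumulate rounding error on simple decimal sums.
console.log(0.1 + 0.2 === 0.3); // false

// The scaled integers add up exactly.
console.log(toFixed4('0.1') + toFixed4('0.2') === toFixed4('0.3')); // true

// Values outside the fixed-point range need another representation.
console.log(toFixed4('-1e-300')); // null
```

A value such as -1e-300 cannot be held with four decimals at all, which is where an opt-in double (or a decimal type) comes into play.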

Ensure you read https://synopse.info/files/html/Synopse … l#TITL_123

#4 2020-02-25 13:56:48

Eugene Ilyin

ab, thanks for the hints!

After analysing the sources of the functions you mentioned, I think I found how to handle RFC 8259 JSON received from 3rd-party vendor APIs with mORMot (respecting all the peculiarities related to Double values).

Please find the code in the next reply.

Here are the results of _JsonStrict, _JsonStrictFast, _JsonStrictInPlace:

var
  V: Variant;
begin
  V := _JsonStrict('null');           // TDocVariantData(V).VarType = varNull
  V := _JsonStrict('false');          // TDocVariantData(V).VarType = varBoolean
  V := _JsonStrict('true');           // TDocVariantData(V).VarType = varBoolean
  V := _JsonStrict('-0');             // TDocVariantData(V).VarType = varInteger
  V := _JsonStrict('"t1 \r\n t2"');   // TDocVariantData(V).VarType = varString
  V := _JsonStrict('-1E-300');        // TDocVariantData(V).VarType = varDouble
  V := _JsonStrict('{}');             // TDocVariantData(V).Kind = dvObject
  V := _JsonStrict('[]');             // TDocVariantData(V).Kind = dvArray
  V := _JsonStrict('');               // TDocVariantData(V).VarType = varEmpty
  V := _JsonStrict('$%#@');           // TDocVariantData(V).VarType = varNull
  V := _JsonStrict('{"n": -1E-300}'); // TVarData(V._(0)).VType = varDouble
  V := _JsonStrict('9223372036854775807'); // TDocVariantData(V).VarType = varInt64
  V := _JsonStrict('9223372036854775808'); // TDocVariantData(V).VarType = varDouble
end;
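
The integer boundaries in the results above (Integer, then Int64, then Double once the value no longer fits 64 bits) can be sketched in JavaScript as follows. This only mirrors the listed results and is an assumption about the intended rule, not the mORMot implementation; `numberKind` is a hypothetical name, the input is assumed to be an already valid JSON number, and the Currency case is deliberately left out.

```javascript
const INT32_MIN = -(2n ** 31n), INT32_MAX = 2n ** 31n - 1n;
const INT64_MIN = -(2n ** 63n), INT64_MAX = 2n ** 63n - 1n;

// Hypothetical helper: pick the narrowest type name for a JSON number text.
function numberKind(text) {
  // Anything with a fraction or an exponent is treated as a Double here.
  if (!/^-?(?:0|[1-9][0-9]*)$/.test(text)) return 'Double';
  const n = BigInt(text); // exact, no 2^53 limit
  if (n >= INT32_MIN && n <= INT32_MAX) return 'Integer';
  if (n >= INT64_MIN && n <= INT64_MAX) return 'Int64';
  return 'Double'; // too large for Int64
}

for (const text of ['-0', '5000000000', '9223372036854775807',
                    '9223372036854775808', '-1E-300']) {
  console.log(text, '->', numberKind(text));
}
```

BigInt is used for the comparison because a JavaScript number cannot distinguish 9223372036854775807 from 9223372036854775808.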

Maybe you can add this or similar functions to the Framework?

Last edited by Eugene Ilyin (2020-02-25 17:28:16)

#5 2020-02-25 17:11:44

Eugene Ilyin

I tried to make it more mORMot-ish :) and removed all intermediate procs (this code now uses the fastest available calls). Please confirm.

interface

/// retrieve a variant value from a JSON as per RFC 8259, RFC 7159, RFC 7158
// - follows TTextWriter.AddVariant() format (calls GetVariantFromJSON)
// - will instantiate either an Integer, Int64, currency, double or string value
// (as RawUTF8), guessing the best numeric type according to the textual content,
// and string in all other cases, except TryCustomVariants points to some options
// (e.g. @JSON_OPTIONS[true] for fast instance) and input is a known object or
// array, either encoded as strict-JSON (i.e. {..} or [..]), or with some
// extended (e.g. BSON) syntax
// - warning: by default dvoAllowDoubleValue is set and 32-bit floating-point
// conversion is tried, with potential loss of precision during the conversion
// - warning: the JSON buffer will be modified in-place during process - use
// a temporary copy or the overloaded functions with RawUTF8 parameter
// if you need to access it later
function _JsonStrictInPlace(const JSON: PUTF8Char;
  Options: TDocVariantOptions = [dvoReturnNullForUnknownProperty];
  const AllowDouble: Boolean = True): Variant;
  {$ifdef HASINLINE}inline;{$endif}

/// retrieve a variant value from a JSON as per RFC 8259, RFC 7159, RFC 7158
// - follows TTextWriter.AddVariant() format (calls GetVariantFromJSON)
// - will instantiate either an Integer, Int64, currency, double or string value
// (as RawUTF8), guessing the best numeric type according to the textual content,
// and string in all other cases, except TryCustomVariants points to some options
// (e.g. @JSON_OPTIONS[true] for fast instance) and input is a known object or
// array, either encoded as strict-JSON (i.e. {..} or [..]), or with some
// extended (e.g. BSON) syntax
// - this overloaded procedure will make a temporary copy before JSON parsing
// and return the variant as result
// - warning: by default dvoAllowDoubleValue is set and 32-bit floating-point
// conversion is tried, with potential loss of precision during the conversion
function _JsonStrict(const JSON: RawUTF8;
  Options: TDocVariantOptions = [dvoReturnNullForUnknownProperty];
  const AllowDouble: Boolean = True): Variant;
  {$ifdef HASINLINE}inline;{$endif}

/// retrieve a variant value from a JSON as per RFC 8259, RFC 7159, RFC 7158
// - this global function is an handy alias to:
// ! _JsonStrict(JSON,JSON_OPTIONS_FAST,AllowDouble);
// - follows TTextWriter.AddVariant() format (calls GetVariantFromJSON)
// - will instantiate either an Integer, Int64, currency, double or string value
// (as RawUTF8), guessing the best numeric type according to the textual content,
// and string in all other cases, except TryCustomVariants points to some options
// (e.g. @JSON_OPTIONS[true] for fast instance) and input is a known object or
// array, either encoded as strict-JSON (i.e. {..} or [..]), or with some
// extended (e.g. BSON) syntax
// - this overloaded procedure will make a temporary copy before JSON parsing
// and return the variant as result
// - warning: by default dvoAllowDoubleValue is set and 32-bit floating-point
// conversion is tried, with potential loss of precision during the conversion
function _JsonStrictFast(JSON: RawUTF8;
  const AllowDouble: Boolean = True): Variant;
  {$ifdef HASINLINE}inline;{$endif}

implementation

function _JsonStrictInPlace(const JSON: PUTF8Char; Options: TDocVariantOptions;
  const AllowDouble: Boolean): Variant;
var wasString: boolean;
    Val, Dest: PUTF8Char;
begin
  if JSON = nil then
    ZeroFill(@result) // varEmpty
  else if IdemPChar(JSON, 'NULL') then
    TVarData(result).VType := varNull
  else begin
    Dest := JSON;
    if AllowDouble then
      Dest := TDocVariantData(result).InitJSONInPlace(
        Dest, Options + [dvoAllowDoubleValue])
    else
      Dest := TDocVariantData(result).InitJSONInPlace(Dest, Options);
    if Dest = nil then begin
      Dest := JSON;
      Val := GetJSONField(Dest,Dest,@wasString);
      GetVariantFromJSON(Val,wasString,result,nil,AllowDouble);
    end;
  end;
end;

function _JsonStrict(const JSON: RawUTF8; Options: TDocVariantOptions;
  const AllowDouble: Boolean): Variant;
var tmp: TSynTempBuffer;
begin
  tmp.Init(JSON); // temp copy before in-place decoding
  try
    Result := _JsonStrictInPlace(tmp.buf, Options, AllowDouble);
  finally
    tmp.Done;
  end;
end;

function _JsonStrictFast(JSON: RawUTF8; const AllowDouble: Boolean): Variant;
begin
  Result := _JsonStrict(JSON, JSON_OPTIONS_FAST, AllowDouble);
end;

Last edited by Eugene Ilyin (2020-02-25 17:24:00)

#6 2020-02-25 17:29:09

ab

I really don't see what it offers compared to VariantLoadJSON().

Please follow the forum rules.
Don't post such huge pieces of code in the forum.

#7 2020-02-25 17:46:14

Eugene Ilyin

ab wrote:

Please follow the forum rules.
Don't post such huge pieces of code in the forum.

Sorry, the code itself is small; I just added the proc annotations to save you time if you plan to add it :)

The primary difference from VariantLoadJSON() is that it handles all RFC 8259 types in the same manner, including parsing of objects and arrays, like '{}' and '[]', '[1e300]', or '{"n": -1e-300}':

  V := _JSONStrict('{}'); // TDocVariantData(V).Kind = dvObject
  V := VariantLoadJSON('{}'); // TDocVariantData(V).VarType = varNull

  V := _JSONStrict('[]'); // TDocVariantData(V).Kind = dvArray
  V := VariantLoadJSON('[]'); // TDocVariantData(V).VarType = varNull

  V := _JSONStrict('[1e300]'); // TVarData(V._(0)).VType = varDouble;
  V := VariantLoadJSON('[1e300]'); // TDocVariantData(V).VarType = varNull

  V := _JSONStrict('{"n": -1e-300}'); // TVarData(V._(0)).VType = varDouble;
  V := VariantLoadJSON('{"n": -1e-300}'); // TDocVariantData(V).VarType = varNull

Anyway, if you don't see a reason to add a unified parser for all RFC 8259 types, that's OK; I will use it internally.

#8 2020-02-25 19:00:35

ab

I guess that _JsonStrict/_JsonStrictFast naming is very confusing.
All _* functions are explicitly document-oriented, i.e. they generate a TDocVariant array/object document.

Perhaps JsonToVariant()/JsonToVariantInPlace() would make more sense as names.
They could be simple wrappers around VariantLoadJSON(), with the proper default parameters so that it parses what you expect.
You can make a pull request with those functions, if you wish.

#9 2020-02-25 22:49:43

Eugene Ilyin

Please check pull request #274, which adds JSONToVariant.

#10 2020-02-26 15:48:10

ab

Merged, with some fixes.

#11 2020-02-26 16:18:00

Eugene Ilyin

Thanks, works as planned.
