#1 2018-01-20 19:25:59

loki
Member
Registered: 2018-01-20
Posts: 8

Base64Decode must raise exception if invalid characters are founded

Just one remark, i think that Base64Decode in syncommon.pas must raise an exception if invalid characters are founded in the input instead an returning random data

Offline

#2 2018-01-20 19:39:35

loki
Member
Registered: 2018-01-20
Posts: 8

Re: Base64Decode must raise exception if invalid characters are founded

NOTE: also the purepascal implementation of Base64Encode is close to 2x more faster than the ASM implementation (I test it on tokyo)

Offline

#3 2018-01-20 20:04:49

loki
Member
Registered: 2018-01-20
Posts: 8

Re: Base64Decode must raise exception if invalid characters are founded

this is the modification i did to have the best performance and the exception is invalid char founded in the input :

type
  TBase64Enc = array[0..63] of AnsiChar;
  TBase64Dec = array[AnsiChar] of shortint;
const
  b64enc: TBase64Enc =
    'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

var
  /// a conversion table from Base64 text into binary data
  // - used by Base64ToBin/IsBase64 functions
  // - contains -1 for invalid char, -2 for '=', 0..63 for b64enc[] chars
  ConvertBase64ToBin: TBase64Dec;

const
  sInvalidbase64String = 'Invalid base64 string';

{********************************************************************************************}
function Base64AnyDecode(const decode: TBase64Dec; sp,rp: PAnsiChar; len: NativeInt): boolean;
var c, ch: NativeInt;
begin
  result := false;
  while len>=4 do begin
    c := decode[sp[0]];
    if c<0 then
      exit;
    c := c shl 6;
    ch := decode[sp[1]];
    if ch<0 then
      exit;
    c := (c or ch) shl 6;
    ch := decode[sp[2]];
    if ch<0 then
      exit;
    c := (c or ch) shl 6;
    ch := decode[sp[3]];
    if ch<0 then
      exit;
    c := c or ch;
    rp[2] := AnsiChar(c);
    c := c shr 8;
    rp[1] := AnsiChar(c);
    c := c shr 8;
    rp[0] := AnsiChar(c);
    dec(len,4);
    inc(rp,3);
    inc(sp,4);
  end;
  if len>=2 then begin
    c := decode[sp[0]];
    if c<0 then
      exit;
    c := c shl 6;
    ch := decode[sp[1]];
    if ch<0 then
      exit;
    if len=2 then
      rp[0] := AnsiChar((c or ch) shr 4) else begin
      c := (c or ch) shl 6;
      ch := decode[sp[2]];
      if ch<0 then
        exit;
      c := (c or ch) shr 2;
      rp[1] := AnsiChar(c);
      rp[0] := AnsiChar(c shr 8);
    end;
  end;
  result := true;
end;

{*******************************************************************}
function Base64EncodeMain(rp, sp: PAnsiChar; len: cardinal): integer;
var i: integer;
    c: cardinal;
begin
  result := len div 3;
  for i := 1 to result do begin
    c := ord(sp[0]) shl 16 + ord(sp[1]) shl 8 + ord(sp[2]);
    rp[0] := b64enc[(c shr 18) and $3f];
    rp[1] := b64enc[(c shr 12) and $3f];
    rp[2] := b64enc[(c shr 6) and $3f];
    rp[3] := b64enc[c and $3f];
    inc(rp,4);
    inc(sp,3);
  end;
end;

{*******************************************************}
procedure Base64Decode(sp,rp: PAnsiChar; len: NativeInt);
begin
  len := len shl 2; // len was the number of 4 chars chunks in sp
  if (len>0) and (ConvertBase64ToBin[sp[len-2]]>=0) then
    if ConvertBase64ToBin[sp[len-1]]>=0 then else
      dec(len) else
      dec(len,2); // adjust for Base64AnyDecode() algorithm
  if not Base64AnyDecode(ConvertBase64ToBin,sp,rp,len) then
    raise Exception.Create(sInvalidbase64String);
end;

{***********************************************************************}
procedure Base64EncodeTrailing(rp, sp: PAnsiChar; len: cardinal); inline;
var c: cardinal;
begin
  case len of
    1: begin
      c := ord(sp[0]) shl 4;
      rp[0] := b64enc[(c shr 6) and $3f];
      rp[1] := b64enc[c and $3f];
      rp[2] := '=';
      rp[3] := '=';
    end;
    2: begin
      c := ord(sp[0]) shl 10 + ord(sp[1]) shl 2;
      rp[0] := b64enc[(c shr 12) and $3f];
      rp[1] := b64enc[(c shr 6) and $3f];
      rp[2] := b64enc[c and $3f];
      rp[3] := '=';
    end;
  end;
end;

{*******************************************************}
procedure Base64Encode(rp, sp: PAnsiChar; len: cardinal);
var main: cardinal;
begin
  main := Base64EncodeMain(rp,sp,len);
  Base64EncodeTrailing(rp+main*4,sp+main*3,len-main*3);
end;

{******************************************************}
function BinToBase64Length(len: NativeUInt): NativeUInt;
begin
  result := ((len+2)div 3)*4;
end;

{*******************************************************************}
function Base64ToBinLength(sp: PAnsiChar; len: NativeInt): NativeInt;
begin
  result := 0;
  if (len=0) then exit;
  if (len and 3<>0) then raise Exception.Create(sInvalidbase64String);
  if ConvertBase64ToBin[sp[len-2]]>=0 then
    if ConvertBase64ToBin[sp[len-1]]>=0 then
      result := 0 else
      result := 1 else
      result := 2;
  result := (len shr 2)*3-result;
end;

{***************************************************************************}
procedure Base64ToBin(sp: PAnsiChar; len: NativeInt; var result: AnsiString);
var resultLen: NativeInt;
begin
  resultLen := Base64ToBinLength(sp,len);
  if resultLen=0 then
    result := '' else begin
    SetString(result,nil,resultLen);
    Base64Decode(sp,pointer(result),len shr 2);
  end;
end;

Offline

#4 2018-01-20 20:06:56

loki
Member
Registered: 2018-01-20
Posts: 8

Re: Base64Decode must raise exception if invalid characters are founded

NOTE: to found that the pascal implementation is 2x more faster than the ASM implementation, i run in RELEASE mode

Offline

#5 2018-01-20 22:16:02

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,272
Website

Re: Base64Decode must raise exception if invalid characters are founded

Please don't post such big amount of code in the forum - especially if it is the same Base64AnyDecode() as in SynCommons.pas!
See the rules you accepted when signing in this forum.
Yes, our pascal Base64AnyDecode() version could be faster than asm, for some scenarios, since it has a better CPU pipeline coverage: this pascal version I wrote them some weeks ago, could be more optimized than the straight asm I wrote some years ago!

About the exception, I don't agree with you, since

The first rule is that raising exception should be exceptional - as its name states: exceptional.

See https://synopse.info/files/html/Synopse … l#TITL_165
There are already overloaded functions which track and detect any invalid input, like Base64ToBinSafe().

Offline

#6 2018-01-21 07:56:59

loki
Member
Registered: 2018-01-20
Posts: 8

Re: Base64Decode must raise exception if invalid characters are founded

Base64ToBinSafe is not good to detect error!

function Base64ToBinSafe(sp: PAnsiChar; len: PtrInt; var data: RawByteString): boolean;
begin
  Base64ToBin(sp,len,data);
  result := data<>'';
end;

It's just check if the data <> '' but if the length of the sp look correct and you have invalid char in the middle of the SP then it's will NOT return you an empty data sad

about "The first rule is that raising exception should be exceptional - as its name states: exceptional. "

yes, sure but if Base64ToBin is not a function to return a result (success or not) then it's must return an exception else nothing will stop your program to work with invalid data!

as an example, it's like if strToint can not convert the str to an integer but instead of raising an exception it's return a random integer ....

Offline

#7 2018-01-21 09:21:26

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,272
Website

Re: Base64Decode must raise exception if invalid characters are founded

Please read the doc again.

I have no time to loose arguing.

Offline

#8 2018-01-21 10:03:43

loki
Member
Registered: 2018-01-20
Posts: 8

Re: Base64Decode must raise exception if invalid characters are founded

ok

Offline

#9 2018-01-21 11:46:19

mpv
Member
From: Ukraine
Registered: 2012-03-24
Posts: 1,549
Website

Re: Base64Decode must raise exception if invalid characters are founded

@ab - seems PUREPASCAL version of Base64ToBinSafe have a problem (SynCommons line 27540. Sorry, can't give a http ref to github since SynCommons is huge) - it calls base64toBin which calculate length using Base64ToBinLength instead of Base64ToBinLengthSafe

@loki - it's really impossible to read a huge code you post. You can clone a mORMot repo, create a brunch inside your cloned repo, make a fix and post here only a reference to your fixed brunch. In this case we can easy see a difference. See for example how other contributors create his pull requests

Offline

#10 2018-01-21 15:08:10

loki
Member
Registered: 2018-01-20
Posts: 8

Re: Base64Decode must raise exception if invalid characters are founded

mvp, yes sorry about the code i post, it's even close to the same as the original but just with very little modification. I just added exception in case the input string is incorrect because as opposite to what Arnaud B. think, me i think it's a fundamental mistake to late a process to continue to work with false/random data, and it's can also open some breach in the security. But i will not argue about it, so forget my code i already did a fork of it for my own need.

the pascal implementation of base64 work very fast and it's a very great job !

Offline

#11 2018-01-21 17:05:38

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,272
Website

Re: Base64Decode must raise exception if invalid characters are founded

Offline

#12 2018-01-21 17:27:11

loki
Member
Registered: 2018-01-20
Posts: 8

Re: Base64Decode must raise exception if invalid characters are founded

great job Arnaud!
and again congratulation for you excellent work !!

Offline

Board footer

Powered by FluxBB