#1 2011-03-08 06:41:05

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 6,017
Website

SynCommons: TDynArray and Record compare/load/save using fast RTTI

The SynCommons unit has been enhanced:
  - new BinToBase64 and Base64ToBin conversion functions;
  - new low-level RTTI functions for handling record types: RecordEquals, RecordSave, RecordSaveLength, RecordLoad;
  - new TDynArray object, which is a wrapper around any dynamic array: you can now access to the dynamic array using TList-like properties and methods, e.g. Count, Add, Insert, Delete, Clear, IndexOf, Find, Sort and some new methods like LoadFromStream, SaveToStream, LoadFrom and SaveTo which allow fast binary serialization of any dynamic array, even containing strings or records - a CreateOrderedIndex method is also available to create individual index according to the dynamic array content.

The main purpose of this modification was made after a question posted in How to store dynamic arrays in a TList?

I've created a wrapper around dynamic array RTTI, in order to provide TList-like properties, and even more features.
Since I wanted to handle dynamic array of records, I also created some low-level fast access to the record content, using RTTI.


1. TList-like properties

type
  TGroup: array of integer;

var 
  Group: TGroup;
  GroupA: TDynArray;
  i, v: integer;
begin
  GroupA.Init(TypeInfo(TGroup),Group); // associate GroupA with Group
  for i := 0 to 1000 do begin
    v := i+1000; // need argument passed as a const variable
    GroupA.Add(v);
  end;
  v := 1500;
  if GroupA.IndexOf(v)<0 then // search by content
    ShowMessage('Error: 1500 not found!');
  for i := GroupA.Count-1 downto 0 do
    if i and 3=0 then
      GroupA.Delete(i); // delete integer at index i
end;

This TDynArray wrapper will work also with array of string or array of records... Records need only to be packed and have only not reference counted fields (byte, integer, double...) or string reference-counted fields (no Variant nor Interface within).

Yes, you read well: it will handle a dynamic array of records, in which you can put some strings or whatever data you need.

The IndexOf() method will search by content. That is e.g. for an array of record, all record fields (including strings) must match.


2. Enhanced features

I've added some methods to the TDynArray record/object, which are not available in a plain TList - with those method, we came closer to the generics implementation:
- now you can save and load a dynamic array content to or from a string (using LoadFromStream/SaveToStream or LoadFrom/SaveTo methods) - it will use a proprietary but very fast binary stream layout;
- and you can sort the dynamic array content by two means: either in-place (i.e. the array elements content is exchanged) or via an external integer index look-up array (using the CreateOrderedIndex method - in this case, you can have several orders to the same data);
- you can specify any custom comparison function, and there is a new Find method will can use fast binary search if available.

Here is how those new methods work:

var
  Test: RawByteString;
...
  Test := GroupA.SaveTo;
  GroupA.Clear;
  GroupA.LoadFrom(Test);
  GroupA.Compare := SortDynArrayInteger;
  GroupA.Sort;
  for i := 1 to GroupA.Count-1 do
    if Group[i]<Group[i-1] then
      ShowMessage('Error: unsorted!');
  v := 1500;
  if GroupA.Find(v)<0 then // fast binary search
    ShowMessage('Error: 1500 not found!');

SaveTo and LoadFrom methods are the purpose of adding in SynCommons a BinToBase64 and Base64ToBin functions: we will use the Base64 encoding to load or save any dynamic array of records from a TSQLRecord property... using SQLite3 blob fields, but Base64 for JSON transmission (much more efficient than hexa).

Still closer to the generic paradigm, faster, and working for Delphi 6 up to XE, without the need of the slow enhanced RTTI...


3. More code and sample

The TTestLowLevelCommon._TDynArray method is the automated unitary tests associated with this wrapper.

You'll find out here samples of array of records and more advanced features, with various kind of data.

Here is the interface part of those new RTTI types and functions:

type
  /// function prototype to be used for TDynArray Sort and Search method
  // - common functions exist for base types: e.g. SortDynArrayByte,
  // SortDynArrayWord, SortDynArrayInteger, SortDynArrayInt64,
  // SortDynArrayDouble, SortDynArrayAnsiString, SortDynArrayAnsiStringI,
  // SortDynArrayString, SortDynArrayStringI
  // - any custom type (even records) can be compared then sort by defining
  // such a custom function
  // - must return 0 if A=B, -1 if A<B, 1 if A>B
  TDynArraySortCompare = function(const A,B): integer;

  TDynArray = object
  protected
    Value: PPointer;
    TypeInfo: pointer;
    ElemSize: PtrUInt;
    ElemType: pointer;
    fCompare: TDynArraySortCompare;
    fCountP: PInteger;
    fSorted: boolean;
    function GetCount: integer; {$ifdef HASINLINE}inline;{$endif}
    procedure SetCount(aCount: integer);
    function GetCapacity: integer;
    procedure SetCapacity(aCapacity: integer);
    procedure SetCompare(const aCompare: TDynArraySortCompare); {$ifdef HASINLINE}inline;{$endif}
  public
    /// initialize the structure with a one-dimension dynamic array
    // - the dynamic array must have been defined with its own type
    // (e.g. TIntegerDynArray = array of Integer)
    // - if aCountPointer is set, it will be used instead of length() to store
    // the dynamic array items count - it will be much faster when adding
    // elements to the array, because the dynamic array won't need to be
    // resized each time - but in this case, you should use the Count property
    // instead of length(array) or high(array) when accessing the data: in fact
    // length(array) will store the memory size reserved, not the items count
    // - if aCountPointer is set, its content will be set to 0, whatever the
    // array length is, or the current aCountPointer^ value is
    // - a sample usage may be:
    // !var DA: TDynArray;
    // !    A: TIntegerDynArray;
    // !begin
    // !  DA.Init(TypeInfo(TIntegerDynArray),A);
    // ! (...)
    // - a sample usage may be (using a count variable):
    // !var DA: TDynArray;
    // !    A: TIntegerDynArray;
    // !    ACount: integer;
    // !    i: integer;
    // !begin
    // !  DA.Init(TypeInfo(TIntegerDynArray),A,@ACount);
    // !  for i := 1 to 100000 do
    // !    DA.Add(i); // MUCH faster using the ACount variable
    // ! (...)   // now you should use DA.Count or Count instead of length(A)
    procedure Init(aTypeInfo: pointer; var aValue; aCountPointer: PInteger=nil);
    /// add an element to the dynamic array
    // - warning: Elem must be of the same exact type than the dynamic array,
    // and must be a reference to a variable (you can't write Add(i+10) e.g.)
    // - returns the index of the added element in the dynamic array
    // - note that because of dynamic array internal memory managment, adding
    // will be a bit slower than e.g. with a TList: the list is reallocated
    // every time a record is added - but in practice, with FastMM4 or
    // SynScaleMM, there is no big speed penalty
    function Add(const Elem): integer;
    /// add an element to the dynamic array at the position specified by Index
    // - warning: Elem must be of the same exact type than the dynamic array,
    // and must be a reference to a variable (you can't write Insert(10,i+10) e.g.)
    procedure Insert(Index: Integer; const Elem);
    /// delete the whole dynamic array content
    procedure Clear;
    /// delete one item inside the dynamic array
    // - the deleted element is finalized if necessary
    procedure Delete(Index: Integer);
    /// search for an element value inside the dynamic array
    // - return the index found (0..Count-1), or -1 if Elem was not found
    // - will search for all properties content of the eLement: TList.IndexOf()
    // searches by address, this method searches by content using the RTTI
    // element description (and not the Compare property function)
    // - use the Find() method if you want the search via the Compare property
    // function, or e.g. to search only with some part of the element content
    // - will work with simple types: binaries (byte, word, integer, Int64,
    // Currency, array[0..255] of byte, packed records with no reference-counted
    // type within...), string types (e.g. array of string), and packed records
    // with binary and string types within (like TFileVersion)
    // - won't work with not packed types (like a shorstring, or a record
    // with byte or word fields with {$A+}): in this case, the padding data
    // (i.e. the bytes between the aligned feeds can be filled as random, and
    // there is no way with standard RTTI do know which they are)
    // - warning: Elem must be of the same exact type than the dynamic array,
    // and must be a reference to a variable (you can't write IndexOf(i+10) e.g.)
    function IndexOf(const Elem): integer;
    /// search for an element value inside the dynamic array
    // - return the index found (0..Count-1), or -1 if Elem was not found
    // - this method will use the Compare property function
    // - if the array is sorted, it will use fast binary search
    // - if the array is not sorted, it will use slower iterating search
    // - warning: Elem must be of the same exact type than the dynamic array,
    // and must be a reference to a variable (you can't write Find(i+10) e.g.)
    function Find(const Elem): integer; overload;
    /// search for an element value inside the dynamic array, from an external
    // indexed lookup table
    // - return the index found (0..Count-1), or -1 if Elem was not found
    // - this method will use a custom comparison function, with an external
    // integer table, as created by the CreateOrderedIndex() method: it allows
    // multiple search orders in the same dynamic array content
    // - if an indexed lookup is supplied, it must already be sorted:
    // this function will then use fast binary search
    // - if an indexed lookup is not supplied (i.e aIndex=nil),
    // this function will use slower but accurate iterating search
    // - warning; the lookup index should be synchronized if array content
    // is modified (in case of adding or deletion)
    function Find(const Elem; const aIndex: TIntegerDynArray;
      aCompare: TDynArraySortCompare): integer; overload;
    /// sort the dynamic array elements, using the Compare property function
    // - it will change the dynamic array content, and exchange all elements
    // in order to be sorted in increasing order according to Compare function
    procedure Sort;
    /// sort the dynamic array elements using a lookup array of indexes
    // - it won't change the dynamic array content: only create or update
    // the given integer lookup array, using the specified comparison function
    // - you should provide either a void either a valid lookup table, that is
    // a table with one to one lookup (e.g. created with FillIncreasing)
    // - if the lookup table has less elements than the main dynamic array,
    // its content will be recreated
    procedure CreateOrderedIndex(var aIndex: TIntegerDynArray; aCompare: TDynArraySortCompare);
    /// save the dynamic array content into a (memory) stream
    // - will handle array of binaries values (byte, word, integer...), array of
    // strings or array of packed records, with binaries and string properties
    // - will use a proprietary binary format, with some variable-length encoding
    // of the string length
    // - Stream position will be set just after the added data
    procedure SaveToStream(Stream: TCustomMemoryStream);
    /// load the dynamic array content from a (memory) stream
    // - stream content must have been created using SaveToStream method
    // - will handle array of binaries values (byte, word, integer...), array of
    // strings or array of packed records, with binaries and string properties
    // - will use a proprietary binary format, with some variable-length encoding
    // of the string length
    procedure LoadFromStream(Stream: TCustomMemoryStream);
    /// save the dynamic array content into an allocated memory buffer
    // - Dest buffer must have been allocated to contain at least the number
    // of bytes returned by the SaveToLength method
    // - return a pointer at the end of the data written in Dest
    function SaveTo(Dest: PAnsiChar): PAnsiChar; overload;
    /// compute the number of bytes needed to save a dynamic array content
    function SaveToLength: integer;
    /// save the dynamic array content into a RawByteString
    function SaveTo: RawByteString; overload;
    /// load the dynamic array content from a memory buffer
    // - return nil if the Source buffer is incorrect
    // - in case of success, return the memory buffer pointer just after the
    // read content
    // - return a pointer at the end of the data read from Source
    function LoadFrom(Source: PAnsiChar): PAnsiChar;
    /// retrieve or set the number of elements of the dynamic array
    // - same as length(DynArray) or SetLenght(DynArray)
    property Count: integer read GetCount write SetCount;
    /// the internal buffer capacity
    // - if no external Count pointer was set with Init, is the same as Count
    // - if an external Count pointer is set, you can set a value to this
    // property before a massive use of the Add() method e.g.
    // - if no external Count pointer is set, set a value to this property
    // will affect the Count value, i.e. Add() will append after this count
    property Capacity: integer read GetCapacity write SetCapacity;
    /// the compare function to be used for Sort and Find methods
    // - by default, no comparison function is set
    // - common functions exist for base types: e.g. SortDynArrayByte,
    // SortDynArrayWord, SortDynArrayInteger, SortDynArrayInt64,
    // SortDynArrayDouble, SortDynArrayAnsiString, SortDynArrayAnsiStringI,
    // SortDynArrayString, SortDynArrayStringI
    property Compare: TDynArraySortCompare read fCompare write SetCompare;
    /// must be TRUE if the array is currently in sorted order according to
    // the compare function
    // - Add/Delete/Insert/Load* methods will reset this property to false
    // - Sort method will set this property to true
    // - you MUST set this property to false if you modify the dynamic array
    // content in your code, so that Find() won't try to use binary search in
    // an usorted array, and miss its purpose
    property Sorted: boolean read fSorted write fSorted;
  end;

/// check equality of two records by content
// - will handle packed records, with binaries (byte, word, integer...) and
// string types properties
// - will use byte-level comparison: it could fail to match two floating-point
// values because of rounding issues (Currency won't have this problem)
function RecordEquals(const RecA, RecB; TypeInfo: pointer): boolean;

/// save a record content into a RawByteString
// - will handle packed records, with binaries (byte, word, integer...) and
// string types properties
// - will use a proprietary binary format, with some variable-length encoding
// of the string length
// - warning: will encode generic string fields as AnsiString (one byte per char)
// prior to Delphi 2009, and as UnicodeString (two bytes per char) since Delphi
// 2009: if you want to use this function between UNICODE and NOT UNICODE
// versions of Delphi, you should use some explicit types like RawUTF8,
// WinAnsiString or even RawUnicode
function RecordSave(const Rec; TypeInfo: pointer): RawByteString; overload;

/// save a record content into a destination memory buffer
// - Dest must be at least RecordSaveLength() bytes long
// - will handle packed records, with binaries (byte, word, integer...) and
// string types properties
// - will use a proprietary binary format, with some variable-length encoding
// of the string length
// - warning: will encode generic string fields as AnsiString (one byte per char)
// prior to Delphi 2009, and as UnicodeString (two bytes per char) since Delphi
// 2009: if you want to use this function between UNICODE and NOT UNICODE
// versions of Delphi, you should use some explicit types like RawUTF8,
// WinAnsiString or even RawUnicode
function RecordSave(const Rec; Dest: PAnsiChar; TypeInfo: pointer): PAnsiChar; overload;

/// compute the number of bytes needed to save a record content
// using the RecordSave() function
// - will return 0 in case of an invalid (not handled) record type (e.g. if
// it contains a variant or a dynamic array)
function RecordSaveLength(const Rec; TypeInfo: pointer): integer;

/// fill a record content from a memory buffer as saved by RecordSave()
// - return nil if the Source buffer is incorrect
// - in case of success, return the memory buffer pointer just after the
// read content
function RecordLoad(var Rec; Source: PAnsiChar; TypeInfo: pointer): PAnsiChar;

/// compare two "array of word" elements
function SortDynArrayWord(const A,B): integer;

/// compare two "array of integer" elements
function SortDynArrayInteger(const A,B): integer;

/// compare two "array of Int64 or array of Currency" elements
function SortDynArrayInt64(const A,B): integer;

/// compare two "array of double" elements
function SortDynArrayDouble(const A,B): integer;

/// compare two "array of AnsiString" elements, with case sensitivity
function SortDynArrayAnsiString(const A,B): integer;

/// compare two "array of AnsiString" elements, with no case sensitivity
function SortDynArrayAnsiStringI(const A,B): integer;

/// compare two "array of string" elements, with case sensitivity
function SortDynArrayString(const A,B): integer;

/// compare two "array of string" elements, with no case sensitivity
function SortDynArrayStringI(const A,B): integer;

Offline

#2 2011-03-12 12:26:25

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 6,017
Website

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

In my TODO list:
- Slice and Reverse methods;
- AddArray(const DynArray) method;
- Search method returning a new dynamic array containing all matching records;
- native JSON serialization (TIntegerDynArray as '[1,2,3]', TRawUTF8DynArray as '["one","two","three"]', array of record as Base64 string field) - this will be used for our ORM framework;
- ...

Offline

#3 2011-03-13 16:17:14

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 6,017
Website

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

I've added external Count feature to speed up Add and Delete methods.

See updated interface above (and the new parameter in Init - and the Capacity property), and the Blog article (which has been refreshed).

Offline

#4 2011-03-14 16:28:31

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 6,017
Website

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

OK - now Slice, Reverse and AddArray methods have been implemented.

And the JSON serialization is working.
New TTextWriter.AddDynArrayJSON and TDynArray.LoadFromJSON methods are available for UTF-8 JSON serialization of dynamic arrays.

  TTextWriter = class
 (...)
/// append a dynamic array content as UTF-8 encoded JSON array
// - expect a dynamic array TDynArray wrapper as incoming parameter
// - TIntegerDynArray, TInt64DynArray, TCardinalDynArray, TDoubleDynArray,
// TCurrencyDynArray, TWordDynArray and TByteDynArray will be written as
// numerical JSON values
// - TRawUTF8DynArray, TWinAnsiDynArray, TRawByteStringDynArray and
// TStringDynArray will be written as escaped UTF-8 JSON strings
// - any other kind of dynamic array (including array of records) will be
// written as Base64 encoded binary stream
procedure AddDynArrayJSON(const DynArray: TDynArray);

and

TDynArray = object
/// load the dynamic array content from an UTF-8 encoded JSON buffer
// - expect the format as saved by TTextWriter.AddDynArrayJSON method, i.e.
// handling TIntegerDynArray, TInt64DynArray, TCardinalDynArray,
// TDoubleDynArray, TCurrencyDynArray, TWordDynArray, TByteDynArray,
// TRawUTF8DynArray, TWinAnsiDynArray, TRawByteStringDynArray and
// TStringDynArray as CSV - and
// other kind of array as Base64 encoded binary stream
// - return a pointer at the end of the data read from P, nil in case
// of an invalid input buffer
// - warning: the content of P^ will be modified during parsing: please
// make a local copy if it will be needed later
function LoadFromJSON(P: PUTF8Char): PUTF8Char;

As you can see, most common dynamic arrays (of byte, word, integer, cardinal, Int64, double, currency, RawUTF8, WinAnsiString, string) will be serialized as a list of JSON elements.
Other not-known dynamic arrays (like any array of packed records) will be serialized as binary, then Base64 encoded.

Next step is to integrate this into our ORM.
But it won't be difficult, because we have now all low-level functions for handling dynamic arrays.

Offline

#5 2011-03-15 17:19:03

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 6,017
Website

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

Here is some code which mimic the generic sample code from http://docwiki.embarcadero.com/CodeExam … y_(Delphi)

First, declaration of the types (more easy/readable) than with generics:

type
  TCity = record
    Name: string;
    Country: string;
    Latitude: double;
    Longitude: double;
  end;
  TCityDynArray = array of TCity;

Then the variables which we will use: one dynamic array, one item, and our wrapper:

var
    City: TCity;
    Cities: TCityDynArray;
    ACities: TDynArray;

Then we create some items:

 
begin 
  ACities.Init(TypeInfo(TCityDynArray),Cities);
  City.Name := 'Iasi';
  City.Country := 'Romania';
  City.Latitude := 47.16;
  City.Longitude := 27.58;
  ACities.Add(City);
  City.Name := 'London';
  City.Country := 'United Kingdom';
  City.Latitude := 51.5;
  City.Longitude := -0.17;
  ACities.Add(City);
  City.Name := 'Buenos Aires';
  City.Country := 'Argentina';
  City.Latitude := 0;
  City.Longitude := 0;
  ACities.Add(City);
  Check(ACities.Count=3);

Then we will use the standard string comparison for search for only the first TCity field, i.e. the name:

  ACities.Compare := SortDynArrayString; // will search by Name = 1st field
  City.Name := 'Iasi';
  Check(ACities.FindAndFill(City)=0);
  Check(City.Name='Iasi');
  Check(City.Country='Romania');
  CheckSame(City.Latitude,47.16);
  CheckSame(City.Longitude,27.58);
  Check(ACities.FindAndDelete(City)=0);
  Check(City.Name='Iasi');
  Check(ACities.Find(City)<0);
  City.Name := 'Buenos Aires';
  City.Country := 'Argentina';
  City.Latitude := -34.6;
  City.Longitude := -58.45;
  Check(ACities.FindAndUpdate(City)>=0);
  City.Latitude := 0;
  City.Longitude := 0;
  Check(City.Name='Buenos Aires');
  Check(ACities.FindAndFill(City)>=0);
  CheckSame(City.Latitude,-34.6);
  CheckSame(City.Longitude,-58.45);
  Check(ACities.FindAndAddIfNotExisting(City)>=0);
  City.Name := 'Iasi';
  City.Country := 'Romania';
  City.Latitude := 47.16;
  City.Longitude := 27.58;
  Check(ACities.FindAndAddIfNotExisting(City)<0);
  Check(City.Name='Iasi');
  Check(ACities.FindAndFill(City)>=0);

Note the new FindAndFill / FindAndDelete / FindAndUpdate / FindAndAddIfNotExisting methods... you have some kind of enhanced dictionary here! Since you can specify the record, you can have much more powerful commands than just a key/value pair.

And we can sort the records by name in just one line:

  ACities.Sort;
  for i := 1 to high(Cities) do
    Check(Cities[i].Name>Cities[i-1].Name);

Nice and easy, isn't it?
I even prefer this new TDynArray wrapper to the existing generic feature of Delphi XE...
smile

Offline

#6 2011-03-18 17:37:36

JonG
New member
Registered: 2010-11-12
Posts: 7

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

I love the new TDynArray functionality!

Will TDynArray handle records inside of records?

type
  TName = Record
    First : String;
    Last : String;
    end;
 
   TAddress = Record
     Street : String;
     Suite : String;
     City : String;
     State : String;
     Zip : String;
     end;
   
   TPerson = Record
     Name : TName;
     Address : TAddress;
     end;
   TPersonDynArray = array of TPerson;

Offline

#7 2011-03-19 08:00:08

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 6,017
Website

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

It's stated in the documentation (i.e. source code comments) that it won't fully handle records inside records, only plain dynamic array of records.

To be more precise, it will handle record within record:
- for most common Add/Delete/Insert/Sort/FindAnd* methods - for any kind of child records;
- for more advanced methods (like streaming or IndexOf) only if the child records is pure binary (no reference counted fields inside, like strings).

Edit: it can now handle nested records - see http://synopse.info/forum/viewtopic.php?pid=1810#p1810

Offline

#8 2011-04-13 01:34:56

richard6688
Member
Registered: 2011-04-05
Posts: 31

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

I must say TDynArray is great to process array. Currently I use it to packed data into RawByteString for TBigTableRecord storage.
I found a bug maybe, when pack data string large near 64K block,  the  error occured.
Test code:

var
I, E: integer;
R: RawByteString;
D: TDynArray;
IA: TIntegerDynArray;
begin
  D.Init(TypeInfo(TIntegerDynArray), IA);
  for I := 0 to 16256 do
  begin
    E:= I*5;
    D.Add(E);
  end;
  R:=D.SaveTo;
  Memo1.Lines.Add(R);

when Execte, the error found on function

function TDynArray.SaveTo: RawByteString;
var Len: integer;
begin
  Len := SaveToLength;
  SetString(result,nil,Len);
  if Len<>0 then
    if SaveTo(pointer(result))-pointer(result)<>Len then
      Assert(false);
end;

the pointer Calculate seems not right on block large or near 64k.

Offline

#9 2011-04-13 05:52:32

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 6,017
Website

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

You were right: there was an issue in ToVarUInt32Length/ToVarUInt32LengthWithData.

I just fixed it, and add all associated regression tests.
See http://synopse.info/fossil/info/0a7457ef69

Sorry for the issue.
Thanks for the report and for supplying the code to reproduce!

Offline

#10 2011-04-13 11:30:17

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 6,017
Website

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

JonG wrote:

I love the new TDynArray functionality!

Will TDynArray handle records inside of records?

I just modified the source code so that it could be able to handle records inside records, and even dynamic arrays inside records, and records inside dynamic arrays of records inside dynamic arrays, and so on... smile

Offline

#11 2011-04-24 09:42:37

Starkis
Member
From: Up in the space
Registered: 2011-01-16
Posts: 25

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

Can the TDynArray be summarized as the generic List< T > implementation for every type of T? I mean it is equially efficient as List< integer >, List< string >, List< MyCustomRecord > and List< TMyCustomClass > implementations for D6+ or there are some guides to follow for particular T type?

Lack of the generics for D7 (and static class attributes) is a very serious drawback in my eyes after C++/C#


--- we no need no water, let the ... burn ---

Offline

#12 2011-04-24 19:00:46

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 6,017
Website

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

Starkis wrote:

Can the TDynArray be summarized as the generic List< T > implementation for every type of T? I mean it is equially efficient as List< integer >, List< string >, List< MyCustomRecord > and List< TMyCustomClass > implementations for D6+ or there are some guides to follow for particular T type?

IMHO you can do that.

There are even some facts that make it even better than TList<T>: for instance, it's reference counted, and this wrapper allow content serialization.

I wrote a whole dedicated part in the updated documentation of the framework about that (to be released in the 1.13 version, and already available in the Source code repository as .pro source file and from http://synopse.info/files/synproject/sampledoc.zip ).

Offline

#13 2011-07-28 14:58:00

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 6,017
Website

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

Note that you have a TDynArrayHashed wrapper, which will use hashing (via a custom hash function) for faster search.

The TDynArrayHashed wrapper was developed to be fast.

Here is some sample code (from supplied unitary tests):

var ACities: TDynArrayHashed;
    Cities: TCityDynArray;
    CitiesCount: integer;
    City: TCity;
    added: boolean;
    N: string;
    i,j: integer;
const CITIES_MAX=200000;
begin
  // valide generic-like features
  // see http://docwiki.embarcadero.com/CodeExamples/en/Generics_Collections_TDictionary_(Delphi)
  ACities.Init(TypeInfo(TCityDynArray),Cities,nil,nil,nil,@CitiesCount);
  (...)
  Check(ACities.FindHashed(City)>=0);
  for i := 1 to 2000 do begin
    City.Name := IntToStr(i);
    City.Latitude := i*3.14;
    City.Longitude := i*6.13;
    Check(ACities.FindHashedAndUpdate(City,true)=i+2,'multiple ReHash');
    Check(ACities.FindHashed(City)=i+2);
  end;
  ACities.Capacity := CITIES_MAX+3; // make it as fast as possible
  for i := 2001 to CITIES_MAX do begin
    City.Name := IntToStr(i);
    City.Latitude := i*3.14;
    City.Longitude := i*6.13;
    Check(ACities.FindHashedAndUpdate(City,true)=i+2,'use Capacity: no ReHash');
    Check(ACities.FindHashed(City.Name)=i+2);
  end;
  for i := 1 to CITIES_MAX do begin
    N := IntToStr(i);
    j := ACities.FindHashed(N);
    Check(j=i+2,'hashing with string not City.Name');
    Check(Cities[j].Name=N);
    CheckSame(Cities[j].Latitude,i*3.14);
    CheckSame(Cities[j].Longitude,i*6.13);
  end;
end;

Another code for a question in StackOverflow:

type
  TMyMap = record
    Key: string;
    Value: array of string;
  end;
  TMyMapDynArray = array of TMyMap;

var
  Map: TMyMap;
  Maps: TMyMapDynArray;
  MapW: TDynArrayHashed;
  key: string;
  i: integer;
begin
  MapW.Init(TypeInfo(TMyMapDynArray),Maps);
  Map.Key := 'Some key';
  SetLength(Map.Value,2);
  Map.Value[0] := 'One';
  Map.Value[1] := 'Two';
  MapW.FindHashedAndUpdate(Map,true); // ,true for adding the Map content
  key := 'Some key';
  i := MapW.FindHashed(key);
  // now i=0 and Maps[i].Key=key
  for i := 0 to MapW.Count-1 do // or  for i := 0 to high(Maps) do
    with Maps[i] do
    // now you're enumerating all key/value pairs
end;

See http://stackoverflow.com/questions/5536 … ap#5536647

Offline

#14 2012-01-04 04:02:15

richard6688
Member
Registered: 2011-04-05
Posts: 31

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

I use TSynBigTableString to store record as RawByteString by RecordLoad/RecordSave. When I try to search the data, I found Getpointer is very fast, but RecordLoad is slow when record is complicated and associated big dynArray. I know that large data  to be moved in memory.  Is it possible to use pointer here. If it is possible, could you explain the format of RecordSave here, which will save a lot of time for me to dig the code?

Last edited by richard6688 (2012-01-04 04:03:16)

Offline

#15 2012-01-04 09:30:52

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 6,017
Website

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

RecordLoad/RecordSave serialization just maps the binary representation of the dynamic array elements.

RecordLoad can be slow since it allocates all data. It is necessary to get a valid dynamic array.

You can search into the dynamic array content, if you know the exact kind of data stored within.
You can map your record using a pointer, but it won't work for dynamic arrays or strings directly. You'll have to parse the dynamic array by hand. It is for instance what SimpleDynArrayLoadFrom(), IntegerDynArrayLoadFrom() and RawUTF8DynArrayLoadFromContains() functions do.

I do not have any detailed information of the serialization itself, except the source code of RecordLoad/RecordSave, which is easy to read, and commented.
In short:
- Non referenced records (boolean, byte, integer, double...) are just stored as binary;
- strings are stored with a VarUInt32 length first, ending with no #0;
- dynamic arrays are stored as with TDynArray.LoadFrom, i.e. stored as one huge binary block for non reference items, but with inlined VarUInt32 length for strings and inlined RecordLoad for records containing reference records - dynamic array serialization blocks also includes a hash;
- nested non reference records (only containing boolean, byte...) are stored as binary;
- nested records containing references are recursively loaded with RecordLoad.

You can unserialize your record pattern, but it will be highly depended of the record structure.
A more generic search function may be written.
I could be able to implement a function able to provide a class able to unserialize a record or a dynamic array on the fly, and calling internal methods for each kind of data. With parameters mapping the nested level of recursion. With "classic" level of RTTI, we can't get each property name. With Delphi 2010+ enhanced RTTI, it may be possible, but we may loose compatibility with older Delphi and FPC.

In all cases, it sounds mandatory to let your non reference members of the record grouped in the first part of the record, then let all reference records after this fixed part. It will help return a pointer to these binary content.

Offline

#16 2012-01-04 11:46:26

richard6688
Member
Registered: 2011-04-05
Posts: 31

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

Thank you very much.
I will give the try, because if I load all data into memory in advance, it's search speed is lightning, faster than any other approch I can implement.
Anyway I am not going to write a general one, it's maybe easier.

Thank you again for you advice.

Last edited by richard6688 (2012-01-04 11:52:56)

Offline

#17 2012-01-04 12:19:59

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 6,017
Website

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

About search speed, you may also consider using some metadata and indexes for most used fields.

You can keep your main data stored as record as RawByteString via RecordLoad/RecordSave.
But use TSynBigTableMetaData instead of TSynBigTableString to add some meta data fields.

This meta data fields will be loaded and kept in memory, and can contain indexes.
If you want to search for some key fields, it would be the faster approach. Some data will be duplicated from the main records into the metadata fields, but search will be immediate.
And for more precise search, you can use the record parsing of all data.

Offline

#18 2012-01-05 01:01:58

richard6688
Member
Registered: 2011-04-05
Posts: 31

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

In detail, I save the trie in the data field, that's why I try to search in the field. Of course, it's may be change into normal way, but the speed may vary.
I need more test.
Thanks for your suggestion.

Offline

#19 2012-04-12 15:30:40

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 6,017
Website

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

I've added TTextWriter.RegisterCustomJSONSerializer() method to allow JSON serialization of any dynamic array content (used by TDynArray.LoadFromJSON and TTextWriter.AddDynArrayJSON).

So dynamic arrays can now serialize valid JSON content, even for custom types.
Of course, unregistered dynamic array types will be serialized as binary + Base64 encoding, just as before.
But true JSON serialization make our TDynArray wrapper much more AJAX friendly.

See http://synopse.info/fossil/info/36d0fcd3c7 and http://synopse.info/forum/viewtopic.php?pid=4081#p4081 for the feature proposal from mpv.

Documentation has been updated to reflect this enhancement.
See http://blog.synopse.info/post/2012/04/1 … ay-content for the associated blog entry.

Offline

#20 2012-08-15 16:27:22

mpv
Member
From: Ukraine
Registered: 2012-03-24
Posts: 280

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

AB, please, look on this code.
If I use record for DynArray what not contain string, I got VERY unexpected results for me...   I can't use FindHashed and so on for first record field in this case - see comments in source below.
If it's by design - please make remark in documentation.... But I will be very happy if it's a bug........ I'm on Delphi XE2.

type
  TBinRec = packed record
    intVal: integer;
    counter: cardinal;
  end;
  TBinArr = array of TBinRec;

  TStrRec = record
    intVal: integer;
    strVal: string;
  end;
  TStrArr = array of TStrRec;
procedure TfrmHTTPClient.Button2Click(Sender: TObject);
var
  i: integer;
  StrArr: TStrArr; se: TStrRec;
  BinArr: TBinArr; be: TBinRec;
  BinDA, StrDA: TDynArrayHashed;
begin
  with BinDA do begin
    Init(TypeInfo(TBinRec), BInArr, nil, nil, nil, nil, False);
    with be do begin intVal := 100; counter := 1; end;
    Add(be);  ReHash;
    i := 100;
    i := FindHashed(i); //Unexpected result(Access violation or -1),
                        //cause internay DynArray.ElemType = nil and there is no compare function for first record element
  end;
  with StrDA do begin
    Init(TypeInfo(TStrArr), StrArr, nil, nil, nil, nil, False); //DynArray.ElemType is OK
    with se do begin intVal := 100; strVAl := '1'; end;
    Add(se); ReHash;
    i := 100;
    i := FindHashed(i); // all ok i = 0.
  end;
end;

the problem is here

procedure TDynArray.Init(aTypeInfo: pointer; var aValue; aCountPointer: PInteger=nil);
var Typ: PDynArrayTypeInfo absolute aTypeInfo;
begin
  TypeInfo := aTypeInfo;
  Value := @aValue;
  if Typ^.Kind<>tkDynArray then
    raise ESynException.CreateFmt('%s is not a dynamic array',[PShortString(@Typ^.NameLen)^]);
  inc(PtrUInt(Typ),Typ^.NameLen);
  {$ifdef FPC}Typ := AlignToPtr(Typ);{$endif}
  with Typ^ do begin
    ElemSize := elSize;
    if elType=nil then  //!!!!!!!!!!!!!!  elType = nil for TBinArr  
      ElemType := nil else

Hmm. and other issue/ This is e real record (I implement locking using some of your classes )

this not work with dyn array

TObjectLock = record
    Essence: TObject;
    Locks: TSQLLocks;
end;
Tm3EssenceLocks = array of Tm3EssenceLock;

this not work too - ElemType not nil but Compare function not defined

TObjectLock = record
    Essence: TObject;
    Locks: TSQLLocks;
    s: string;
end;

this work as expected

TObjectLock = record
    Essence: TObject;
    s: string;
    Locks: TSQLLocks;
end;

Seems like low level RTTI structure not in format your code expect

Last edited by mpv (2012-08-15 16:44:43)

Offline

#21 2012-08-15 20:26:12

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 6,017
Website

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

You need to specify a compare function and a custom hash function for such kind of dynamic array.

That is, both aHashElement: TDynArrayHashOne and aCompare: TDynArraySortCompare parameters at .init() call should not be left as nil.

Offline

#22 2012-08-16 07:26:49

mpv
Member
From: Ukraine
Registered: 2012-03-24
Posts: 280

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

AB, I understand what if I provide aHashElement and aCompare to Init call everything will be OK. But WHEN I must provide this params and when DynArray define it automaticly I don't understand. So, I catch very-very bad bug. For example I define record

TAmount = record
    firmID: integer;
    amount: string;
end;
TAmountCollection = array of TAmount;
AmounCollection: TAmountCollection;
AmountDA: TDynArrayHashed;
AmountDA.Init(TypeInfo(TAmountCollection), AmounCollection, nil, nil, nil, nil, False);

and 

function getAmount(firmID) 
  idx := AmountDA.FindHashed(firmID); if idx >-1 then result := arr[idx].amount else result := '0';

Everything is work fine and I'm happy.
After some time I decide to change record defenition

TAmount = record
    firmID: integer;
    amount: integer;
end;

and

function getAmount(firmID) 
  idx := AmountDA.FindHashed(firmID); if idx >-1 then result := arr[idx].amount else result := 0;

My code become unworked. And no exceptions, no warning. Very-very big problem when debug such kind of code......
This is a simple example. In real life record is more complex and find such exception is a hell work.
So I must provide aHashElement and aCompare for ANY array of record element to not catch bug in future?
I think in your code you have risk of this exceptions too...

Last edited by mpv (2012-08-16 07:29:55)

Offline

#23 2012-08-16 14:35:19

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 6,017
Website

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

In both cases, the default record elements are not the same.

Your 1st kind of record will by default search for firmID only: i.e. for the binary before the string element in the record.
TDynArrayHashed.Init will set:
- fHashElement = nil so HashOne() will hash whole starting binary content of record, i.e. until Offset=4 in line "result := fHasher(0,@Elem,Offset)"
- fCompare = SortDynArrayCardinal

Your 2nd kind of record will by default search for the WHOLE record binary content: i.e. both firmID + amount values.
So AmountDA.FindHashed(firmID) will never be >=0 (since amount will never match the random value on stack), and you may have random access violation because you are searching for firmID and you'll read 8 bytes from its buffer.
Therefore, FindHashed() shall use PLAIN record, i.e. a TAmount in your case.

IMHO there is no bug, just you have to specify the extent of the search, otherwise it will search for the WHOLE record binary content, i.e. 8 bytes.
There is no way (from basic RTTI) to know that you are searching for the first field here: it will search for the WHOLE 8 bytes.

You should have coded:

var A: Amount;
...
A.firmID := aSearchedFirmID;
A.Amount := 100; // searched amount
idx := AmountDA.FindHashed(A);
...

But in your case, it is not what you search for.

I've added a new initialization method for TDynArrayHashed, named InitSpecific().
You will be able to specify a kind of hashing/comparison field directly:

AmountDA.InitSpecific(TypeInfo(TAmountCollection), AmounCollection, djInteger);

Then it will only hash and test for the firmID integer field.

See http://synopse.info/fossil/info/31d8d92795
Some specific regression tests have been included (see this commit).

The TDynArray wrapper has now its InitSpecific() new method.

Offline

#24 2012-08-17 08:30:36

mpv
Member
From: Ukraine
Registered: 2012-03-24
Posts: 280

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

Yes!! This is the thing I need. Now my array not depend of possible record modification (except first record field - but its understandable and expected).  Many thanks!!!!!!
Just one question:
for records

TObjectLock = record
    Essence: TObject;
    Locks: TSQLLocks;
end;
TObjectLocks = array of TObjectLock;

I now write

fEssenceLocksDA.InitSpecific(TypeInfo(TObjectLocks), fObjectLocks, djCardinal);

May be add type djPointer (or djObject) ?

And there is small issue compiling in XE2 with SynCommons
condition define {$ifndef ISDELPHI2007ANDUP} must end before
procedure FillCharX87
(not after as now)
because FillCharX87 used in initialization section of unit

Offline

#25 2012-08-17 08:47:24

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 6,017
Website

Re: SynCommons: TDynArray and Record compare/load/save using fast RTTI

Indeed.

Should be fixed by http://synopse.info/fossil/info/ee05c34780
- fixed compilation issue with Delphi >= 2007
- added djPointer and djObject TDynArrayKind aliases (not true items, just alias to the corresponding integer types, depending on the CPU model used)

Offline

Board footer

Powered by FluxBB