#1 2011-01-22 14:35:05

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 13,061

Synopse Big Table 1.12a

Synopse Big Table is an open source Delphi unit for very fast data storage and access, using key/value pairs, or records organized as fields.

With this 1.12a version, the unit has evolved into a true field-oriented database, with two new classes:
- TSynBigTableRecord to store an unlimited number of records with fields;
- TSynBigTableMetaData to store any data (pictures, HTML, text) associated with metadata fields.

Both classes handle variable-length storage of named fields holding integers, floats, currency or text (Unicode or not). They also offer on-the-fly field adding, integrated indexing and search capabilities.
Data access can be either fast direct access, or via late-binding (i.e. writing Record.Field in your Delphi code).

Classic Key/Value storage is still possible via TSynBigTable or TSynBigTableString, and is now faster and safer. A few issues were corrected.


Database creation

In order to understand how the two new classes work, we will create a new database with some fields:

var Table: TSynBigTableRecord;
    FieldText, FieldInt: TSynTableFieldProperties;
begin
  Table := TSynBigTableRecord.Create('FileName.ext','TableName');
  FieldText := Table.AddField('text',tftWinAnsi,[tfoIndex]);
  FieldInt := Table.AddField('Int',tftInt32,[tfoIndex,tfoUnique]);
  Table.AddFieldUpdate;

The database will be stored in the FileName.ext file, and will internally use TableName as its table name (this table name will be used later, inside our main framework, which will use those classes via SQL - you don't need to care about this for now).
Its first field will be named TEXT, and will contain some ANSI text (we don't need true Unicode here, and WinAnsi will save some disk space). It will use an index.
The second field is named INT, and will contain a 32-bit integer value. It will also have an index, and during record creation every value will be checked for uniqueness.

Of course, if the file already exists, the AddField calls won't do anything: the field layout is stored in the file, so the fields won't be created each time - only if needed.
The AddFieldUpdate method must be called last: if some fields were just added to content already on file, this method will process the data in order to prepare the storage of these new fields.
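
To make the open-or-create behaviour described above concrete, here is a minimal sketch of a complete console program. It only reuses the calls shown in this post, exactly as written there; the unit names in the uses clause and the need for an explicit Table.Free are assumptions on my side, so check them against your copy of the source.

```pascal
program OpenOrCreate;

{$APPTYPE CONSOLE}

uses
  SynCommons,   // assumption: declares TSynTableFieldProperties, tftWinAnsi, tfoIndex...
  SynBigTable;  // assumption: declares TSynBigTableRecord

var
  Table: TSynBigTableRecord;
  FieldText, FieldInt: TSynTableFieldProperties;
begin
  // first run: the file does not exist yet, so both fields are created;
  // later runs: the layout is read back from the file and nothing is added
  Table := TSynBigTableRecord.Create('FileName.ext','TableName');
  try
    FieldText := Table.AddField('text',tftWinAnsi,[tfoIndex]);
    FieldInt := Table.AddField('Int',tftInt32,[tfoIndex,tfoUnique]);
    // always called last: prepares the storage if a field was just added
    // to data already present in the file
    Table.AddFieldUpdate;
    // ... work with Table, FieldText and FieldInt here ...
  finally
    Table.Free; // assumption: the table is a regular class instance to release
  end;
end.
```

Running the same program twice should therefore be safe: the second run is expected to open the existing layout instead of creating the fields again.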

It's worth noting that the storage layout on disk will follow a "performance" order: fixed-size fields will be placed first in every record, and indexed fields will also come first. The layout on disk won't follow the order in which the AddField method was called. You can even store integer values as variable-length. It will be a bit slower, but it could save a lot of disk space.
In all cases, just know that it was designed to be fast, and to use as little disk space as possible.


Fields and records handling

OK. We have a database with fields.
But how do we handle the data?

There are several ways of handling fields: via direct access or via late-binding.

Direct access will use a TSynTableData record type to store the data of a table record:

var rec: TSynTableData;
    aID: integer;

  rec.Init(Table.Table);
  rec.Field['TEXT'] := 'Some text';
  rec.SetFieldValue(FieldInt,12345);
  aID := Table.RecordAdd(rec);
  if aID=0 then
    ShowMessage('Error adding record');

The above code will initialize the local rec instance to work with the Table field layout, via the Table.Table property.
Note that the rec instance is an object allocated on the stack: you don't have to call any rec.Free or add any try..finally block.
Then a value is set to the TEXT field. The rec.Field[fieldname] property can be read or set with any variant value.
The INT field is accessed directly, via the SetFieldValue method (which is faster than the Field property, because there is no need to search for the field name).
Then the record content is added to the database.

Late-binding makes use of a custom variant type:

var vari: Variant;

  vari := Table.VariantVoid;
  vari.text := 'Some text';
  vari.int := 12345;
  if Table.VariantAdd(vari)=0 then
    ShowMessage('Error adding record');

The above code will initialize the local vari instance with a custom variant type "knowing" the Table field layout, via the Table.VariantVoid property.
Then, you can access the record properties just by using their name. The custom variant type will retrieve the TSynTableFieldProperties field by late-binding (i.e. during execution). An exception will be raised if the field name is wrong.
Of course, this has a cost: using this variant type will be slower than direct TSynTableData record access (and the fastest is the TSynTableData.SetFieldSBFValue method, because it won't use any variant).

You can retrieve a record field content by using one of the two types:

rec := Table.RecordGet(aID);
assert(rec.ID=aID);
assert(rec.GetFieldValue(FieldText)='Some text');
vari := Table.VariantGet(aID);
assert(vari.ID=aID);
assert(vari.Text='Some text');

Some dedicated methods are of course available to update or delete some records.


Search opportunities

Both classes offer advanced search features.
They can quickly iterate through all records looking for a value, or use an internal index for immediate retrieval:

var IDs: TIntegerDynArray;
    Count: integer;

  assert(Table.Search(FieldText,'Some text',IDs,Count));
  assert(Count=1);
  assert(IDs[0]=aID);

As shown in the above code, you can search for records matching a specified field value. If an index was created for the field (you can also add an index to any existing field later), the search will use it, and will be immediate.
The Search method returns its results in an array of integers, containing all matching IDs.
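
As an illustration of how the returned IDs can be used, here is a small sketch that walks through the matches and reads each record back. It combines only the Search, RecordGet and GetFieldValue calls shown in this post; the ShowMatches name, the unit names and the explicit reset of IDs and Count before the call are my own assumptions, not something taken from the unit's documentation.

```pascal
unit SearchSketch;

interface

uses
  SynCommons,   // assumption: declares TIntegerDynArray, TSynTableData...
  SynBigTable;  // assumption: declares TSynBigTableRecord

// hypothetical helper: lists the INT value of every record whose TEXT
// field is 'Some text'
procedure ShowMatches(Table: TSynBigTableRecord;
  FieldText, FieldInt: TSynTableFieldProperties);

implementation

uses
  Variants; // VarToStr

procedure ShowMatches(Table: TSynBigTableRecord;
  FieldText, FieldInt: TSynTableFieldProperties);
var IDs: TIntegerDynArray;
    Count, i: integer;
    rec: TSynTableData;
begin
  IDs := nil;   // start from an empty result, in case Search appends
  Count := 0;
  if not Table.Search(FieldText,'Some text',IDs,Count) then begin
    writeln('No matching record');
    exit;
  end;
  // only the first Count entries are read: the array itself is assumed
  // to be possibly longer than the number of matches
  for i := 0 to Count-1 do begin
    rec := Table.RecordGet(IDs[i]);
    writeln('ID=',IDs[i],' INT=',VarToStr(rec.GetFieldValue(FieldInt)));
  end;
end;

end.
```

Looping up to Count-1 rather than High(IDs) is a deliberate choice here, since the method reports the number of matches separately from the array.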


Benchmarks

Speed, together with moderate disk space usage, is a major goal of this unit.
Thanks to its unique design, I think you have at hand the fastest database engine for Delphi. Much faster than any SQL engine around, in all cases.

Creating 1,000,000 records with some text and an integer value, both fields using an index, and the integer field set as unique, takes less than 880 ms on my laptop.
Reading all 1,000,000 records and checking both field values takes 220 ms in direct access, 360 ms using TSynTableData, and 1560 ms using late-binding (i.e. using a variant type).
Writing the content to file takes about 70 ms. Opening a file takes 30 ms. Adding a field then recreating the file layout takes about 480 ms.
Searching 50 text values by iterating takes 1970 ms; searching 200 text values using an index takes only 0.3 ms.
Searching 50 integer values by iterating takes 1660 ms; searching 200 integer values using an index takes only 0.1 ms.
File size is only 19 MB, including all data, indexes, and field layout.

Here is the full content of this benchmark, on my Core i7 laptop:

Testing Synopse Big Table classes...


Create 1,000,000 records data in memory  56.0 ms

Create a TSynBigTableMetaData database  0.3 ms
Add 1,000,000 records  889.8 ms
Try to add 125000 records with not unique field  81.0 ms
Read as variant  1540.3 ms
Read as TSynTableData  363.0 ms
Read direct  204.4 ms
Search 50 Text iterating  1367.8 ms
Search 200 Text using index  604.1 ms
Search 50 Int iterating  971.8 ms
Search 200  Int using index  0.1 ms
UpdateToFile  72.5 ms
Close  20.1 ms
Open  53.0 ms
Read as variant  1519.6 ms
Read as TSynTableData  357.5 ms
Read direct  216.8 ms
Search 50 Text iterating  1390.5 ms
Search 200 Text using index  0.3 ms
Search 50 Int iterating  980.5 ms
Search 200  Int using index  0.2 ms
Add a field  0.0 ms
Recreate file with new field layout  93.8 ms
Read as variant  1497.7 ms
Read as TSynTableData  356.4 ms
Read direct  196.1 ms
Search 50 Text iterating  1348.7 ms
Search 200 Text using index  0.3 ms
Search 50 Int iterating  994.5 ms
Search 200  Int using index  0.1 ms
Values: 7.6 MB, file size: 25.7 MB  62.2 ms

Create a TSynBigTableRecord database  0.4 ms
Add 1,000,000 records  875.9 ms
Try to add 125000 records with not unique field  81.7 ms
Read as variant  1562.3 ms
Read as TSynTableData  359.4 ms
Read direct  224.6 ms
Search 50 Text iterating  1970.3 ms
Search 200 Text using index  584.5 ms
Search 50 Int iterating  1665.7 ms
Search 200  Int using index  0.1 ms
UpdateToFile  73.4 ms
Close  2.2 ms
Open  29.2 ms
Read as variant  1593.0 ms
Read as TSynTableData  389.5 ms
Read direct  239.6 ms
Search 50 Text iterating  2279.1 ms
Search 200 Text using index  0.2 ms
Search 50 Int iterating  1948.0 ms
Search 200  Int using index  0.1 ms
Add a field  0.0 ms
Recreate file with new field layout  478.5 ms
Read as variant  1607.7 ms
Read as TSynTableData  372.3 ms
Read direct  230.4 ms
Search 50 Text iterating  2281.5 ms
Search 200 Text using index  0.3 ms
Search 50 Int iterating  1952.0 ms
Search 200  Int using index  0.2 ms
Values: 13.3 MB, file size: 19.0 MB  7.0 ms

Store 1,000,000 records of 8 chars key / 8 chars values  332.9 ms
Read  81.3 ms
UpdateToFile  61.1 ms
Close  9.8 ms
Open  44.1 ms
Verify  137.3 ms
Iteration test in physical order  174.0 ms
Iteration test in ID order  174.9 ms
Iteration speed  23.7 ms
ID[] speed  4.2 ms
GetAllIDs physical order  26.7 ms
GetAllIDs ID order  23.3 ms
GetAllIDs faster order  1.2 ms
GetAllPhysicalIndexes  1.2 ms
Values: 7.6 MB, file size: 18.1 MB  20.8 ms

Creating a TSynBigTable with 3450 elements  5.0 ms
Verify  2.8 ms
Close and Open  6.4 ms
Verify Random  4.8 ms
Verify  4.9 ms
Adding 1000 elements  1.2 ms
Updating 1000 elements  1.2 ms
Deleting 107 elements  0.1 ms
Iteration test  4.2 ms
Packing  7.2 ms
Verify  7.4 ms
Updating 42 elements  0.2 ms
Verify  7.3 ms
AsStream  3.6 ms
Iteration test ID order  5.1 ms
Iteration test physical order  5.0 ms
Iteration speed  0.1 ms
ID[] speed  67.5 ms
GetAllIDs physical order  0.1 ms
GetAllIDs ID order  0.1 ms
GetAllPhysicalIndexes  0.0 ms
Close and Open  1.8 ms
Verify  6.7 ms
Deleting 19 elements  0.0 ms
Iteration test  4.9 ms
Close  1.4 ms
Open  0.1 ms
Verify  6.3 ms
Updating 71 elements  0.3 ms
Verify  5.7 ms
AsStream  3.7 ms
Iteration test ID order  4.9 ms
Iteration test physical order  4.9 ms
Iteration speed  0.1 ms
ID[] speed  68.0 ms
GetAllIDs physical order  0.1 ms
GetAllIDs ID order  0.1 ms
GetAllPhysicalIndexes  0.0 ms
Packing  6.7 ms
Iteration test physical order  6.4 ms
Iteration test ID order  6.5 ms
Iteration speed  0.1 ms
ID[] speed  0.0 ms
Values: 8.3 MB, file size: 8.3 MB  1.4 ms

Creating a GUID-indexed TSynBigTableString with 50 * 4MB elements  145.3 ms
Verify  69.1 ms
Close  117.6 ms
Open  0.1 ms
Verify  212.9 ms
Delete  0.0 ms
Close and Open  17.8 ms
Pack  1817.0 ms
Add one & UpdateToFile  11.1 ms
Verify  203.8 ms
Clear  42.5 ms
Values: 204.0 MB, file size: 204.0 MB

Creating a string-indexed TSynBigTableString with 3450 elements  10.4 ms
Verify  9.2 ms
Close and Open  9.8 ms
Verify  13.2 ms
Verify Random  13.0 ms
Adding 1000 elements  3.1 ms
Verify  13.0 ms
Deleting 40 elements  0.2 ms
Iteration test  4.8 ms
Packing  7.4 ms
Verify  13.0 ms
Updating 61 elements  0.3 ms
Verify  7.2 ms
AsStream  3.6 ms
Iteration test ID order  5.0 ms
Iteration test physical order  5.9 ms
Iteration speed  0.1 ms
ID[] speed  0.0 ms
GetAllIDs physical order  0.1 ms
GetAllIDs ID order  0.2 ms
GetAllPhysicalIndexes  0.0 ms
Close and Open  2.5 ms
Deleting 15 elements  0.0 ms
Iteration test  7.1 ms
Close  1.9 ms
Open  0.3 ms
Verify  12.3 ms
Updating 36 elements  0.1 ms
Verify  5.9 ms
AsStream  3.8 ms
Iteration test ID order  6.2 ms
Iteration test physical order  5.7 ms
Iteration speed  0.1 ms
ID[] speed  0.0 ms
GetAllIDs physical order  0.1 ms
GetAllIDs ID order  0.2 ms
GetAllPhysicalIndexes  0.0 ms
Values: 8.6 MB, file size: 8.6 MB

Tests OK :)
Press [Enter] to quit

We provide a sample executable with the source code, so that you can test it on your own PC.


Get the source and make up your own mind

Available from our Source Code repository and from a zip archive.
Compiles with Delphi 6 up to Delphi XE (fully Unicode-compatible, even before Delphi 2009).
Licensed under an MPL/GPL/LGPL tri-license, ready to be embedded in any application.

#2 2011-02-02 12:25:41

ab

Re: Synopse Big Table 1.12a

If you take a look at the detailed benchmark above, you may have noticed this:

Search 50 Text iterating  1367.8 ms
Search 200 Text using index  604.1 ms
Search 50 Int iterating  971.8 ms
Search 200  Int using index  0.1 ms

Is it a bug? Is the Text index less efficient than the Integer index?

Let's look at the same timings later, after closing and reopening the file from disk:

Search 50 Text iterating  2279.1 ms
Search 200 Text using index  0.2 ms
Search 50 Int iterating  1948.0 ms
Search 200  Int using index  0.1 ms

So the difference between the two is that the first time the Text search is made, the index is created from scratch.
This took about 600 ms.
But later searches were almost immediate: 0.2 ms.

Since the Int field was marked as 'UNIQUE', its index was created and updated on the fly, while the records were being added.
Therefore, this index was ready to be used for the first search query.
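
If you want to observe this on your own data, a rough way is to time the same search twice in a row. The sketch below is only an illustration: the TimeTwoSearches and NowMs names are made up for this example, the unit names are assumptions, and the Search call is written as in the first post. Timing relies on the standard QueryPerformanceCounter API of the Windows unit.

```pascal
unit SearchTiming;

interface

uses
  SynCommons,   // assumption: declares TIntegerDynArray, TSynTableFieldProperties
  SynBigTable;  // assumption: declares TSynBigTableRecord

// hypothetical helper: runs the same search twice and prints both timings
procedure TimeTwoSearches(Table: TSynBigTableRecord;
  Field: TSynTableFieldProperties);

implementation

uses
  Windows; // QueryPerformanceCounter, QueryPerformanceFrequency

// current value of the high-resolution counter, in milliseconds
function NowMs: double;
var freq, ticks: Int64;
begin
  QueryPerformanceFrequency(freq);
  QueryPerformanceCounter(ticks);
  result := ticks*1000.0/freq;
end;

procedure TimeTwoSearches(Table: TSynBigTableRecord;
  Field: TSynTableFieldProperties);
var IDs: TIntegerDynArray;
    Count, run: integer;
    start: double;
begin
  for run := 1 to 2 do begin
    IDs := nil;   // start each run from an empty result
    Count := 0;
    start := NowMs;
    Table.Search(Field,'Some text',IDs,Count);
    writeln('Search #',run,': ',NowMs-start:0:1,' ms');
  end;
end;

end.
```

On a freshly filled table, and for a text field indexed without the unique option, the explanation above suggests that the first line would include the index creation time, while the second would be much smaller.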

You may also notice that the second set of iterating timings is up to twice as slow as the first: 1367 vs 2279 ms, 971 vs 1948 ms.
This is because the first run was iterating through all items in memory.
The second run was also iterating in memory, but via memory-mapped file access. The latter case is a bit slower, because there is some more calculation to be done... There is perhaps some room for improvement here. But it's indeed very fast, faster than any "regular" database, in all cases.

That's the explanation...
No bug nor issue, just optimization.

#3 2011-02-16 20:04:54

ab

Re: Synopse Big Table 1.12a

Version 1.12b has been published (accessible from the same link)
See http://synopse.info/files/SynBigTable.zip

About the issues corrected, see http://synopse.info/forum/viewtopic.php?pid=1368

#4 2011-07-19 07:42:12

DelphiBiker
Member
Registered: 2011-07-19
Posts: 1

Re: Synopse Big Table 1.12a

Hi,

First of all, thank you for this great site.
Your SynBigTable.zip content is in fact the SynGDIPlus package for this link:
http://synopse.info/files/SynBigTable.zip

Thank you.

#5 2011-07-19 11:04:59

ab

Re: Synopse Big Table 1.12a

DelphiBiker wrote:

Your SynBigTable.zip content is in fact the SynGDIPlus package for this link:
http://synopse.info/files/SynBigTable.zip

I uploaded the wrong file.
You can get all files (in their latest revision) from our Source Code repository.
See http://synopse.info/fossil

#6 2011-07-24 14:30:49

ab

Re: Synopse Big Table 1.12a

I've uploaded the expected file.

Available from http://synopse.info/files/SynBigTable.zip
