So I have fixed the default behavior of TRawUtf8List to match the mORMot 1 behavior.
But this is a BREAKING CHANGE: TRawUtf8List is now not thread-safe by default.
https://github.com/synopse/mORMot2/commit/cfffde92
Now TRawUtf8ListHashed behaves just as in mORMot 1, which is what is expected; the difference is what confused you.
What kind of types are you storing in the TObjectDictionary / TSynDictionary?
TSynDictionary has some more features (like binary or JSON serialization). It is also thread-safe by default: set TSynDictionary.ThreadUse := uNoLock to disable locking.
Also consider IKeyValue<> from mormot.core.collections.
Then, TSynDictionary is a bit slower than TObjectDictionary...
What performance improvements are suggested?
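To make the locking remark concrete, here is a minimal sketch of a TSynDictionary used from a single thread with locking switched off. It assumes mORMot 2, that TSynDictionary is declared in mormot.core.json, and that uNoLock is declared in mormot.core.os; the key and value types are arbitrary choices for illustration.

```pascal
program DictNoLockSketch;

{$ifdef FPC}{$mode delphi}{$endif}
{$APPTYPE CONSOLE}

uses
  mormot.core.base, // RawUtf8, TRawUtf8DynArray, TIntegerDynArray
  mormot.core.os,   // assumed location of uNoLock
  mormot.core.json; // assumed location of TSynDictionary

var
  dict: TSynDictionary;
  key: RawUtf8;
  value, found: integer;
begin
  // keys and values are described by the RTTI of their dynamic arrays
  dict := TSynDictionary.Create(
    TypeInfo(TRawUtf8DynArray), TypeInfo(TIntegerDynArray));
  try
    // single-threaded use only: skip the default locking
    dict.ThreadUse := uNoLock;
    key := 'answer';
    value := 42;
    dict.Add(key, value);
    // FindAndCopy copies the stored value into "found" on success
    if dict.FindAndCopy(key, found) then
      writeln(found);
  finally
    dict.Free;
  end;
end.
```

With uNoLock the caller is responsible for never touching the dictionary from two threads at once.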
I'm getting slow performance in mORMot 2:
2000000 TRawUTF8ListHashed.AddObject in 88.52ms (22592487/s)
TRawUTF8ListHashed.Rehash in 1us
TRawUTF8ListHashed.Clear in 15.92ms
2000000 TRawUTF8ListHashed.AddObjectIfNotExisting in 89.07ms (22452484/s)
first TRawUTF8ListHashed.IndexOf in 5us
20000 TRawUTF8ListHashed.IndexOf in 752.33ms (26584/s)
2000000 TSynDictionary.Add in 510.91ms (3914537/s)
first TSynDictionary.FindAndCopy in 23us
2000000 TSynDictionary.FindAndCopy in 450.31ms (4441384/s)
2000000 TObjectDictionary.Add in 716.41ms (2791674/s)
first TObjectDictionary.TryGetValue in 1us
2000000 TObjectDictionary.TryGetValue in 211.21ms (9469114/s)
2000000 TObjectDictionary.Add locked in 628.33ms (3183000/s)
first TObjectDictionary.TryGetValue in 1us
2000000 TObjectDictionary.TryGetValue locked in 274.98ms (7273044/s)
That makes sense.
Please check https://synopse.info/fossil/info/10f29366b5
That's it, thanks.
2000000 TRawUTF8ListHashed.AddObject in 52.74ms (37915410/s)
TRawUTF8ListHashed.Rehash in 121.10ms
TRawUTF8ListHashed.Clear in 16.97ms
2000000 TRawUTF8ListHashed.AddObjectIfNotExisting in 479.96ms (4166987/s)
first TRawUTF8ListHashed.IndexOf in 0us
2000000 TRawUTF8ListHashed.IndexOf in 127.96ms (15629517/s)
2000000 TRawUTF8ListHashedLocked.AddObject in 91.95ms (21750951/s)
TRawUTF8ListHashedLocked.Rehash in 122.47ms
TRawUTF8ListHashedLocked.Clear in 17.15ms
2000000 TRawUTF8ListHashedLocked.AddObjectIfNotExisting in 544.96ms (3669973/s)
first TRawUTF8ListHashedLocked.LockedGetObjectByName in 0us
2000000 TRawUTF8ListHashedLocked.LockedGetObjectByName in 294.85ms (6783041/s)
2000000 TSynDictionary.Add in 352.70ms (5670493/s)
first TSynDictionary.FindAndCopy in 3us
2000000 TSynDictionary.FindAndCopy in 256.23ms (7805365/s)
2000000 TObjectDictionary.Add in 681.99ms (2932555/s)
first TObjectDictionary.TryGetValue in 0us
2000000 TObjectDictionary.TryGetValue in 251.48ms (7952918/s)
2000000 TObjectDictionary.Add locked in 769.16ms (2600235/s)
first TObjectDictionary.TryGetValue in 0us
2000000 TObjectDictionary.TryGetValue locked in 352.53ms (5673276/s)
It sounds like TSynDictionary, as a general-purpose thread-safe hashed dictionary, has the best performance.
It is about two times faster than TObjectDictionary when adding items, and also about 30% faster when searching the content, when compared with the locked TObjectDictionary (the unlocked lookup times are about equal).
It also has unique features, like binary or JSON serialization, search within nested arrays, and an optional built-in "timeout" feature (very convenient if you want to implement an in-memory cache of values).
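As an illustration of the timeout and serialization features, the following sketch creates a small in-memory cache whose entries expire. It assumes mORMot 2, where the fourth constructor parameter is taken to be the timeout in seconds and SaveToJson returns the content as JSON; the 60-second value and the sample key are arbitrary.

```pascal
program DictCacheSketch;

{$ifdef FPC}{$mode delphi}{$endif}
{$APPTYPE CONSOLE}

uses
  mormot.core.base, // RawUtf8, TRawUtf8DynArray
  mormot.core.json; // assumed location of TSynDictionary

var
  cache: TSynDictionary;
  key, value, found: RawUtf8;
begin
  // case-sensitive keys, entries assumed to expire after 60 seconds
  cache := TSynDictionary.Create(
    TypeInfo(TRawUtf8DynArray), TypeInfo(TRawUtf8DynArray), false, 60);
  try
    key := 'user:1';
    value := 'Alice';
    cache.Add(key, value);
    // a lookup within the timeout window copies the cached value
    if cache.FindAndCopy(key, found) then
      writeln(found);
    // dump the whole dictionary as JSON
    writeln(cache.SaveToJson);
  finally
    cache.Free;
  end;
end.
```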
The numbers for Delphi 7 (with our enhanced RTL) are even better:
2000000 TRawUTF8ListHashed.AddObject in 29.30ms (68259385/s)
TRawUTF8ListHashed.Rehash in 119.10ms
TRawUTF8ListHashed.Clear in 8.27ms
2000000 TRawUTF8ListHashed.AddObjectIfNotExisting in 458.42ms (4362801/s)
first TRawUTF8ListHashed.IndexOf in 1us
2000000 TRawUTF8ListHashed.IndexOf in 125.03ms (15995009/s)
2000000 TRawUTF8ListHashedLocked.AddObject in 78.51ms (25473488/s)
TRawUTF8ListHashedLocked.Rehash in 120.36ms
TRawUTF8ListHashedLocked.Clear in 7.31ms
2000000 TRawUTF8ListHashedLocked.AddObjectIfNotExisting in 524.67ms (3811898/s)
first TRawUTF8ListHashedLocked.LockedGetObjectByName in 0us
2000000 TRawUTF8ListHashedLocked.LockedGetObjectByName in 298.66ms (6696376/s)
2000000 TSynDictionary.Add in 339.58ms (5889506/s)
first TSynDictionary.FindAndCopy in 0us
2000000 TSynDictionary.FindAndCopy in 255.15ms (7838403/s)
Please check https://synopse.info/fossil/info/10f29366b5
Why not just call Hash.Rehash?
That does not help, since TRawUTF8ListHashed.Hash.Rehash does not change TRawUTF8ListHashed.fChanged, which remains "true".
When IndexOf is called for the first time, rehashing still executes, even if we previously called TRawUTF8ListHashed.Hash.Rehash.
Maybe it would be better to move fChanged to TDynArrayHashed, with proper handling.
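The role of such a "changed" flag can be shown with a small self-contained sketch. This is not the mORMot code: TToyHashedList and all its members are hypothetical names, using only the RTL. Add just appends and marks the hash table as stale, IndexOf rehashes lazily when the flag is set, and an explicit ReHash clears the flag so that the first IndexOf does not rehash a second time.

```pascal
program LazyRehashSketch;

{$ifdef FPC}{$mode delphi}{$endif}
{$APPTYPE CONSOLE}
{$Q-}{$R-} // the hash below relies on 32-bit wrap-around

uses
  SysUtils;

type
  // hypothetical toy list, NOT the mORMot implementation
  TToyHashedList = class
  private
    fItems: array of string;
    fCount: integer;
    fSlots: array of integer; // open addressing: item index + 1, 0 = empty
    fChanged: boolean;        // true when fSlots is out of date
    fRehashCount: integer;
    function HashOf(const s: string): cardinal;
  public
    procedure Add(const s: string);
    procedure ReHash;
    function IndexOf(const s: string): integer;
    property RehashCount: integer read fRehashCount;
  end;

function TToyHashedList.HashOf(const s: string): cardinal;
var
  i: integer;
begin
  result := 2166136261; // FNV-1a style mixing
  for i := 1 to length(s) do
    result := (result xor cardinal(ord(s[i]))) * 16777619;
end;

procedure TToyHashedList.Add(const s: string);
begin
  if fCount = length(fItems) then
    SetLength(fItems, fCount * 2 + 16);
  fItems[fCount] := s;
  inc(fCount);
  fChanged := true; // fast append: hashing is deferred
end;

procedure TToyHashedList.ReHash;
var
  i, slot, size: integer;
begin
  size := fCount * 2 + 1; // load factor below 0.5
  fSlots := nil;
  SetLength(fSlots, size); // new slots are zero-filled
  for i := 0 to fCount - 1 do
  begin
    slot := HashOf(fItems[i]) mod cardinal(size);
    while fSlots[slot] <> 0 do
      slot := (slot + 1) mod size; // linear probing
    fSlots[slot] := i + 1;
  end;
  fChanged := false; // the explicit rehash clears the flag
  inc(fRehashCount);
end;

function TToyHashedList.IndexOf(const s: string): integer;
var
  slot, size: integer;
begin
  if fChanged then
    ReHash; // lazy rehash, only when the table is stale
  result := -1;
  size := length(fSlots);
  if size = 0 then
    exit;
  slot := HashOf(s) mod cardinal(size);
  while fSlots[slot] <> 0 do
  begin
    if fItems[fSlots[slot] - 1] = s then
    begin
      result := fSlots[slot] - 1;
      exit;
    end;
    slot := (slot + 1) mod size;
  end;
end;

var
  list: TToyHashedList;
  i: integer;
begin
  list := TToyHashedList.Create;
  try
    for i := 1 to 1000 do
      list.Add('item' + IntToStr(i));
    list.ReHash;                      // manual rehash before searching
    writeln(list.IndexOf('item500')); // prints 499
    writeln(list.RehashCount);        // prints 1: IndexOf did not rehash again
  finally
    list.Free;
  end;
end.
```

Keeping the flag next to the hash table, rather than in the outer list class, is what lets a manual rehash and the lazy one agree on whether work remains.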
Would you like to add a ReHash method directly to the TRawUTF8ListHashed class?
TRawUTF8ListHashed = class(TRawUTF8List)
  ...
public
  ...
  /// manual rehashing
  function ReHash: boolean;
  ...
end;

function TRawUTF8ListHashed.ReHash: boolean;
begin
  result := fHash.ReHash;
  // only clear the "changed" flag if the rehash succeeded
  fChanged := not result;
end;
20000000 TRawUTF8ListHashed.AddObject in 577.92ms (34606386/s)
first TRawUTF8ListHashed.IndexOf in 1.29s
20000000 TRawUTF8ListHashed.IndexOf in 1.52s (13128829/s)
20000000 TRawUTF8ListHashedLocked.AddObject in 986.40ms (20275585/s)
first TRawUTF8ListHashedLocked.LockedGetObjectByName in 1.32s
20000000 TRawUTF8ListHashedLocked.LockedGetObjectByName in 3.29s (6063919/s)
20000000 TSynDictionary.Add in 3.68s (5427181/s)
first TSynDictionary.FindAndCopy in 6us
20000000 TSynDictionary.FindAndCopy in 2.73s (7310692/s)
20000000 TObjectDictionary.Add in 6.66s (2999694/s)
first TObjectDictionary.TryGetValue in 0us
20000000 TObjectDictionary.TryGetValue in 2.81s (7115338/s)
with source code in https://pastebin.com/XEmiJ0af
Note that TSynDictionary is thread-safe, so it has some overhead compared with TRawUTF8ListHashed, and it is still faster than TObjectDictionary, which is NOT thread-safe.
If I use TRawUTF8ListHashedLocked, I'm closer to TSynDictionary numbers.
20000000 TRawUTF8ListHashed.AddObject in 1.07s (18518569/s)
first TRawUTF8ListHashed.IndexOf in 1.29s
20000000 TRawUTF8ListHashed.IndexOf in 4.01s (4982675/s)
20000000 TObjectDictionary.Add in 7.43s (2688466/s)
first TObjectDictionary.IndexOf in 0us
20000000 TObjectDictionary.IndexOf in 4.17s (4785796/s)
TObjectDictionary:
Add time = 812.96 ms
Search time = 15.74 ms
TRawUTF8ListHashed:
Add time = 194.22 ms
Search time = 6.13 ms
// Add time = 812.96 ms
// Search time = 15.74 ms
procedure TForm1.ButtonTObjectDictionaryClick(Sender: TObject);
var
  Dictionary: TObjectDictionary<String, TTest>;
  i: Integer;
  pt: TPrecisionTimer;
  Time: RawUTF8;
  TestObj: TTest;
  IdsForSearch: array of string;
begin
  // Init
  Dictionary := TObjectDictionary<String, TTest>.Create([doOwnsValues]);
  SetLength(IdsForSearch, 100000);
  for i := Low(IdsForSearch) to High(IdsForSearch) do
    IdsForSearch[i] := IntToStr(i + 1500001);
  pt.Init;
  // Add values
  pt.Start;
  for i := 0 to 2000000 do
    Dictionary.Add(i.ToString, TTest.Create(i));
  Time := pt.Stop;
  ShowMessage('add time: ' + UTF8ToString(Time));
  // Perform search 100000 elements
  pt.Start;
  for i := Low(IdsForSearch) to High(IdsForSearch) do
    Dictionary.TryGetValue(IdsForSearch[i], TestObj);
  Time := pt.Stop;
  if not Assigned(TestObj) or (IntToStr(TestObj.Id) <> IdsForSearch[High(IdsForSearch)]) then
    ShowMessage('err');
  ShowMessage('search value time: ' + UTF8ToString(Time));
  Dictionary.Free;
end;
// Add time = 194.22 ms
// Search time = 6.13 ms
procedure TForm1.ButtonTRawUTF8ListHashedClick(Sender: TObject);
var
  Dictionary: TRawUTF8ListHashed;
  i: Integer;
  pt: TPrecisionTimer;
  Time: RawUTF8;
  idx: int64;
  IdsForSearch: array of RawUTF8;
begin
  // Init
  idx := -1;
  Dictionary := TRawUTF8ListHashed.Create(True);
  SetLength(IdsForSearch, 100000);
  for i := Low(IdsForSearch) to High(IdsForSearch) do
    IdsForSearch[i] := Int32ToUtf8(i + 1500001);
  pt.Init;
  // Add values
  pt.Start;
  for i := 0 to 2000000 do
    Dictionary.AddObject(StringToUTF8(i.ToString), TTest.Create(i));
  Time := pt.Stop;
  ShowMessage('add time: ' + UTF8ToString(Time));
  // Perform manual hashing
  // Dictionary.Hash.ReHash; // no difference
  Dictionary.IndexOf(IdsForSearch[0]); // to force hash calculations
  // Perform search 100000 elements
  pt.Start;
  for i := low(IdsForSearch) to High(IdsForSearch) do
    idx := Dictionary.IndexOf(IdsForSearch[i]);
  Time := pt.Stop;
  if (idx = -1) or (Int32ToUtf8(TTest(Dictionary.GetObject(idx)).Id) <> IdsForSearch[High(IdsForSearch)]) then
    ShowMessage('err');
  ShowMessage('search value time: ' + UTF8ToString(Time));
  Dictionary.Free;
end;
procedure TForm1.FormCreate(Sender: TObject);
var
  list: TRawUTF8ListHashed;
  i, n: integer;
  timer: TPrecisionTimer;
  s: RawUTF8;
begin
  list := TRawUTF8ListHashed.Create;
  try
    timer.Start;
    n := 20000000;
    for i := 1 to n do
      list.AddObject(UInt32ToUtf8(i), pointer(i));
    s := FormatUTF8('% AddObject in % (%/s)', [n, timer.Stop, timer.PerSec(n)]);
    timer.Start;
    list.IndexOf('1000');
    s := FormatUTF8('%'#13#10'first IndexOf in %', [s, timer.Stop]);
    timer.Start;
    for i := 1 to n do
      list.IndexOf(UInt32ToUtf8(i));
    s := FormatUTF8('%'#13#10'% IndexOf in % (%/s)', [s, n, timer.Stop, timer.PerSec(n)]);
    mmo1.Lines.Text := UTF8ToString(s);
  finally
    list.Free;
  end;
end;
Under Delphi 7, I got:
20000000 AddObject in 682.80ms (29291068/s)
first IndexOf in 1.26s
20000000 IndexOf in 3.00s (6660696/s)