You are not logged in.
Pages: 1
Is there a fast implementation of TStringList like object? Based on TStringBuilder or DynArray?
There is ofcourse TRawUTF8List but its about 100ms slower than regular TStringList.
Last edited by louis_riviera (2013-10-26 21:57:31)
Offline
What did you do for testing?
It will use a LStrAsg call when setting each value, so I expect that it works at same speed level than TStringList.
Did you try with latest updated from today?
I suppose TRawUTF8ListHashed will be much faster for searching, since it uses hashing.
Offline
Yes but i think an implementation based on THeapMemoryStream would be very very fast. Because TStringList has advantages over TStringBuilder but it consume a lot of memory!
Offline
so there wasn't fast TStringList implementation to share?
--- we no need no water, let the ... burn ---
Offline
Hi to all I'm new here,
I'm using the alcinoe implementaions of TStringList: TALStringList and TALAVLStringList are quite fast. Look at xml parser is fast too.
http://sourceforge.net/projects/alcinoe/
I love the mORMot framework is a very good work, thank you A.B.
The mind is like a parachute, it only works if you open it.
Albert Einstein
Offline
@rain
Thanks for the link.
Yes, Alcinoe libraries are great open source!
AFAIK for XML, OXML is faster.
See http://www.kluug.net/oxml.php
And made by a long time mORMot user, I think!
Offline
Hello!
TObjectDictionary perform search faster than TRawUTF8ListHashed, while TRawUTF8ListHashed is faster for add operation.
Is it expected behavior?
Also, if max "i" value is increased up to 20 000 000, out of memory occurs with TRawUTF8ListHashed, while TObjectDictionary works.
// Add time = 2256 ms
// Search time = 0 ms
procedure TForm1.ButtonTObjectDictionaryClick(Sender: TObject);
var
Dictionary: TObjectDictionary<String, TTest>;
i: Integer;
sw: TStopwatch;
TestObj: TTest;
begin
Dictionary := TObjectDictionary<String, TTest>.Create([doOwnsValues]);
sw := TStopwatch.StartNew;
for i := 0 to 6000000 do
Dictionary.Add(i.ToString, TTest.Create(i));
sw.Stop;
ShowMessage('add time (ms): ' + sw.ElapsedMilliseconds.ToString);
sw.Reset;
sw.Start;
if not Dictionary.TryGetValue('1500000', TestObj) or (TestObj.Id <> 1500000) then
ShowMessage('err');
sw.Stop;
ShowMessage('search value time (ms): ' + sw.ElapsedMilliseconds.ToString);
Dictionary.Free;
end;
// Add time = 809 ms
// Search time = 360 ms
procedure TForm1.ButtonTRawUTF8ListHashedClick(Sender: TObject);
var
Dictionary: TRawUTF8ListHashed;
i: Integer;
sw: TStopwatch;
idx: int64;
begin
Dictionary := TRawUTF8ListHashed.Create(True);
sw := TStopwatch.StartNew;
for i := 0 to 6000000 do
Dictionary.AddObject(i.ToString, TTest.Create(i));
sw.Stop;
ShowMessage('add time (ms): ' + sw.ElapsedMilliseconds.ToString);
sw.Reset;
sw.Start;
idx := Dictionary.IndexOf('1500000');
if (idx = -1) or (TTest(Dictionary.GetObject(idx)).Id <> 1500000) then
ShowMessage('err');
sw.Stop;
ShowMessage('search value time (ms): ' + sw.ElapsedMilliseconds.ToString);
Dictionary.Free;
end;
Last edited by George (2017-04-18 14:55:34)
Offline
Your code is making implicit (hidden) string <-> RawUTF8 conversions, so is biaised.
And searching for ONE value does not make sense.
TRawUTF8ListHashed won't hash the values unless you start for searching.
This is why it is slower, because it does all the hashing when you search for the first value.
Using TStopwatch for ONE value is just a joke. Use our TPrecisionTimer instead.
You are comparing oranges and apples.
Offline
If I try the following:
procedure TForm1.FormCreate(Sender: TObject);
var
list: TRawUTF8ListHashed;
i, n: integer;
timer: TPrecisionTimer;
s: RawUTF8;
begin
list := TRawUTF8ListHashed.Create;
try
timer.Start;
n := 20000000;
for i := 1 to n do
list.AddObject(UInt32ToUtf8(i), pointer(i));
s := FormatUTF8('% AddObject in % (%/s)', [n, timer.Stop, timer.PerSec(n)]);
timer.Start;
list.IndexOf('1000');
s := FormatUTF8('%'#13#10'first IndexOf in %', [s, timer.Stop]);
timer.Start;
for i := 1 to n do
list.IndexOf(UInt32ToUtf8(i));
s := FormatUTF8('%'#13#10'% IndexOf in % (%/s)', [s, n, timer.Stop, timer.PerSec(n)]);
mmo1.Lines.Text := UTF8ToString(s);
finally
list.Free;
end;
end;
I have got under Delphi 7:
20000000 AddObject in 682.80ms (29291068/s)
first IndexOf in 1.26s
20000000 IndexOf in 3.00s (6660696/s)
Offline
Thanks, i've changed code.
There is no big overhead when searching 100k items
I've tried Dictionary.Hash.ReHash to make proper comparison (no difference).
Finally i've added first search outside of the counter and results now looks great:
TObjectDictionary:
Add time = 812.96 ms
Search time = 15.74 ms
TRawUTF8ListHashed:
Add time = 194.22 ms
Search time = 6.13 ms
// Add time = 812.96 ms
// Search time = 15.74 ms
procedure TForm1.ButtonTObjectDictionaryClick(Sender: TObject);
var
Dictionary: TObjectDictionary<String, TTest>;
i: Integer;
pt: TPrecisionTimer;
Time: RawUTF8;
TestObj: TTest;
IdsForSearch: array of string;
begin
// Init
Dictionary := TObjectDictionary<String, TTest>.Create([doOwnsValues]);
SetLength(IdsForSearch, 100000);
for i := Low(IdsForSearch) to High(IdsForSearch) do
IdsForSearch[i] := IntToStr(i + 1500001);
pt.Init;
// Add values
pt.Start;
for i := 0 to 2000000 do
Dictionary.Add(i.ToString, TTest.Create(i));
Time := pt.Stop;
ShowMessage('add time: ' + UTF8ToString(Time));
// Perform search 100000 elements
pt.Start;
for i := Low(IdsForSearch) to High(IdsForSearch) do
Dictionary.TryGetValue(IdsForSearch[i], TestObj);
Time := pt.Stop;
if not Assigned(TestObj) or (IntToStr(TestObj.Id) <> IdsForSearch[High(IdsForSearch)]) then
ShowMessage('err');
ShowMessage('search value time: ' + UTF8ToString(Time));
Dictionary.Free;
end;
// Add time = 194.22 ms
// Search time = 6.13 ms
procedure TForm1.ButtonTRawUTF8ListHashedClick(Sender: TObject);
var
Dictionary: TRawUTF8ListHashed;
i: Integer;
pt: TPrecisionTimer;
Time: RawUTF8;
idx: int64;
IdsForSearch: array of RawUTF8;
begin
// Init
idx := -1;
Dictionary := TRawUTF8ListHashed.Create(True);
SetLength(IdsForSearch, 100000);
for i := Low(IdsForSearch) to High(IdsForSearch) do
IdsForSearch[i] := Int32ToUtf8(i + 1500001);
pt.Init;
// Add values
pt.Start;
for i := 0 to 2000000 do
Dictionary.AddObject(StringToUTF8(i.ToString), TTest.Create(i));
Time := pt.Stop;
ShowMessage('add time: ' + UTF8ToString(Time));
// Perform manual hashing
// Dictionary.Hash.ReHash; // no difference
Dictionary.IndexOf(IdsForSearch[0]); // to force hash calculations
// Perform search 100000 elements
pt.Start;
for i := low(IdsForSearch) to High(IdsForSearch) do
idx := Dictionary.IndexOf(IdsForSearch[i]);
Time := pt.Stop;
if (idx = -1) or (Int32ToUtf8(TTest(Dictionary.GetObject(idx)).Id) <> IdsForSearch[High(IdsForSearch)]) then
ShowMessage('err');
ShowMessage('search value time: ' + UTF8ToString(Time));
Dictionary.Free;
end;
Last edited by George (2017-04-18 18:33:39)
Offline
and under Delphi 10.2 Tokyo:
20000000 TRawUTF8ListHashed.AddObject in 1.07s (18518569/s)
first TRawUTF8ListHashed.IndexOf in 1.29s
20000000 TRawUTF8ListHashed.IndexOf in 4.01s (4982675/s)
20000000 TObjectDictionary.Add in 7.43s (2688466/s)
first TObjectDictionary.IndexOf in 0us
20000000 TObjectDictionary.IndexOf in 4.17s (4785796/s)
Offline
I've included TSynDictionary and TRawUTF8ListHashedLocked:
20000000 TRawUTF8ListHashed.AddObject in 577.92ms (34606386/s)
first TRawUTF8ListHashed.IndexOf in 1.29s
20000000 TRawUTF8ListHashed.IndexOf in 1.52s (13128829/s)
20000000 TRawUTF8ListHashedLocked.AddObject in 986.40ms (20275585/s)
first TRawUTF8ListHashedLocked.LockedGetObjectByName in 1.32s
20000000 TRawUTF8ListHashedLocked.LockedGetObjectByName in 3.29s (6063919/s)
20000000 TSynDictionary.Add in 3.68s (5427181/s)
first TSynDictionary.FindAndCopy in 6us
20000000 TSynDictionary.FindAndCopy in 2.73s (7310692/s)
20000000 TObjectDictionary.Add in 6.66s (2999694/s)
first TObjectDictionary.TryGetValue in 0us
20000000 TObjectDictionary.TryGetValue in 2.81s (7115338/s)
with source code in https://pastebin.com/XEmiJ0af
Note that TSynDictionary is thread-safe, so has an overhead over TRawUTF8ListHashed, and is still faster than TObjectDictionary - which is NOT thread-safe.
If I use TRawUTF8ListHashedLocked, I'm closer to TSynDictionary numbers.
Offline
Valuable information, thanks.
Would you like to add ReHash method directly to TRawUTF8ListHashed class?
TRawUTF8ListHashed = class(TRawUTF8List)
...
public
...
/// manual rehashing
function ReHash: boolean;
...
end
function TRawUTF8ListHashed.ReHash: boolean;
begin
result := fHash.ReHash;
fChanged := not result;
end;
Last edited by George (2017-04-18 19:23:50)
Offline
Why not just call Hash.Rehash ?
Not helps, since TRawUTF8ListHashed.Hash.Rehash not change TRawUTF8ListHashed.fChanged which is "true".
When call indexOf (first time) rehashing executes even if we called TRawUTF8ListHashed.Hash.Rehash previously.
Maybe better to move fChanged to TDynArrayHashed with proper handing..
Last edited by George (2017-04-18 20:24:10)
Offline
Does make sense.
Please check https://synopse.info/fossil/info/10f29366b5
Offline
I've added some new tests to explicitly compare thread-safe process, and the new ReHash method.
See https://pastebin.com/M6RWRs7V
2000000 TRawUTF8ListHashed.AddObject in 52.74ms (37915410/s)
TRawUTF8ListHashed.Rehash in 121.10ms
TRawUTF8ListHashed.Clear in 16.97ms
2000000 TRawUTF8ListHashed.AddObjectIfNotExisting in 479.96ms (4166987/s)
first TRawUTF8ListHashed.IndexOf in 0us
2000000 TRawUTF8ListHashed.IndexOf in 127.96ms (15629517/s)
2000000 TRawUTF8ListHashedLocked.AddObject in 91.95ms (21750951/s)
TRawUTF8ListHashedLocked.Rehash in 122.47ms
TRawUTF8ListHashedLocked.Clear in 17.15ms
2000000 TRawUTF8ListHashedLocked.AddObjectIfNotExisting in 544.96ms (3669973/s)
first TRawUTF8ListHashedLocked.LockedGetObjectByName in 0us
2000000 TRawUTF8ListHashedLocked.LockedGetObjectByName in 294.85ms (6783041/s)
2000000 TSynDictionary.Add in 352.70ms (5670493/s)
first TSynDictionary.FindAndCopy in 3us
2000000 TSynDictionary.FindAndCopy in 256.23ms (7805365/s)
2000000 TObjectDictionary.Add in 681.99ms (2932555/s)
first TObjectDictionary.TryGetValue in 0us
2000000 TObjectDictionary.TryGetValue in 251.48ms (7952918/s)
2000000 TObjectDictionary.Add locked in 769.16ms (2600235/s)
first TObjectDictionary.TryGetValue in 0us
2000000 TObjectDictionary.TryGetValue locked in 352.53ms (5673276/s)
Sounds like if TSynDictionary, as a general purpose thread-safe hashed dictionary, has the best performance.
It is two times faster than TObjectDictionary when adding items, and also 30% faster when searching the content.
And has unique features, like binary or JSON serialization, search within nested arrays, and an optional built-in "timeout" feature (very convenient if you want to implement an in-memory cache of values).
The numbers for Delphi 7 (with our enhanced RTL) are even better:
2000000 TRawUTF8ListHashed.AddObject in 29.30ms (68259385/s)
TRawUTF8ListHashed.Rehash in 119.10ms
TRawUTF8ListHashed.Clear in 8.27ms
2000000 TRawUTF8ListHashed.AddObjectIfNotExisting in 458.42ms (4362801/s)
first TRawUTF8ListHashed.IndexOf in 1us
2000000 TRawUTF8ListHashed.IndexOf in 125.03ms (15995009/s)
2000000 TRawUTF8ListHashedLocked.AddObject in 78.51ms (25473488/s)
TRawUTF8ListHashedLocked.Rehash in 120.36ms
TRawUTF8ListHashedLocked.Clear in 7.31ms
2000000 TRawUTF8ListHashedLocked.AddObjectIfNotExisting in 524.67ms (3811898/s)
first TRawUTF8ListHashedLocked.LockedGetObjectByName in 0us
2000000 TRawUTF8ListHashedLocked.LockedGetObjectByName in 298.66ms (6696376/s)
2000000 TSynDictionary.Add in 339.58ms (5889506/s)
first TSynDictionary.FindAndCopy in 0us
2000000 TSynDictionary.FindAndCopy in 255.15ms (7838403/s)
Offline
Does make sense.
Please check https://synopse.info/fossil/info/10f29366b5
That's it, tnx.
Offline
I'm using Mormot 2
I'm getting slow performance in Mormot2
2000000 TRawUTF8ListHashed.AddObject in 88.52ms (22592487/s)
TRawUTF8ListHashed.Rehash in 1us
2000000 TRawUTF8ListHashed.AddObject in 88.52ms (22592487/s)
TRawUTF8ListHashed.Rehash in 1us
TRawUTF8ListHashed.Clear in 15.92ms
2000000 TRawUTF8ListHashed.AddObjectIfNotExisting in 89.07ms (22452484/s)
first TRawUTF8ListHashed.IndexOf in 5us
20000 TRawUTF8ListHashed.IndexOf in 752.33ms (26584/s)
2000000 TSynDictionary.Add in 510.91ms (3914537/s)
first TSynDictionary.FindAndCopy in 23us
2000000 TSynDictionary.FindAndCopy in 450.31ms (4441384/s)
2000000 TObjectDictionary.Add in 716.41ms (2791674/s)
first TObjectDictionary.TryGetValue in 1us
2000000 TObjectDictionary.TryGetValue in 211.21ms (9469114/s)
2000000 TObjectDictionary.Add locked in 628.33ms (3183000/s)
first TObjectDictionary.TryGetValue in 1us
2000000 TObjectDictionary.TryGetValue locked in 274.98ms (7273044/s)
Offline
For TRawUTF8ListHashed.IndexOf, it's at 752.33ms ...
Then, TSynDictionary is a bit slower than TObjectDictionary...
What performance improvements are suggested?
Offline
In mORMot 2, TRawUtf8ListHashed is just an alias to TRawUtf8List so by default it is NOT hashing its entries.
It uses a O(n) full scan of the data, not an hashed index.
Use TRawUtf8List and ensure you set [fNoDuplicate] in TRawUtf8List.Create() as documented.
What kind of types are you storing in the TObjectDictionary / TSynDictionary?
TSynDictionary as some more features (like binary or JSON serialization), e.g. it is thread-safe by default: set TSynDictionary.ThreadUse := uNoLock for no locking.
Also consider IKeyValue<> from mormot.core.collections.
Offline
After more thoughts, our previous implementation was clearly confusing, especially when coming from mORMot 1, and even non-compatible when porting existing mORMot 1 applications.
So I have fixed the default behavior of TRawUtF8List to match mORMot 1 behavior.
But this is a BREAKING CHANGE: TRawUtf8List is not thread-safe by default.
https://github.com/synopse/mORMot2/commit/cfffde92
Now TRawUtf8ListHashed behave just the like as in mORMot 1, which is what is expected, and what confused you.
Offline
Pages: 1