#1 2013-10-26 21:41:41

louis_riviera
Member
Registered: 2013-09-23
Posts: 61

Fast TStringList

Is there a fast implementation of TStringList like object? Based on TStringBuilder or DynArray?

There is ofcourse TRawUTF8List but its about 100ms slower than regular TStringList.

Last edited by louis_riviera (2013-10-26 21:57:31)

Offline

#2 2013-10-27 07:15:11

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,612
Website

Re: Fast TStringList

What did you do for testing?
It will use a LStrAsg call when setting each value, so I expect that it works at same speed level than TStringList.
Did you try with latest updated from today?

I suppose TRawUTF8ListHashed will be much faster for searching, since it uses hashing.

Offline

#3 2013-10-27 11:15:28

louis_riviera
Member
Registered: 2013-09-23
Posts: 61

Re: Fast TStringList

Yes but i think an implementation based on THeapMemoryStream would be very very fast. Because TStringList has advantages over TStringBuilder but it consume a lot of memory!

Offline

#4 2014-07-11 12:48:47

Starkis
Member
From: Up in the space
Registered: 2011-01-16
Posts: 27

Re: Fast TStringList

so there wasn't fast TStringList implementation to share? wink


--- we no need no water, let the ... burn ---

Offline

#5 2014-07-12 11:45:50

rain
Member
Registered: 2014-07-12
Posts: 8

Re: Fast TStringList

Hi to all I'm new here,
I'm  using the alcinoe implementaions of TStringList: TALStringList and TALAVLStringList are quite fast. Look at xml parser is fast too.

http://sourceforge.net/projects/alcinoe/

I love the mORMot framework is a very good work, thank you A.B.


The mind is like a parachute, it only works if you open it.
Albert Einstein

Offline

#6 2014-07-12 14:42:11

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,612
Website

Re: Fast TStringList

@rain
Thanks for the link.
Yes, Alcinoe libraries are great open source!

AFAIK for XML, OXML is faster.
See http://www.kluug.net/oxml.php
And made by a long time mORMot user, I think!
smile

Offline

#7 2017-04-18 14:52:45

George
Member
Registered: 2016-04-05
Posts: 140

Re: Fast TStringList

Hello!

TObjectDictionary perform search faster than TRawUTF8ListHashed, while TRawUTF8ListHashed is faster for add operation.
Is it expected behavior?
Also, if max "i" value is increased up to 20 000 000, out of memory occurs with TRawUTF8ListHashed, while TObjectDictionary works.

// Add time = 2256 ms
// Search time = 0 ms
procedure TForm1.ButtonTObjectDictionaryClick(Sender: TObject);
var
  Dictionary: TObjectDictionary<String, TTest>;
  i: Integer;
  sw: TStopwatch;
  TestObj: TTest;
begin
  Dictionary := TObjectDictionary<String, TTest>.Create([doOwnsValues]);
  sw := TStopwatch.StartNew;
  for i := 0 to 6000000 do
    Dictionary.Add(i.ToString, TTest.Create(i));
  sw.Stop;
  ShowMessage('add time (ms): ' + sw.ElapsedMilliseconds.ToString);
  sw.Reset;
  sw.Start;
  if not Dictionary.TryGetValue('1500000', TestObj) or (TestObj.Id <> 1500000) then
    ShowMessage('err');
  sw.Stop;
  ShowMessage('search value time (ms): ' + sw.ElapsedMilliseconds.ToString);
  Dictionary.Free;
end;

// Add time = 809 ms
// Search time = 360 ms
procedure TForm1.ButtonTRawUTF8ListHashedClick(Sender: TObject);
var
  Dictionary: TRawUTF8ListHashed;
  i: Integer;
  sw: TStopwatch;
  idx: int64;
begin
  Dictionary := TRawUTF8ListHashed.Create(True);
  sw := TStopwatch.StartNew;
  for i := 0 to 6000000 do
    Dictionary.AddObject(i.ToString, TTest.Create(i));
  sw.Stop;
  ShowMessage('add time (ms): ' + sw.ElapsedMilliseconds.ToString);
  sw.Reset;
  sw.Start;
  idx := Dictionary.IndexOf('1500000');
  if (idx = -1) or (TTest(Dictionary.GetObject(idx)).Id <> 1500000) then
    ShowMessage('err');
  sw.Stop;
  ShowMessage('search value time (ms): ' + sw.ElapsedMilliseconds.ToString);
  Dictionary.Free;
end;

Last edited by George (2017-04-18 14:55:34)

Offline

#8 2017-04-18 16:56:45

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,612
Website

Re: Fast TStringList

Your code is making implicit (hidden) string <-> RawUTF8 conversions, so is biaised.

And searching for ONE value does not make sense.
TRawUTF8ListHashed won't hash the values unless you start for searching.
This is why it is slower, because it does all the hashing when you search for the first value.

Using TStopwatch for ONE value is just a joke. Use our TPrecisionTimer instead.

You are comparing oranges and apples.

Offline

#9 2017-04-18 18:21:14

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,612
Website

Re: Fast TStringList

If I try the following:

procedure TForm1.FormCreate(Sender: TObject);
var
  list: TRawUTF8ListHashed;
  i, n: integer;
  timer: TPrecisionTimer;
  s: RawUTF8;
begin
  list := TRawUTF8ListHashed.Create;
  try
    timer.Start;
    n := 20000000;
    for i := 1 to n do
      list.AddObject(UInt32ToUtf8(i), pointer(i));
    s := FormatUTF8('% AddObject in % (%/s)', [n, timer.Stop, timer.PerSec(n)]);
    timer.Start;
    list.IndexOf('1000');
    s := FormatUTF8('%'#13#10'first IndexOf in %', [s, timer.Stop]);
    timer.Start;
    for i := 1 to n do
      list.IndexOf(UInt32ToUtf8(i));
    s := FormatUTF8('%'#13#10'% IndexOf in % (%/s)', [s, n, timer.Stop, timer.PerSec(n)]);
    mmo1.Lines.Text := UTF8ToString(s);
  finally
    list.Free;
  end;
end;

I have got under Delphi 7:

20000000 AddObject in 682.80ms (29291068/s)
first IndexOf in 1.26s
20000000 IndexOf in 3.00s (6660696/s)

Offline

#10 2017-04-18 18:28:14

George
Member
Registered: 2016-04-05
Posts: 140

Re: Fast TStringList

Thanks, i've changed code.
There is no big overhead when searching 100k items smile
I've tried Dictionary.Hash.ReHash to make proper comparison (no difference).
Finally i've added first search outside of the counter and results now looks great:

TObjectDictionary:
Add time = 812.96 ms
Search time = 15.74 ms
TRawUTF8ListHashed:
Add time = 194.22 ms
Search time = 6.13 ms


// Add time = 812.96 ms
// Search time = 15.74 ms
procedure TForm1.ButtonTObjectDictionaryClick(Sender: TObject);
var
  Dictionary: TObjectDictionary<String, TTest>;
  i: Integer;
  pt: TPrecisionTimer;
  Time: RawUTF8;
  TestObj: TTest;
  IdsForSearch: array of string;
begin
  // Init
  Dictionary := TObjectDictionary<String, TTest>.Create([doOwnsValues]);
  SetLength(IdsForSearch, 100000);
  for i := Low(IdsForSearch) to High(IdsForSearch) do
    IdsForSearch[i] := IntToStr(i + 1500001);
  pt.Init;
  // Add values
  pt.Start;
  for i := 0 to 2000000 do
    Dictionary.Add(i.ToString, TTest.Create(i));
  Time := pt.Stop;
  ShowMessage('add time: ' + UTF8ToString(Time));
  // Perform search 100000 elements
  pt.Start;
  for i := Low(IdsForSearch) to High(IdsForSearch) do
    Dictionary.TryGetValue(IdsForSearch[i], TestObj);
  Time := pt.Stop;
  if not Assigned(TestObj) or (IntToStr(TestObj.Id) <> IdsForSearch[High(IdsForSearch)]) then
    ShowMessage('err');
  ShowMessage('search value time: ' + UTF8ToString(Time));
  Dictionary.Free;
end;

// Add time = 194.22 ms
// Search time = 6.13 ms
procedure TForm1.ButtonTRawUTF8ListHashedClick(Sender: TObject);
var
  Dictionary: TRawUTF8ListHashed;
  i: Integer;
  pt: TPrecisionTimer;
  Time: RawUTF8;
  idx: int64;
  IdsForSearch: array of RawUTF8;
begin
  // Init
  idx := -1;
  Dictionary := TRawUTF8ListHashed.Create(True);
  SetLength(IdsForSearch, 100000);
  for i := Low(IdsForSearch) to High(IdsForSearch) do
    IdsForSearch[i] := Int32ToUtf8(i + 1500001);
  pt.Init;
  // Add values
  pt.Start;
  for i := 0 to 2000000 do
    Dictionary.AddObject(StringToUTF8(i.ToString), TTest.Create(i));
  Time := pt.Stop;
  ShowMessage('add time: ' + UTF8ToString(Time));
  // Perform manual hashing
  // Dictionary.Hash.ReHash; // no difference
  Dictionary.IndexOf(IdsForSearch[0]); // to force hash calculations
  // Perform search 100000 elements
  pt.Start;
  for i := low(IdsForSearch) to High(IdsForSearch) do
    idx := Dictionary.IndexOf(IdsForSearch[i]);
  Time := pt.Stop;
  if (idx = -1) or (Int32ToUtf8(TTest(Dictionary.GetObject(idx)).Id) <> IdsForSearch[High(IdsForSearch)]) then
    ShowMessage('err');
  ShowMessage('search value time: ' + UTF8ToString(Time));
  Dictionary.Free;
end;

Last edited by George (2017-04-18 18:33:39)

Offline

#11 2017-04-18 18:33:15

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,612
Website

Re: Fast TStringList

and under Delphi 10.2 Tokyo:

20000000 TRawUTF8ListHashed.AddObject in 1.07s (18518569/s)
first TRawUTF8ListHashed.IndexOf in 1.29s
20000000 TRawUTF8ListHashed.IndexOf in 4.01s (4982675/s)

20000000 TObjectDictionary.Add in 7.43s (2688466/s)
first TObjectDictionary.IndexOf in 0us
20000000 TObjectDictionary.IndexOf in 4.17s (4785796/s)

Offline

#12 2017-04-18 19:16:05

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,612
Website

Re: Fast TStringList

I've included TSynDictionary and TRawUTF8ListHashedLocked:

20000000 TRawUTF8ListHashed.AddObject in 577.92ms (34606386/s)
first TRawUTF8ListHashed.IndexOf in 1.29s
20000000 TRawUTF8ListHashed.IndexOf in 1.52s (13128829/s)

20000000 TRawUTF8ListHashedLocked.AddObject in 986.40ms (20275585/s)
first TRawUTF8ListHashedLocked.LockedGetObjectByName in 1.32s
20000000 TRawUTF8ListHashedLocked.LockedGetObjectByName in 3.29s (6063919/s)

20000000 TSynDictionary.Add in 3.68s (5427181/s)
first TSynDictionary.FindAndCopy in 6us
20000000 TSynDictionary.FindAndCopy in 2.73s (7310692/s)

20000000 TObjectDictionary.Add in 6.66s (2999694/s)
first TObjectDictionary.TryGetValue in 0us
20000000 TObjectDictionary.TryGetValue in 2.81s (7115338/s)

with source code in https://pastebin.com/XEmiJ0af

Note that TSynDictionary is thread-safe, so has an overhead over TRawUTF8ListHashed, and is still faster than TObjectDictionary - which is NOT thread-safe.
If I use TRawUTF8ListHashedLocked, I'm closer to TSynDictionary numbers.

Offline

#13 2017-04-18 19:19:06

George
Member
Registered: 2016-04-05
Posts: 140

Re: Fast TStringList

Valuable information, thanks.

Would you like to add ReHash method directly to TRawUTF8ListHashed class?

TRawUTF8ListHashed = class(TRawUTF8List)
...
public
...
    /// manual rehashing
    function ReHash: boolean;
...
end

function TRawUTF8ListHashed.ReHash: boolean;
begin
  result := fHash.ReHash;
  fChanged := not result;
end;

Last edited by George (2017-04-18 19:23:50)

Offline

#14 2017-04-18 19:47:02

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,612
Website

Re: Fast TStringList

Why not just call Hash.Rehash ?

Offline

#15 2017-04-18 20:06:46

George
Member
Registered: 2016-04-05
Posts: 140

Re: Fast TStringList

Why not just call Hash.Rehash ?

Not helps, since TRawUTF8ListHashed.Hash.Rehash not change TRawUTF8ListHashed.fChanged which is "true".
When call indexOf (first time) rehashing executes even if we called TRawUTF8ListHashed.Hash.Rehash previously.

Maybe better to move fChanged to TDynArrayHashed with proper handing..

Last edited by George (2017-04-18 20:24:10)

Offline

#16 2017-04-19 08:13:20

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,612
Website

Re: Fast TStringList

Does make sense.

Please check https://synopse.info/fossil/info/10f29366b5

Offline

#17 2017-04-19 08:35:53

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,612
Website

Re: Fast TStringList

I've added some new tests to explicitly compare thread-safe process, and the new ReHash method.
See https://pastebin.com/M6RWRs7V

2000000 TRawUTF8ListHashed.AddObject in 52.74ms (37915410/s)
TRawUTF8ListHashed.Rehash in 121.10ms
TRawUTF8ListHashed.Clear in 16.97ms
2000000 TRawUTF8ListHashed.AddObjectIfNotExisting in 479.96ms (4166987/s)
first TRawUTF8ListHashed.IndexOf in 0us
2000000 TRawUTF8ListHashed.IndexOf in 127.96ms (15629517/s)

2000000 TRawUTF8ListHashedLocked.AddObject in 91.95ms (21750951/s)
TRawUTF8ListHashedLocked.Rehash in 122.47ms
TRawUTF8ListHashedLocked.Clear in 17.15ms
2000000 TRawUTF8ListHashedLocked.AddObjectIfNotExisting in 544.96ms (3669973/s)
first TRawUTF8ListHashedLocked.LockedGetObjectByName in 0us
2000000 TRawUTF8ListHashedLocked.LockedGetObjectByName in 294.85ms (6783041/s)

2000000 TSynDictionary.Add in 352.70ms (5670493/s)
first TSynDictionary.FindAndCopy in 3us
2000000 TSynDictionary.FindAndCopy in 256.23ms (7805365/s)

2000000 TObjectDictionary.Add in 681.99ms (2932555/s)
first TObjectDictionary.TryGetValue in 0us
2000000 TObjectDictionary.TryGetValue in 251.48ms (7952918/s)

2000000 TObjectDictionary.Add locked in 769.16ms (2600235/s)
first TObjectDictionary.TryGetValue in 0us
2000000 TObjectDictionary.TryGetValue locked in 352.53ms (5673276/s)

Sounds like if TSynDictionary, as a general purpose thread-safe hashed dictionary, has the best performance.
It is two times faster than TObjectDictionary when adding items, and also 30% faster when searching the content.
And has unique features, like binary or JSON serialization, search within nested arrays, and an optional built-in "timeout" feature (very convenient if you want to implement an in-memory cache of values).

The numbers for Delphi 7 (with our enhanced RTL) are even better:

2000000 TRawUTF8ListHashed.AddObject in 29.30ms (68259385/s)
TRawUTF8ListHashed.Rehash in 119.10ms
TRawUTF8ListHashed.Clear in 8.27ms
2000000 TRawUTF8ListHashed.AddObjectIfNotExisting in 458.42ms (4362801/s)
first TRawUTF8ListHashed.IndexOf in 1us
2000000 TRawUTF8ListHashed.IndexOf in 125.03ms (15995009/s)

2000000 TRawUTF8ListHashedLocked.AddObject in 78.51ms (25473488/s)
TRawUTF8ListHashedLocked.Rehash in 120.36ms
TRawUTF8ListHashedLocked.Clear in 7.31ms
2000000 TRawUTF8ListHashedLocked.AddObjectIfNotExisting in 524.67ms (3811898/s)
first TRawUTF8ListHashedLocked.LockedGetObjectByName in 0us
2000000 TRawUTF8ListHashedLocked.LockedGetObjectByName in 298.66ms (6696376/s)

2000000 TSynDictionary.Add in 339.58ms (5889506/s)
first TSynDictionary.FindAndCopy in 0us
2000000 TSynDictionary.FindAndCopy in 255.15ms (7838403/s)

Offline

#18 2017-04-19 20:43:21

George
Member
Registered: 2016-04-05
Posts: 140

Re: Fast TStringList

ab wrote:

Does make sense.
Please check https://synopse.info/fossil/info/10f29366b5

That's it, tnx.

Offline

#19 2017-04-25 16:13:01

turrican
Member
From: Barcelona
Registered: 2015-06-05
Posts: 94
Website

Re: Fast TStringList

How can be faster binary compiled with Delphi 7 than 10.1?

Offline

#20 2023-03-19 02:14:02

PolywickStudio
Member
Registered: 2023-03-19
Posts: 2

Re: Fast TStringList

I'm using Mormot 2

I'm getting slow performance in Mormot2

2000000 TRawUTF8ListHashed.AddObject in 88.52ms (22592487/s)
TRawUTF8ListHashed.Rehash in 1us
2000000 TRawUTF8ListHashed.AddObject in 88.52ms (22592487/s)
TRawUTF8ListHashed.Rehash in 1us
TRawUTF8ListHashed.Clear in 15.92ms
2000000 TRawUTF8ListHashed.AddObjectIfNotExisting in 89.07ms (22452484/s)
first TRawUTF8ListHashed.IndexOf in 5us
20000 TRawUTF8ListHashed.IndexOf in 752.33ms (26584/s)


2000000 TSynDictionary.Add in 510.91ms (3914537/s)
first TSynDictionary.FindAndCopy in 23us
2000000 TSynDictionary.FindAndCopy in 450.31ms (4441384/s)

2000000 TObjectDictionary.Add in 716.41ms (2791674/s)
first TObjectDictionary.TryGetValue in 1us
2000000 TObjectDictionary.TryGetValue in 211.21ms (9469114/s)

2000000 TObjectDictionary.Add locked in 628.33ms (3183000/s)
first TObjectDictionary.TryGetValue in 1us
2000000 TObjectDictionary.TryGetValue locked in 274.98ms (7273044/s)

Offline

#21 2023-03-19 02:15:35

PolywickStudio
Member
Registered: 2023-03-19
Posts: 2

Re: Fast TStringList

For TRawUTF8ListHashed.IndexOf, it's at 752.33ms ...

Then, TSynDictionary is a bit slower than TObjectDictionary...

What performance improvements are suggested?

Offline

#22 2023-03-19 16:42:36

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,612
Website

Re: Fast TStringList

In mORMot 2, TRawUtf8ListHashed is just an alias to TRawUtf8List so by default it is NOT hashing its entries.
It uses a O(n) full scan of the data, not an hashed index.
Use TRawUtf8List and ensure you set [fNoDuplicate] in TRawUtf8List.Create() as documented.

What kind of types are you storing in the TObjectDictionary / TSynDictionary?
TSynDictionary as some more features (like binary or JSON serialization), e.g. it is thread-safe by default: set TSynDictionary.ThreadUse := uNoLock for no locking.
Also consider IKeyValue<> from mormot.core.collections.

Offline

#23 2023-03-20 09:50:39

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,612
Website

Re: Fast TStringList

After more thoughts, our previous implementation was clearly confusing, especially when coming from mORMot 1, and even non-compatible when porting existing mORMot 1 applications.

So I have fixed the default behavior of TRawUtF8List to match mORMot 1 behavior.
But this is a BREAKING CHANGE: TRawUtf8List is not thread-safe by default.
https://github.com/synopse/mORMot2/commit/cfffde92

Now TRawUtf8ListHashed behave just the like as in mORMot 1, which is what is expected, and what confused you.

Offline

Board footer

Powered by FluxBB