#101 2010-12-20 07:12:24

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

Re: Delphi doesn't like multi-core CPUs (or the contrary)

Intel Core 2 Quad Q8300 @ 2.5 GHz

4B
1 = 47,83 nanoseconds per cycle
2 = 58,80 nanoseconds per cycle
4 = 124,68 nanoseconds per cycle
8 = 128,21 nanoseconds per cycle

8B
1 = 56,61 nanoseconds per cycle
2 = 73,01 nanoseconds per cycle
4 = 146,73 nanoseconds per cycle
8 = 146,73 nanoseconds per cycle

8BV
1 = 54,40 nanoseconds per cycle
2 = 75,75 nanoseconds per cycle
4 = 177,40 nanoseconds per cycle
8 = 222,44 nanoseconds per cycle

So 4B seems the fastest

Btw: I almost have my new ScaleMM algorithm ready; I only need to fix some bugs (to have a working POC; a fully working version needs some more time)

#102 2010-12-22 10:05:08

TPrami
Member
Registered: 2010-07-06
Posts: 116

AdamWu wrote:

Please check out my latest revision:
http://code.google.com/p/delphi-toolbox … Free/?r=30

Could not compile: D6 doesn't open the project and dies with an Access Violation (something is wrong in my installation), and D2007 gives an internal error when trying to compile it :( So I can't compile and test it at all for now...

I'll try it at home, but would it be possible to have a compiled .exe, and some .bat file to run a standard "test suite" with various settings?

Maybe also modify the test program to write a log to disk, to make posting results here easier...

-TP-

#103 2010-12-22 11:15:32

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,546

I compiled it with Delphi 2010. Previous versions will have problems with the record semantics.

But I'm not on the computer on which I compiled it, so I can't upload the exe easily.

#104 2010-12-22 11:42:51

TPrami
Member
Registered: 2010-07-06
Posts: 116

ab wrote:

I compiled it with Delphi 2010. Previous versions will have problems with the record semantics.

I looked at it and it looked strange to me, but I'm not sure at which point it was introduced...

I'll compile with D2010 at home...

-Tee-

#105 2010-12-23 06:13:47

AdamWu
Member
Registered: 2010-07-22
Posts: 20

TPrami wrote:

Could not compile: D6 doesn't open the project and dies with an Access Violation (something is wrong in my installation), and D2007 gives an internal error when trying to compile it :( So I can't compile and test it at all for now...

I'll try it at home, but would it be possible to have a compiled .exe, and some .bat file to run a standard "test suite" with various settings?

Maybe also modify the test program to write a log to disk, to make posting results here easier...

-TP-

Sorry, I was traveling for the past few days... :P

I started learning lock-free code around late 2008, so I have never tested my code on Delphi 2007 and below. But it should work with 2009 and up.

Last edited by AdamWu (2010-12-23 06:15:31)

#106 2010-12-23 20:58:11

TPrami
Member
Registered: 2010-07-06
Posts: 116

D2010 compiled it just fine...

-TP-

#107 2010-12-29 11:47:41

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

I have a "working" POC of my newest algorithm:
http://code.google.com/p/scalemm/source … aleMM2.pas

It works completely differently from the first version: it does not use preallocated blocks of one fixed size, but
does dynamic allocation from one big block of 1 MB (like FastMM does with medium blocks). This way you have lower
memory usage, because all memory of the 1 MB block is available for all sizes. The downside of this approach is some more memory
fragmentation within the block, but I tried to reduce this by using a lookup index (bit array mask) by size, so a small alloc does not use the first available (big) free block but tries to use the smallest one that fits.

It is working for a small number of allocs, but it has a nasty bug somewhere... However, you can get an idea of how it should work. It is not optimized yet, but I tried to use "fast" techniques (shl and shr instead of div) in the base.
I hope I can remove some overhead somewhere (too many if statements etc).

Note: the bit scanning (also reverse bit scanning to get the highest bit for the size), the use of masks etc. make it less easy to follow (more "high tech" than version 1).

I don't know how fast it will be (for small allocs) in the end; maybe use ScaleMM v1 on top of version 2? :-)
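
A minimal sketch of the size-indexed bit mask idea described above; this is not the actual ScaleMM2 code. The 32-byte minimum class, the power-of-two size classes and the names SizeClass and FindFreeClass are assumptions made for illustration, and a real allocator would probably use the BSF/BSR instructions instead of loops.

```pascal
program SizeMaskSketch;
{$APPTYPE CONSOLE}

const
  MinBlockShift = 5; // assumed: smallest size class is 32 bytes

// Reverse bit scan: index of the highest set bit, -1 when aValue = 0.
function HighestBit(aValue: Cardinal): Integer;
begin
  Result := -1;
  while aValue <> 0 do
  begin
    Inc(Result);
    aValue := aValue shr 1;
  end;
end;

// Forward bit scan: index of the lowest set bit, -1 when aValue = 0.
function LowestBit(aValue: Cardinal): Integer;
begin
  if aValue = 0 then
  begin
    Result := -1;
    Exit;
  end;
  Result := 0;
  while (aValue and 1) = 0 do
  begin
    Inc(Result);
    aValue := aValue shr 1;
  end;
end;

// Smallest class c such that (32 shl c) >= aSize.
function SizeClass(aSize: Cardinal): Integer;
begin
  if aSize <= (1 shl MinBlockShift) then
    Result := 0
  else
    Result := HighestBit(aSize - 1) + 1 - MinBlockShift;
end;

// Bit c of aFreeMask set = a free block of class c exists.
// Returns the smallest free class that fits aSize, or -1 if none.
function FindFreeClass(aFreeMask, aSize: Cardinal): Integer;
var
  c: Integer;
begin
  c := SizeClass(aSize);
  // Clear the bits of all classes that are too small, then scan forward.
  Result := LowestBit((aFreeMask shr c) shl c);
end;

var
  Mask: Cardinal;
begin
  // Free blocks in class 0 (32 bytes), class 3 (256) and class 7 (4096).
  Mask := (1 shl 0) or (1 shl 3) or (1 shl 7);
  WriteLn(FindFreeClass(Mask, 16));   // fits class 0
  WriteLn(FindFreeClass(Mask, 100));  // needs 128 bytes, smallest free fit is class 3
  WriteLn(FindFreeClass(Mask, 5000)); // nothing big enough: -1
end.
```

The point of the mask is that a small request skips the large free blocks unless nothing smaller is available, which is what limits fragmentation inside the 1 MB block.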

#108 2010-12-30 05:34:08

TPrami
Member
Registered: 2010-07-06
Posts: 116

Good to hear that there is progress.

I was just pondering how ScaleMM will work with large blocks.

Allocating and disposing are locked, because of FastMM (etc.) underneath, but how about using the blocks, like copying smaller blocks for processing etc.? Will that be lock free? I suppose so... (Just trying to understand where this currently stands.)

I need ScaleMM for one server, but quite often it uses large blocks. They are owned by each thread (so no cross-thread usage), but I've been thinking that operations on those are blocked anyway with FastMM currently.

Another thing that crossed my mind is that OmniThreadLibrary users would most likely also gain from ScaleMM, if there are not too many dependencies between OTL and FastMM...

-TP-

#109 2010-12-30 06:43:41

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

TPrami wrote:

Good to hear that there is progress.
I was just pondering how ScaleMM will work with large blocks.
Allocating and disposing are locked, because of FastMM (etc.) underneath, but how about using the blocks, like copying smaller blocks for processing etc.? Will that be lock free? I suppose so... (Just trying to understand where this currently stands.)
-TP-

Yes, ScaleMM version 1 works on top of FastMM, so large allocations (or the moments when ScaleMM needs more mem) are locked by FastMM. But all small blocks etc. are processed per thread, so no locking at all!

TPrami wrote:

I need ScaleMM for one server, but quite often it uses large blocks. They are owned by each thread (so no cross-thread usage), but I've been thinking that operations on those are blocked anyway with FastMM currently.
Another thing that crossed my mind is that OmniThreadLibrary users would most likely also gain from ScaleMM, if there are not too many dependencies between OTL and FastMM...
-TP-

Because of the FastMM lock dependency I am making a large block allocator. This allocator is fully dynamic and it seems so good (?) that it could also be used for small blocks. So I am thinking of using it as a complete allocator (no real difference between small and medium mem). Or, if the speed is not good enough, of using ScaleMM1 on top of ScaleMM2 :-).

I would really like to get rid of the FastMM locking to get full scaling: this is the future (multi-core, OTL, AsyncCalls, etc.)!

Btw: I hope I solved a nasty bug in ScaleMM2 yesterday evening, so I can test/develop it further.

#110 2011-01-03 08:18:05

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

First of all: a happy 2011! :)

Good news: V2 seems to work now. It is not fully working yet (no interthread mem, and it does not release mem to Windows).
It is about 2.6 times slower (30M allocs/reallocs/frees in 3.1s, V1 does it in 1.2s), but there are plenty of optimizations possible.
http://code.google.com/p/scalemm/source … aleMM2.pas

#111 2011-01-03 13:57:11

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,546

Happy new year!

I'll take a look at V2.
Nice work!

#112 2011-01-14 13:40:23

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

Version 2 is almost working; ScaleMM1 works on top of v2 now.
(ScaleMM2 needs a 16-byte header plus a minimum alloc size of 32 bytes, so too much overhead for small blocks. ScaleMM1 is for small mem (<2 KB), ScaleMM2 for medium (<1 MB), and anything larger goes directly to VirtualAlloc/VirtualFree.)
http://code.google.com/p/scalemm/source … caleMM.pas
http://code.google.com/p/scalemm/source … aleMM2.pas

Some "small" problems need to be fixed, like backwards free block scanning; until then you'll get "out of memory" in intensive tests.
And of course the necessary optimizations, cleanup, documentation etc.

Speed of ScaleMM1 is about 1100ms and ScaleMM2 about 2400ms (30M allocs/reallocs/frees). So small memory allocs are faster than medium ones.

PS: the ABA problem needs to be fixed too; I will do this soon

#113 2011-01-14 13:44:34

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,546

Nice step forward!

:)

#114 2011-01-20 08:13:05

Starkis
Member
From: Up in the space
Registered: 2011-01-16
Posts: 27

Should SynScaleMM be used for applications with a single thread or a small number (up to 3-5) of threads? What would be the guidelines for choosing between FastMM/ScaleMM/SynScaleMM? :)


--- we no need no water, let the ... burn ---

#115 2011-01-20 08:54:28

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,546

Good question! :)

One important fact to notice is that SynScaleMM/ScaleMM will create a per-thread heap.

So it's definitely designed for use with a fixed number of threads.
A thread pool is a must for SynScaleMM/ScaleMM: if you don't have a thread pool and create a lot of threads, each thread will create a new heap, which will be slow.

So here are some guidelines:
- FastMM4: if you use only one thread;
- FastMM4: if you are short on RAM (SynScaleMM/ScaleMM consume more memory);
- FastMM4: if you use some background threads which are not working continuously (e.g. a background thread that refreshes some data for a few milliseconds and is then freed, while the main thread deals with the UI);
- SynScaleMM or ScaleMM: when you have a server application with background threads running continuously in parallel, but WITHOUT a lot of thread creation (using e.g. a thread pool);
- FastMM4: when you have a server application with background threads running continuously in parallel, but WITH a lot of thread creation (no thread pool). In this case, consider reimplementing the server using a thread pool: the current architecture will be slower than SynScaleMM/ScaleMM + thread pool.

Of course, our framework uses a fixed number of threads:
- named pipe connections are designed to keep the connection alive as long as the client software is running, so only one thread is created per client;
- HTTP/1.1 connections also handle keep-alive connections, so one thread per client does make sense here;
- HTTP/1.0 connections are not kept alive, but use a thread pool via completion ports, so they fit SynScaleMM/ScaleMM perfectly.

#116 2011-01-21 09:24:55

Starkis
Member
From: Up in the space
Registered: 2011-01-16
Posts: 27

Thanks for the clarification - it is helpful for the non-gurus of this field :)


--- we no need no water, let the ... burn ---

#117 2011-01-24 08:11:28

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

ScaleMM is faster than FastMM, so it is also good for single-threaded use.
ScaleMM also caches the thread managers, so it should also be good with many short-lived threads.
ScaleMM does use more mem than FastMM, so do not use it if you are low on memory (btw: FastMM also uses more mem than low-level Windows mem).

ScaleMM2 is almost ready; I'm busy with the last details. It works without FastMM, so it is faster when multithreaded (no FastMM locks underneath). It will have a special medium block algorithm and uses direct VirtualAlloc for large mem (> 1 MB), but special large block handling can easily be added in case someone uses lots of large blocks.

#118 2011-01-24 09:15:55

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,546

In my tests, ScaleMM is slower than FastMM for single thread applications...

About the thread manager caching, you're right: I forgot about it! ;)

Thanks for the good news about ScaleMM2.
Have you any preliminary benchmarks?

#119 2011-01-25 08:41:42

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

30M alloc/realloc/free (small mem, 10 - 120bytes), 1 thread
FastMM    = 4376ms
ScaleMM2 = 1651ms (still too high, can be optimized further; ScaleMM1 takes about 1100ms)

30M alloc/realloc/free (medium mem, 10kb - 80kb), 1 thread
FastMM    = 58206ms (!), with no resize (+10bytes) = 2302ms
ScaleMM2 = 2326ms

for j := 0 to 1000 do
  for i := 0 to 10000 do
      p1 := GetMemory(10);
      p2 := GetMemory(40);
      p3 := GetMemory(80);
      p1 := ReallocMemory(p1, 30);
      p2 := ReallocMemory(p2, 60);
      p3 := ReallocMemory(p3, 120);
      FreeMemory(p1);
      FreeMemory(p2);
      FreeMemory(p3);

for j := 0 to 1000 do
  for i := 0 to 10000 do
      p1 := GetMemory(10 * 1024);
      p2 := GetMemory(40 * 1024);
      p3 := GetMemory(80 * 1024);
      p1 := ReallocMemory(p1, 10 * 1024 + 10);
      p2 := ReallocMemory(p2, 40 * 1024 + 100);
      p3 := ReallocMemory(p3, 80 * 1024 + 10);
      FreeMemory(p1);
      FreeMemory(p2);
      FreeMemory(p3);

#120 2011-01-25 10:06:41

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,546

This test is not very realistic.
You're freeing the memory you just allocated.
This is really a "best case", which doesn't reflect the reality of memory allocation in an application.

Or perhaps there are some missing begin...end in your code above!!!
;)

My tests ran the whole unit test benchmark of our framework, using one MM or the other.
And FastMM4 gave (slightly) better results than ScaleMM.

#121 2011-01-25 10:19:29

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

Yeah, you must put begin/end around the body of the for loop :-)

And yes, it is a simple test to measure the "core" speed (only memory operations, nothing more).

In real life the results will be different. The FastCodeMMChallenge also showed ScaleMM to be slower in some cases
(because it works on top of FastMM: mem larger than 2 KB is passed to FastMM, so it is slightly slower because of
the ScaleMM size check overhead). I hope ScaleMM2 won't have this limitation :). I still have one (?) tiny bug, so I cannot
run the full FastCodeMMChallenge yet...

Btw: the alpha version is in source control, with ScaleMM1 in a separate branch

Last edited by andrewdynamo (2011-01-25 10:20:30)

#122 2011-01-27 09:26:44

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

The latest version seems to work OK now (I added medium mem CheckMem functions and found a couple of nasty bugs with them!)
A unit test has also been added.
http://code.google.com/p/scalemm/source/browse/trunk

It only supports mem < 2 GB, and there is no interthread mem support and no mem leak support yet.
And some more optimizations and cleanup are needed...

#123 2011-01-27 11:27:51

mai62
Member
Registered: 2010-07-01
Posts: 5

Delphi 2007:
[DCC Error] smmFunctions.pas(96): E2107 Operand size mismatch
[DCC Error] smmFunctions.pas(119): E2107 Operand size mismatch
[DCC Error] smmLargeMemory.pas(45): F2063 Could not compile used unit 'smmFunctions.pas'

[ 96]  lock cmpxchg dword ptr [aDestination], aNewValue
...
[119]  BSF  EAX, aValue;

#124 2011-01-27 12:25:47

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

mai62 wrote:

Delphi 2007:

Thanks for reporting. I only checked it with D2010; I think I will fix this tomorrow (after D2010 is completely tested).

#125 2011-01-28 09:22:40

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

It compiles and runs fine on D2007 too now.
Extra checks added; a new extensive test is running and going fine so far.

#126 2011-01-28 09:39:04

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,546

Great!

What about the resulting performance?

#127 2011-01-28 14:34:13

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

Hmmm, FastCodeMMChallenge depends on big mem realloc (which is only partly implemented), so it is not as fast as it should be (in that benchmark). So I think I need to use the same big mem realloc algorithm as FastMM (grow in steps of 64 KB instead of 1 byte :-) and use VirtualQuery to expand the virtual mem).
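
A minimal sketch of why growing in 64 KB steps helps, assuming a block whose capacity is always rounded up to the next multiple of 64 KB. The names RoundUpToGranularity and CountGrowSteps are hypothetical, and the VirtualQuery part (trying to extend the mapping in place) is deliberately left out.

```pascal
program LargeReallocSketch;
{$APPTYPE CONSOLE}

const
  LargeGranularity = 64 * 1024; // assumed step size for large blocks

// Round a requested size up to the next multiple of 64 KB.
function RoundUpToGranularity(aSize: NativeUInt): NativeUInt;
begin
  Result := (aSize + (LargeGranularity - 1)) and
    not NativeUInt(LargeGranularity - 1);
end;

// Count how many real (re)allocations are needed when a block grows
// one byte at a time from 1 byte up to aFinalSize bytes.
function CountGrowSteps(aFinalSize: NativeUInt): Integer;
var
  Capacity, Size: NativeUInt;
begin
  Result := 0;
  Capacity := 0;
  Size := 1;
  while Size <= aFinalSize do
  begin
    if Size > Capacity then
    begin
      // Only here would the allocator have to move or extend the block.
      Capacity := RoundUpToGranularity(Size);
      Inc(Result);
    end;
    Inc(Size);
  end;
end;

begin
  WriteLn(RoundUpToGranularity(1));
  WriteLn(RoundUpToGranularity(65537));
  // 200000 one-byte growth requests, but only a handful of real reallocations.
  WriteLn(CountGrowSteps(200000));
end.
```

With a 1-byte increment every one of the 200000 requests would be a real reallocation; with 64 KB steps all but 4 of them are answered from the slack that is already reserved.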

#128 2011-01-31 09:30:06

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

Checked in better large mem handling, now ScaleMM2 is 20% faster in FastCodeMMChallenge :-)

Average Speed Performance: (Scaled so that the winner = 100%)
  DelphiInternal :   79,7
  ScaleMem    :  100,0

Some improvements are still possible, because a few test results are too poor (I will investigate and "fix" them)

#129 2011-01-31 13:41:14

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,546

Very nice results!

Keep up the good work!

#130 2011-02-01 15:01:45

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

Some intermediate detailed benchmark:
http://code.google.com/p/scalemm/#Benchmark_2

It is slower in small reallocations (with no change in size, so only "function" overhead), but overall it is 25% faster now.
I need to change the slow "owner" determination (small, medium, large); I will change it to use the same kind of logic as FastMM does
(use the lowest or highest bits of "size" to mark a block as free and to store its size type).
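
A minimal sketch of tagging the low bits of the size field, assuming block sizes are always multiples of 8 so that the lowest three bits are free for flags. The flag layout and the names PackSize, UnpackSize, IsFree and KindOf are hypothetical; they are not taken from FastMM or ScaleMM.

```pascal
program SizeTagSketch;
{$APPTYPE CONSOLE}

const
  FlagFree   = 1; // bit 0: block is free
  FlagMedium = 2; // bit 1: block belongs to the medium allocator
  FlagLarge  = 4; // bit 2: block belongs to the large allocator
  FlagMask   = 7; // all tag bits

type
  TBlockKind = (bkSmall, bkMedium, bkLarge);

// Combine a size (multiple of 8) with its tag bits in one header word.
function PackSize(aSize: NativeUInt; aKind: TBlockKind;
  aFree: Boolean): NativeUInt;
begin
  Result := aSize and not NativeUInt(FlagMask);
  if aKind = bkMedium then
    Result := Result or FlagMedium
  else if aKind = bkLarge then
    Result := Result or FlagLarge;
  if aFree then
    Result := Result or FlagFree;
end;

// Strip the tag bits to get the real size back.
function UnpackSize(aHeader: NativeUInt): NativeUInt;
begin
  Result := aHeader and not NativeUInt(FlagMask);
end;

function IsFree(aHeader: NativeUInt): Boolean;
begin
  Result := (aHeader and FlagFree) <> 0;
end;

// Owner type is read from the header itself: no extra pointer chasing.
function KindOf(aHeader: NativeUInt): TBlockKind;
begin
  if (aHeader and FlagLarge) <> 0 then
    Result := bkLarge
  else if (aHeader and FlagMedium) <> 0 then
    Result := bkMedium
  else
    Result := bkSmall;
end;

var
  Header: NativeUInt;
begin
  Header := PackSize(4096, bkMedium, False);
  WriteLn(UnpackSize(Header));
  WriteLn(IsFree(Header));
  WriteLn(KindOf(Header) = bkMedium);
end.
```

The benefit is that the header word that has to be read anyway already says which allocator owns the block, so no second memory location has to be fetched to decide.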

#131 2011-02-01 16:16:37

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,546

In this benchmark 2, the results of the multi-threaded tests (with NexusDB or 8 threads) are not better than FastMM4, or am I wrong?
:(

Or is it the contrary? I didn't get the % thing... is higher better, or lower?
:)

#132 2011-02-02 06:48:52

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

Sorry, it was a quick test :)
FastMM is 100% (time), so anything above is worse (like the first test) and anything below is better (less time means faster :) )

#133 2011-02-02 07:31:06

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,546

So the benchmark results are very good on multi-threaded applications.
That was the purpose of this "scaling" memory manager.

Great!!! :)

#134 2011-02-02 08:37:47

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

Yes, but I want it to be faster overall :(
It must be a complete replacement, so also fast in single-threaded use. If possible :).

#135 2011-02-04 15:07:46

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

I have some slow spots in my code:
https://picasaweb.google.com/lh/photo/G … directlink
https://picasaweb.google.com/lh/photo/Q … directlink

Why are these lines below so slow? (more than 100 CPU cycles)
    if NativeUInt(pm.OwnerBlock) and 1 <> 0 then 
and:
    if ot = PBaseThreadMemory(FCacheSmallMemManager) then

Maybe because of an L1/L2 cache fetch? (FCacheSmallMemManager is located in the current object; maybe it needs to be fetched?)
How can this be optimized?
(These 2 lines consume most of the time.)

========================================

function TThreadMemManager.ReallocMem(aMemory: Pointer;
  aSize: NativeUInt): Pointer;
var
  pm: PBaseMemHeader;
  ot: PBaseThreadMemory;
begin
  if FOtherThreadFreedMemory <> nil then
    ProcessFreedMemFromOtherThreads;

  pm := PBaseMemHeader(NativeUInt(aMemory) - SizeOf(TBaseMemHeader));
  //check realloc of freed mem
  if (pm.Size and 1 = 0) then
  begin
    if NativeUInt(pm.OwnerBlock) and 1 <> 0 then 
//lowest bit is mark bit: medium mem has ownerthread instead of ownerblock (temp. optimization trial)
//otherwise slow L1/L2 fetch needed in case of "large" distance
    begin
      ot := PBaseThreadMemory( NativeUInt(pm.OwnerBlock) and -2);  //clear lowest bit
      if ot = PBaseThreadMemory(FCacheMediumMemManager) then
        Result := FCacheMediumMemManager.ReallocMem(aMemory, aSize)
      else
        Result := ReallocMemOfOtherThread(aMemory, aSize);
    end
    else
    begin
      ot := pm.OwnerBlock.OwnerThread;

      if ot = PBaseThreadMemory(FCacheSmallMemManager) then
        Result := FCacheSmallMemManager.ReallocMem(aMemory, aSize)
  //    else if ot = PBaseThreadMemory(FCacheMediumMemManager) then
  //      Result := FCacheMediumMemManager.ReallocMem(aMemory, aSize)
      else if ot = PBaseThreadMemory(FCacheLargeMemManager) then
        Result := FCacheLargeMemManager.ReallocMemWithHeader(aMemory, aSize)
      else
        Result := ReallocMemOfOtherThread(aMemory, aSize);
    end
  end
  else
    Error(reInvalidPtr);
end;

Last edited by andrewdynamo (2011-02-04 15:11:04)

#136 2011-02-05 21:15:54

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

Hmm, a high number of resource stalls, L1 loads blocked by stores, etc.:
https://picasaweb.google.com/lh/photo/q … directlink

Maybe the memory is aligned too well? L1 has 4 KB aliasing, so the offsets within 4 KB (the lowest 12 address bits) must differ?

#137 2011-02-08 18:46:45

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

Status update: I tried some offsets to misalign the data, but without success. I think I simply must avoid too many "lookups" (each needs L1/L2 cache fetches, or worse: a memory fetch).
Instead of nice structure and good "code smell" I'll have to use speed hacks and/or more (packed) data in the header for intelligent reallocs (I must determine the type: small, medium or large, and even worse: check the thread owner). I have some ideas for it, but this needs some restructuring...

I have already made a simple pre-realloc check: if the new size is smaller than the current size but greater than 1/4 of it, nothing has to be done (no thread owner or type check). Now a realloc test is only slightly slower than (asm-optimized!) FastMM/D2010.
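
A minimal sketch of that pre-realloc check, assuming the current block size is available from the header. The function name CanKeepBlock is hypothetical, and treating an unchanged size as "nothing to do" as well is an assumption added here.

```pascal
program PreReallocSketch;
{$APPTYPE CONSOLE}

// True when the existing block can simply be returned unchanged:
// the new size still fits and is larger than a quarter of the block,
// so shrinking would not free enough memory to be worth the work.
function CanKeepBlock(aCurrentSize, aNewSize: NativeUInt): Boolean;
begin
  Result := (aNewSize <= aCurrentSize) and
            (aNewSize > (aCurrentSize shr 2));
end;

begin
  WriteLn(CanKeepBlock(1000, 600));  // shrink a little: keep the block
  WriteLn(CanKeepBlock(1000, 250));  // shrink to 1/4 or less: real realloc
  WriteLn(CanKeepBlock(1000, 1001)); // grow: real realloc
end.
```

Because the test only needs the two sizes, it can run before the owner thread or the block type is looked up, which is where the expensive memory fetches happen.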

#138 2011-03-16 14:35:42

cstuffer
Member
Registered: 2010-07-21
Posts: 11

Hello,

It looks like ScaleMM2 is compatible with Delphi 2007 and up only.
At least, it does not compile with Delphi 7.

Btw,
which is better, Delphi 7 or Delphi 2007?

Carl

#139 2011-03-17 07:08:14

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

cstuffer wrote:

Hello,
It looks like ScaleMM2 is compatible with Delphi 2007 and up only.
At least, it does not compile with Delphi 7.
Carl

That is possible.
I only have D2010, so I made it primarily for that. But "ab" made some changes to make ScaleMM1 compatible with D7, something like changing "record" to "object", so you could try that first. What kind of compiler errors do you get then?

#140 2011-03-17 07:56:59

cstuffer
Member
Registered: 2010-07-21
Posts: 11

Hello,

Thank you for your reply :)

ScaleMM1 as well as SynScaleMM work fine with D7.

For ScaleMM2,
the incompatibilities are mainly about "how" type declarations are written.
They are not runtime errors; it just does not compile.

If you want, I can give you the list of errors generated in the IDE.

Carl

#141 2011-03-17 10:40:48

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

I fixed it (in a D7 portable edition :-) ) by changing "record" to "object"
(committed in SVN now)

Last edited by andrewdynamo (2011-03-17 13:18:48)

#142 2011-03-17 21:29:48

cstuffer
Member
Registered: 2010-07-21
Posts: 11

Hello,

Thank you :)

It compiles and works fine.

I notice that small, medium and large memory are supported.
Does this mean we can remove FastMM?

In any case, with FastMM as the first module, my application froze when I closed a form.
After removing FastMM, all is working fine.

Carl

* Edited 2011-03-18
* Just as a note, I am using it on a very big project.
* And, rarely and randomly, the program freezes with ScaleMM2.
* Not always at the same place.

Last edited by cstuffer (2011-03-18 12:37:18)

#143 2011-03-21 07:20:30

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

cstuffer wrote:

I notice that small, medium and large memory are supported.
Does this mean we can remove FastMM?

Yes and no.
It seems ScaleMM2 does not pass the validation tests of the FastCode benchmark:
https://forums.embarcadero.com/thread.j … 529#332529
so it should not be used in production yet.

cstuffer wrote:

* And, rarely and randomly, the program freezes with ScaleMM2.

Freeze as in 100% CPU (loop) or 0% CPU (deadlock)?

Can you make a minidump of your app (with debug info like a .map file)?
http://code.google.com/p/asmprofiler/so … nidump.pas

#144 2011-04-07 09:40:30

Yegor
Member
Registered: 2011-04-07
Posts: 2

Hi folks,

ab wrote:

2. string types and dynamic arrays just use the same LOCKed asm instruction everywhere, i.e. for every access which may lead into a write to the string.

I want to share my view on the problem with string types and dynamic arrays. I believe everybody understands that dynamic arrays and strings are NOT actually thread safe (see the example below, which leads to an AV) and that you have to synchronize access to variables of such types on your own. If so, Embarcadero just has to mention this in the documentation (strings and dynamic arrays are not thread safe) and remove the "LOCK" instruction from ref counters, as AB did in his workaround.

Am I missing something?

Regards,
Yegor

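uses
  SysUtils; // IntToStr

var
  s: string; // shared by both threads, deliberately unprotected

// Thread functions use the TThreadFunc signature expected by BeginThread.
function Writer1(p: Pointer): Integer;
var
  i: Integer;
begin
  for i := 1 to 1000000 do
    s := IntToStr(i);
  Result := 0;
end;

function Writer2(p: Pointer): Integer;
var
  i: Integer;
begin
  for i := 1 to 1000000 do
    s := s + IntToStr(i);
  Result := 0;
end;

var
  id1, id2: TThreadID;
begin
  // Two unsynchronized writers on the same string: this is expected to crash.
  BeginThread(nil, 0, @Writer1, nil, 0, id1);
  BeginThread(nil, 0, @Writer2, nil, 0, id2);
  ReadLn;
end.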

#145 2011-04-07 11:30:56

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,546

Reference-counted variables were never claimed to be 100% thread-safe for both reading and writing.

In short, reading is thread-safe (thanks to the LOCK asm used), but writing such variables from several threads at the same time will fail, as your sample code demonstrates.

See this reference post:

Yorai Aminov (TeamB) wrote:

From: "Yorai Aminov (TeamB)" <yaminov@nospam.trendline.co.il>
Subject: Re: Are Delphi Strings Thread Safe
Date: 18 Dec 1999 00:00:00 GMT
Newsgroups: borland.public.delphi.vcl.components.using

On Sat, 18 Dec 1999 05:21:26 +0100, "Kenneth Ellested"
<ke@jydsk-data.dk> wrote:

>I have always wondered how delphi dynamic strings are "implemented" by the
>compiler (in details). Will it be possible to have two threads operating on
>the same string ?

Yes. Generally speaking, Delphi 5 strings are thread safe, but
strings in earlier versions were not. This does not mean any action
performed on a string can be done without the proper synchronization,
though.

Write access to strings across threads should always be protected by a
critical section or a similar mechanism. This is true for any type,
not only strings. The problem with earlier versions of Delphi was that
reading a string from two threads at the same time was not safe. This
has been fixed in Delphi 5. You can find several lengthy discussions
of this in the .objectpascal group.

AFAIK this is the official behavior as described by Borland/Embarcadero.
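
A minimal sketch of the "protect write access with a critical section" advice from the quoted post, assuming Delphi on Windows with the standard Classes and SyncObjs units. The TWriter class and the global names are hypothetical; only one kind of writer is shown to keep the result predictable.

```pascal
program SafeStringWrite;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, SyncObjs;

var
  Shared: string;          // written by both threads
  Lock: TCriticalSection;  // guards every write to Shared

type
  TWriter = class(TThread)
  protected
    procedure Execute; override;
  end;

procedure TWriter.Execute;
var
  i: Integer;
begin
  for i := 1 to 100000 do
  begin
    Lock.Acquire;
    try
      // Only one thread at a time replaces the shared string.
      Shared := IntToStr(i);
    finally
      Lock.Release;
    end;
  end;
end;

var
  A, B: TWriter;
begin
  Lock := TCriticalSection.Create;
  try
    A := TWriter.Create(False);
    B := TWriter.Create(False);
    A.WaitFor;
    B.WaitFor;
    A.Free;
    B.Free;
    // Both threads finish by writing '100000', so that is what remains.
    WriteLn(Shared);
  finally
    Lock.Free;
  end;
end.
```

The lock is needed around writes regardless of the LOCK prefix on the reference count: the prefix makes the counter update atomic, not the replacement of the string pointer together with the release of the old value.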

#146 2011-06-17 05:16:39

TPrami
Member
Registered: 2010-07-06
Posts: 116

Hello Hello,

No talk about ScaleMM or SynScaleMM for some time now.

What is the status of each of the memory managers?

(I have not seen repository changes for ScaleMM for ages either.)

-Tee-

#147 2011-06-17 06:49:27

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

TPrami wrote:

No talk about ScaleMM or SynScaleMM for some time now.
What is the status of each of the memory managers?
(I have not seen repository changes for ScaleMM for ages either.)
-Tee-

Yes, that's true...

I've been busy for some time trying to fix the "inter-thread memory" problem, but I could not get it 100% thread-safe.
Also, the structure has too much overhead and is too complicated (so it is not easy to find bugs). Furthermore, I'm busy
with testing the next version of some piece of software.

But (as coincidence does not exist :-) ) I did some preliminary tests on ScaleMM3 yesterday evening. The results so
far seem good (twice as fast as the FastMM built into D2010). I am still struggling with interthread memory: I don't want locks
(it becomes twice as slow when I use InterlockedExchangeAdd, for example), but I also don't want to send a list of "other thread
memory" to the owner thread. So I'm thinking about some kind of "relaxed" scanning for memory freed by another thread
when no memory is available in a block and the block has an "interthread mark" (so 90% of the time it is fast and once in a while a bit slower: better than extensive/exact administration overhead EVERY time).

Some more background info: I use 4 KB pages with "no" header: minimal information for each page is stored in an array at the beginning of the 1 MB block. This should reduce memory fetches/cache invalidation for previous/next page checks (this made ScaleMM2 slow), because the array is one contiguous piece of memory (no big gaps between memory reads).
I'm also thinking about some kind of carousel of 8 memory blocks, so that 8 consecutive mem requests are spread out more to eliminate "false sharing": http://delphitools.info/2011/05/31/a-fi … tmonitors/
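
A minimal sketch of the "page info in an array at the start of the block" layout, not the ScaleMM3 code. It assumes every 1 MB block starts on a 1 MB boundary so that the block and page can be derived from an address with masks and shifts; that alignment, the TPageInfo fields and the function names are all assumptions for illustration.

```pascal
program PageTableSketch;
{$APPTYPE CONSOLE}

const
  BlockSize     = 1024 * 1024;            // one 1 MB block
  PageShift     = 12;                     // 4 KB pages
  PagesPerBlock = BlockSize shr PageShift; // 256 pages

type
  // Hypothetical minimal per-page information.
  TPageInfo = packed record
    ItemSize: Word;  // size class served by this page
    FreeCount: Word; // free items left in this page
  end;
  // Stored once at the start of the block instead of in every page.
  TPageTable = array[0..PagesPerBlock - 1] of TPageInfo;

// Start address of the block that contains aAddr (assumes 1 MB alignment).
function BlockBase(aAddr: NativeUInt): NativeUInt;
begin
  Result := aAddr and not NativeUInt(BlockSize - 1);
end;

// Index into the page table for the page that contains aAddr.
function PageIndex(aAddr: NativeUInt): Integer;
begin
  Result := Integer((aAddr and NativeUInt(BlockSize - 1)) shr PageShift);
end;

var
  Addr: NativeUInt;
begin
  Addr := $00345678;
  WriteLn(BlockBase(Addr)); // start of the block
  WriteLn(PageIndex(Addr)); // page number inside the block
  // The whole table is small enough to sit in a few cache lines' worth
  // of contiguous memory at the start of the block.
  WriteLn(SizeOf(TPageTable));
end.
```

Checking the previous or next page then means reading a neighbouring 4-byte entry of the same array rather than touching a header 4 KB away, which is the locality argument made above.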

#148 2011-06-17 14:28:58

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,546

Is this new approach not too complex?

#149 2011-06-17 19:00:51

andrewdynamo
Member
Registered: 2010-11-18
Posts: 65

ab wrote:

Is this new approach not too complex?

No, in fact it is much easier/simpler than ScaleMM2 (I did not like the complexity of ScaleMM2)

#150 2011-09-22 11:06:04

TPrami
Member
Registered: 2010-07-06
Posts: 116

andrewdynamo wrote:

No, in fact it is much easier/simpler than ScaleMM2 (I did not like the complexity of ScaleMM2)

When can we test drive it? ;)

.-Tee-.
