You are not logged in.
Pages: 1
If subtraction is used alternatively (or in a some more advanced manner),
and that sum too used in the result, then your Hash32() would have
even less collisions? May be also a cumulative sum of s3 would be of use?
s3 := 0;
...
inc(s1,P^[0]);
inc(s3, P^[0]);
inc(s2,s1);
inc(s1,P^[1]);
dec(s3,P^[1]);
inc(s2,s1);
inc(s1,P^[2]);
inc(s3,P^[2]);
inc(s2,s1);
inc(s1,P^[3]);
dec(s3,P^[3]);
inc(s2,s1);
inc(PtrUInt(P),16);
In http://burtleburtle.net/bob/c/checksum.c , the author (Bob Jenkins)
returns 8 integers as checksum in hash(), so very less chance for collisions?.
Likewise, in your Hash32 (may be you have a better name like BouchezHash()),
if you return multiple integers (say s2,s3) , may be it becomes even less in collisions.
(I don't know why you would mix s2 and s1, losing the fine value of s2 in the result)
May I know the significance of "result := s1 xor (s2 shl 16);"
I think s2 is the one that avoids the collions more (as far as s1, s1 would be same
even if all the bytes were swapped in any combination). So, why not use s2 alone?
JFYI:
Bob Jenkins home page: http://burtleburtle.net/bob/
He also has a hash called lookup3 (http://burtleburtle.net/bob/c/lookup3.c),
64bit: http://burtleburtle.net/bob/c/lookup8.c
which he claims faster than crc32 (http://burtleburtle.net/bob/hash/examhash.html)
and claims 0 collisions.
His ISAAC has been ported in many languages including Pascal/Delphi
(http://home.netsurf.de/wolfgang.ehrhardt/misc_en.html) by
Wolfgang Ehrhardt another Delphi/Speed expert like you
(http://home.netsurf.de/wolfgang.ehrhardt/)
Offline
You've a good panel of algorithms at http://www.strchr.com/hash_functions
crc32 is the fastest and with less collisions (in the BYFOUR version) and its code is short (when you have the lookup tables).
If available, I'll use it instead in my code.
And I've put a kr32 function, which has much less collisions, as default hashing function for TDynArrayHashed.
Our hash32 is simple a DWORD version of the Adler32 algorithm.
I didn't want to reinvent the wheel, just have some hash at hand.
Of course, hash32 may be improved, but it's fairly basic.
Using a sub won't change much the colision rate.
We could get rid of the "mod" at the end of the calculation. But it was there in the Adler32 version...
Offline
Oh no! I shouldn't have underestimated your amount of research
Offline
crc32 is the fastest and with less collisions (in the BYFOUR version)
Do you know http://www.team5150.com/~andrew/noncryptohashzoo/? It seems CRC32 isn't the fastest and has also some issues (Avalanche, Seed Avalanche and Bit Independence Criterion Tests). But the "CrapWOW" Hash seems pretty good... MurMur3 isn't bad too.
Too bad, they didn't tested the Google Hash (dense_hash_map) from http://code.google.com/p/google-sparsehash
Greets
knight_killer
Offline
The source code used for testing crc32 was not the latest version included in the zlib.
For instance, the one included in our SynZip library is hand-tuned asm using 8 tables.
I'm not sure this "CrapWow" hashing is so quick by design.
It uses a multiplication for each DWORD, so it will probably slower than an optimized per-8 crc32 with precomputed tables or a good plain K&R's. This hash algo is just another variation of multiplicative hash function + shift. No new idea within.
Furthermore, I suspect the code is not re-entrant (whereas crc32 or adler32 can be called once or per chunk).
The supplied asm code is far from optimized itself (there is no good pipelining work within).
I prefer the http://www.strchr.com/hash_functions page for comparison.
I do not know dense_hash_map, but perhaps the "sparse table" trick does wonders with CPU cache size.
Offline
Offline
AB, do you see this string hashing algorithm - i'ts a 2012 year algorithm winner http://arxiv.org/abs/1202.4961 May be used in mORMot...
Offline
It should have less collisions than Hash32, but is more complex, also certainly slower.
They require a modern CPU.
I suspect they will be difficult to be implemented in non binary languages, like JavaScript.
Whereas I already have CRC32 + SHA256 in Smart/JavaScript.
See http://blog.synopse.info/post/2012/04/1 … JavaScript
Interresting, but perhaps not worth it now.
Offline
Pages: 1