Overview
Examples
Screenshots
Comparisons
Applications
Download
Documentation
Tutorials
Bazaar
Status & Roadmap
FAQ
Authors & License
Forums
Funding Ultimate++
Search on this site
Search in forums












SourceForge.net Logo
Home » Community » Coffee corner » Map implementation
Map implementation [message #51398] Thu, 21 March 2019 16:32 Go to next message
cbpporter is currently offline  cbpporter
Messages: 1400
Registered: September 2007
Senior Contributor
I have implemented plenty of stuff since I'm a programmer, including many containers and even multiple small GUI toolkits (not as good as U++ of course Laughing) but I never implemented a hash maps, let alone one that is accessibly as an array like in U++.

Can I shamelessly dissect and steal VectorMap? Razz

I'm currently studying HashBase, trying to figure out how it works and how the masking process works. Then Index...

Or are you aware of some other easy to learn and re-implement version of hash maps out there that will also perform more than adequately?

And how would you compare the U++ version to a more "standard" one?

Thanks!
Re: Map implementation [message #51403 is a reply to message #51398] Fri, 22 March 2019 07:18 Go to previous messageGo to next message
mirek is currently offline  mirek
Messages: 12098
Registered: November 2005
Ultimate Member
cbpporter wrote on Thu, 21 March 2019 16:32
I have implemented plenty of stuff since I'm a programmer, including many containers and even multiple small GUI toolkits (not as good as U++ of course Laughing) but I never implemented a hash maps, let alone one that is accessibly as an array like in U++.

Can I shamelessly dissect and steal VectorMap? Razz


It is opensource, is not it? And I have not applied for any patents.... Smile

Quote:

Or are you aware of some other easy to learn and re-implement version of hash maps out there that will also perform more than adequately?

And how would you compare the U++ version to a more "standard" one?


Compared to standard one, it is "alien technology" Smile

Mirek
Re: Map implementation [message #51456 is a reply to message #51398] Wed, 27 March 2019 03:33 Go to previous messageGo to next message
Novo is currently offline  Novo
Messages: 890
Registered: December 2006
Experienced Contributor
cbpporter wrote on Thu, 21 March 2019 11:32

Or are you aware of some other easy to learn and re-implement version of hash maps out there that will also perform more than adequately?

Useful links:
https://en.wikipedia.org/wiki/Open_addressing
https://probablydance.com/2017/02/26/i-wrote-the-fastest-has htable/
http://www.idryman.org/blog/2017/05/03/writing-a-damn-fast-h ash-table-with-tiny-memory-footprints/
https://aras-p.info/blog/2016/08/02/Hash-Functions-all-the-w ay-down/
https://preshing.com/20160201/new-concurrent-hash-maps-for-c pp/
http://szelei.me/constexpr-murmurhash/
https://opensource.googleblog.com/2014/03/introducing-farmha sh.html

Hope this helps. Smile


Regards,
Novo
Re: Map implementation [message #51487 is a reply to message #51456] Tue, 02 April 2019 14:13 Go to previous messageGo to next message
cbpporter is currently offline  cbpporter
Messages: 1400
Registered: September 2007
Senior Contributor
Thank you very much!

There is a lot to go through...
Re: Map implementation [message #51488 is a reply to message #51487] Tue, 02 April 2019 17:48 Go to previous messageGo to next message
Novo is currently offline  Novo
Messages: 890
Registered: December 2006
Experienced Contributor
cbpporter wrote on Tue, 02 April 2019 08:13

There is a lot to go through...

This is actually not that hard. All aspects of the hash table design are very well described in "I Wrote The Fastest Hashtable". You just need to choose your design. The easiest way to start with is "Powers of Two", linear probing, internal chaining. "internal chaining" means that you explicitly store a pointer to the next element in a chain, and "linear probing" means that you are using linear memory scan to find first available slot.

Ideally, your hash table should be completely policy-based, so you can easily replace linear probing with quadratic probing, for example.

It is not hard to implement a table having one fixed design. What is really hard is to make it completely policy-based.


Regards,
Novo
Re: Map implementation [message #51489 is a reply to message #51488] Tue, 02 April 2019 18:39 Go to previous messageGo to next message
Novo is currently offline  Novo
Messages: 890
Registered: December 2006
Experienced Contributor
Actually, the right name for "internal chaining" seems to be "Coalesced hashing".

Regards,
Novo
Re: Map implementation [message #51519 is a reply to message #51489] Mon, 08 April 2019 13:57 Go to previous messageGo to next message
cbpporter is currently offline  cbpporter
Messages: 1400
Registered: September 2007
Senior Contributor
I'm dissecting Index and noticed that Reindex calls ClearIndex and Free, while ClearIndex calls Free Smile.
Re: Map implementation [message #51521 is a reply to message #51519] Mon, 08 April 2019 22:37 Go to previous messageGo to next message
mirek is currently offline  mirek
Messages: 12098
Registered: November 2005
Ultimate Member
Yeah, that is uncessary.

Anyway, Index will probably get overhaul soon. I plan to drop unused features and slightly change the semantics to gain more performance and cleaner API and operation.

Mirek

[Updated on: Tue, 09 April 2019 09:29]

Report message to a moderator

Re: Map implementation [message #51522 is a reply to message #51521] Tue, 09 April 2019 14:57 Go to previous messageGo to next message
cbpporter is currently offline  cbpporter
Messages: 1400
Registered: September 2007
Senior Contributor
I re-implemented Index (copy&paste, trancode and cleanup) and should be a good starting point since it is smaller and simpler than a lot of maps out there.

I'll test mine thoroughly and benchmark, but before I benchmarked U++ vs. set vs. multiset, just to set my expectations accordingly.

U++ is pretty fast when compared to STL, but it does have exponential growth. With many items, in some tests, when multiplying items by 10, STL will go from like 250ms to 320, while Index will go from 250ms to 2 minutes Smile.

I used this simple test:
	const int SEQ = 10;
	const int MAX = 3000000 * SEQ;
	const int JUMP = 100;
	
	{
		Index<int> ind;
		
		StopWatch ts;
		
		for (int j = 0; j < MAX; j += JUMP)
			for (int i = j; i < j + SEQ; i++) {
				ind.Add(i);
				//ind.Debug(Cout());
			}
		
		Cout() << "U++ add " << ts.Elapsed() << "\n";
		
		ts.Reset();
		
		int count = 0;
		for (int i = 0; i < MAX; i++)
			if (ind.Find(i) != -1)
				count++;
		
		Cout() << "U++ find " << ts.Elapsed() << "\n";
		Cout() << count << "\n";
	}
	
	{
		std::set<int> ind;
		
		StopWatch ts;
		
		for (int j = 0; j < MAX; j += JUMP)
			for (int i = j; i < j + SEQ; i++) {
				ind.insert(i);
				//ind.Debug(Cout());
			}
		
		Cout() << "set add " << ts.Elapsed() << "\n";
		
		ts.Reset();
		
		int count = 0;
		for (int i = 0; i < MAX; i++)
			if (ind.find(i) != ind.end())
				count++;
		
		Cout() << "set find " << ts.Elapsed() << "\n";
		Cout() << count << "\n";
	}
	
	{
		std::multiset<int> ind;
		
		StopWatch ts;
		
		
		for (int j = 0; j < MAX; j += JUMP)
			for (int i = j; i < j + SEQ; i++) {
				ind.insert(i);
				//ind.Debug(Cout());
			}
		
		Cout() << "multiset add " << ts.Elapsed() << "\n";
		
		ts.Reset();
		
		int count = 0;
		for (int i = 0; i < MAX; i++)
			if (ind.find(i) != ind.end())
				count++;
		
		Cout() << "multiset find " << ts.Elapsed() << "\n";
		Cout() << count << "\n";
	}


Is this an OK first artificial benchmark?

For MAX as large as the example, this is the starting point where U++ begins to loose vs. STL.

Next I need to compare memory usage...

Edit: I broke the benchmark into add and find, and the exponential growth for many items is at the find phase.

[Updated on: Tue, 09 April 2019 15:03]

Report message to a moderator

Re: Map implementation [message #51523 is a reply to message #51522] Tue, 09 April 2019 17:16 Go to previous messageGo to next message
mirek is currently offline  mirek
Messages: 12098
Registered: November 2005
Ultimate Member
You have accidentally succeeded in attack on used hash mechanism - what you see is result of hash collisions. This is sad and something to deal with, however this would be much harder to achieve with String and quite unlikely to happen with real (aka random) data.

Still it is something I am worried about and will fix properly in the next version. For now, try to replace


inline dword FoldHash(dword h)
{
	return h - 362437 * SwapEndian32(h);
}

inline int& HashBase::Maph(unsigned _hash) const
{
	unsigned h = _hash & ~UNSIGNED_HIBIT;
	return map[mask & FoldHash(h)];
}



(In future, I will make '362437' number random prime, which will likely make this kind of attack impossible).

Mirek

EDIT: Forgot the change in Map.h

[Updated on: Wed, 10 April 2019 00:21]

Report message to a moderator

Re: Map implementation [message #51524 is a reply to message #51523] Wed, 10 April 2019 11:07 Go to previous messageGo to next message
mirek is currently offline  mirek
Messages: 12098
Registered: November 2005
Ultimate Member
Update: Upon further investigation... The proposed change is not perfect. Better solution:


inline dword FoldHash(dword h)
{
	return SwapEndian32(2833151717 * h);
}



Interestingly, in this particular case it runs slower. It took me a couple of hours to figure out why: It is cache issue. This much better FoldHash actually spreads hashes nicely through hash space (thus causing cache misses), while the previous one tended to put them close (cache hits). It had accidentally fixed this particular benchmark's collisions, but would probably fail in some other scenario.

Conclusion: At this number of elements, the benchmark is memory bound (for int), so well working hasmap will perform similary to std::set (actually, I think the benchmark favors std::set here, as sequential numbers will repeat the same path in the binary tree, which is more cache friendly).

Original FoldHash was too simplistic for ink keys, this new version should be harder to attack.

In future version of Index, I will try to randomize FoldHash (and other hashing ops).

Mirek
Re: Map implementation [message #51525 is a reply to message #51524] Wed, 10 April 2019 11:39 Go to previous messageGo to next message
cbpporter is currently offline  cbpporter
Messages: 1400
Registered: September 2007
Senior Contributor
I tested the "h - 362437 * SwapEndian32(h);" version and so far it seems to fix the collision issue and also regularly blows std::set out of the water. Needs more testing to cover enough cases. I also talked with a colleague and his hashmap version is supposedly 3x+ faster than stl, so it looks like it is very possible to greatly outperform it.

I will test the 2833151717 version too.

And with other types, like points and strings.

Anyway, Index has been highly educational. A lot of resources out there are either very basic, talking more about the principles of managing buckets or are about taking something that works very well and squeezing the last bits of performance out of it.

I'm still not experienced enough to tell how well things should be distributed when "map" inside HashBase grows, but I added plenty of debug methods...

Re: Map implementation [message #51526 is a reply to message #51524] Wed, 10 April 2019 16:13 Go to previous messageGo to next message
Novo is currently offline  Novo
Messages: 890
Registered: December 2006
Experienced Contributor
mirek wrote on Wed, 10 April 2019 05:07
Update: Upon further investigation... The proposed change is not perfect. Better solution:


inline dword FoldHash(dword h)
{
	return SwapEndian32(2833151717 * h);
}




Clang. Cpp14.

uppsrc/Core/Ops.h:289:9: error: call to 'SwapEndian32' is ambiguous
        return SwapEndian32(2833151717 * h);
               ^~~~~~~~~~~~
/home/ssg/dvlp/cpp/upp/git/uppsrc/Core/Ops.h:44:15: note: candidate function
inline dword  SwapEndian32(dword v)   { __asm__("bswap %0" : "=r" (v) : "0" (v)); return v; }
              ^
/home/ssg/dvlp/cpp/upp/git/uppsrc/Core/Ops.h:45:15: note: candidate function
inline int    SwapEndian32(int v)     { __asm__("bswap %0" : "=r" (v) : "0" (v)); return v; }
              ^
1 error generated.


Regards,
Novo
Re: Map implementation [message #51527 is a reply to message #51522] Wed, 10 April 2019 16:48 Go to previous messageGo to next message
Novo is currently offline  Novo
Messages: 890
Registered: December 2006
Experienced Contributor
cbpporter wrote on Tue, 09 April 2019 08:57
Is this an OK first artificial benchmark?

More scenarios for testing:

* Fill: insert a randomly shuffled sequence of unique keys into the table.
* Presized fill: like Fill, but first reserve enough memory for all the keys we'll insert, to prevent rehashing and reallocing during the fill process.
* Lookup: perform 100K lookups of random keys, all of which are in the table.
* Failed lookup: perform 100K lookups of random keys, none of which are in the table.
* Remove: remove a randomly chosen half of the elements from a table.
* Destruct: destroy a table and free its memory.



Regards,
Novo
Re: Map implementation [message #51528 is a reply to message #51525] Wed, 10 April 2019 17:09 Go to previous messageGo to next message
Novo is currently offline  Novo
Messages: 890
Registered: December 2006
Experienced Contributor
cbpporter wrote on Wed, 10 April 2019 05:39
I also talked with a colleague and his hashmap version is supposedly 3x+ faster than stl, so it looks like it is very possible to greatly outperform it.

IMHO, it is impossible to create one ideal hash table which will greatly outperform STL in all possible scenarios.
Let's take a look at two scenarios.

1. One million hash tables containing one hundred records.
2. One hash table containing one hundred million records.

You will need two completely different implementations in these cases.

Two more scenarios.

1. Add data once and search for data most of the time.
2. Add/remove data most of the time and search for it occasionally.

Again, you will need two completely different implementations.



Regards,
Novo
Re: Map implementation [message #51529 is a reply to message #51528] Wed, 10 April 2019 17:37 Go to previous messageGo to next message
mirek is currently offline  mirek
Messages: 12098
Registered: November 2005
Ultimate Member
Novo wrote on Wed, 10 April 2019 17:09
cbpporter wrote on Wed, 10 April 2019 05:39
I also talked with a colleague and his hashmap version is supposedly 3x+ faster than stl, so it looks like it is very possible to greatly outperform it.

IMHO, it is impossible to create one ideal hash table which will greatly outperform STL in all possible scenarios.



That is probably true, but mostly because at some point the limiting factor becomes cache / memory performance.

Quote:

Let's take a look at two scenarios.

1. One million hash tables containing one hundred records.
2. One hash table containing one hundred million records.

You will need two completely different implementations in these cases.


That might be true, however I do not see a way how to improve Index for either (I see some accumulated knowledge how to improve it for both, but that is another story).

Quote:

1. Add data once and search for data most of the time.
2. Add/remove data most of the time and search for it occasionally.


Ditto.

About the only thing that is in question is how to deal with collisions. Some advanced hashmaps might e.g. use binary trees to resolve collisions. I believe that it is not an overal gain (and the fact that it is not widely used in industry makes it likely) and that we should rather invest time to investigate proper hashing techniques.

Anyway, the real benchmark would be to create real world scenario and test there. IMO U++ Index/VectorMap wins that easily.

Mirek
Re: Map implementation [message #51530 is a reply to message #51526] Wed, 10 April 2019 17:50 Go to previous messageGo to next message
mirek is currently offline  mirek
Messages: 12098
Registered: November 2005
Ultimate Member
[quote title=Novo wrote on Wed, 10 April 2019 16:13]mirek wrote on Wed, 10 April 2019 05:07
Update: Upon further investigation... The proposed change is not perfect. Better solution:


inline dword FoldHash(dword h)
{
	return SwapEndian32(2833151717 * h);
}




OK, seems I have run out of dword range accidentally here... Smile

inline dword FoldHash(dword h)
{
	return SwapEndian32(0xa3613c16 * h);
}
Re: Map implementation [message #51546 is a reply to message #51529] Sat, 13 April 2019 20:58 Go to previous messageGo to next message
Novo is currently offline  Novo
Messages: 890
Registered: December 2006
Experienced Contributor
mirek wrote on Wed, 10 April 2019 11:37

That is probably true, but mostly because at some point the limiting factor becomes cache / memory performance.

There are ways to deal with this. From the top of my head: Cache-Conscious Data Structures, Cache-oblivious algorithms.


Regards,
Novo
Re: Map implementation [message #51547 is a reply to message #51529] Sat, 13 April 2019 21:07 Go to previous messageGo to next message
Novo is currently offline  Novo
Messages: 890
Registered: December 2006
Experienced Contributor
mirek wrote on Wed, 10 April 2019 11:37

Quote:

Let's take a look at two scenarios.

1. One million hash tables containing one hundred records.
2. One hash table containing one hundred million records.

You will need two completely different implementations in these cases.


That might be true, however I do not see a way how to improve Index for either (I see some accumulated knowledge how to improve it for both, but that is another story).

IMHO, this is impossible because in the first case your main concern is the memory usage. Tiny overhead multiplied by million is a huge problem. And in the second case performance is the main issue.


Regards,
Novo
Re: Map implementation [message #51548 is a reply to message #51529] Sat, 13 April 2019 21:25 Go to previous messageGo to previous message
Novo is currently offline  Novo
Messages: 890
Registered: December 2006
Experienced Contributor
mirek wrote on Wed, 10 April 2019 11:37
About the only thing that is in question is how to deal with collisions. Some advanced hashmaps might e.g. use binary trees to resolve collisions. I believe that it is not an overal gain (and the fact that it is not widely used in industry makes it likely) and that we should rather invest time to investigate proper hashing techniques.

Anyway, the real benchmark would be to create real world scenario and test there. IMO U++ Index/VectorMap wins that easily.

AFAIK, STL can't use open addressing or other such techniques because it is specified to maintain stable key/value addresses. U++ Index doesn't support that. This is why it is possible to make it faster.

About collisions. There are many ways to deal with them. A classic approach is to store chain either explicitly (a list) or implicitly (you know how to calculate address of the next colliding slot). I saw a paper from Microsoft recommending an overflow area ...

Basically, what I want to say is that there is a million of different ways to design a hash table.
IMHO, the best way is to split it into multiple policies, so you can easily replace one part of it without rewriting the whole data structure. Smile
I personally couldn't figure out how to do that. Smile


Regards,
Novo
Previous Topic: [Skylark][witz] How would you do this
Next Topic: Strange Architecture
Goto Forum:
  


Current Time: Thu Nov 14 23:20:44 CET 2019

Total time taken to generate the page: 0.01391 seconds