Re: Changes in hashing [message #54177 is a reply to message #54110]
Fri, 05 June 2020 12:28
Oblivion
Hello Mirek,
I'm somewhat confused about this.
Just to be clear: does this mean that client code can continue to use the dword GetHashValue() variant on 64-bit machines?
I'm worried because, e.g., in the Terminal ctrl I use a dword hash of the incoming string data as a unique ID for images and hyperlinks in each cell.
Changing it to 64-bit would be very expensive (memory consumption would be significantly higher).
Truncating 64-bit to 32-bit probably won't do much harm here, but it still means information loss, and I'd prefer to avoid that.
Or do I have to separately maintain the 32-bit hash functions on 64-bit configurations?
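To make the memory concern concrete, here is a rough sketch of how the ID width affects per-cell size. The two cell layouts below are hypothetical (the real Terminal ctrl cell is not shown in this thread), as is the `buffer_bytes` helper; exact sizes depend on the compiler's padding rules.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical cell layouts, for illustration only.
struct CellWith32BitIds {
    std::uint32_t chr;       // character code
    std::uint32_t image_id;  // 32-bit hash used as image ID
    std::uint32_t link_id;   // 32-bit hash used as hyperlink ID
};

struct CellWith64BitIds {
    std::uint32_t chr;       // character code
    std::uint64_t image_id;  // 64-bit hash used as image ID
    std::uint64_t link_id;   // 64-bit hash used as hyperlink ID
};

// Total bytes needed for a rows x cols grid of cells of the given size.
inline std::size_t buffer_bytes(std::size_t cell_size, std::size_t rows, std::size_t cols)
{
    return cell_size * rows * cols;
}
```

On common 64-bit ABIs the second layout is typically about twice the size of the first because of the wider fields and alignment padding, and that factor applies to every cell kept in the scrollback buffer.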
Best regards,
Oblivion
Github page: https://github.com/ismail-yilmaz
upp-components: https://github.com/ismail-yilmaz/upp-components
Bobcat the terminal emulator: https://github.com/ismail-yilmaz/Bobcat
Re: Changes in hashing [message #54178 is a reply to message #54177]
Fri, 05 June 2020 12:40
mirek
Well, first of all, using GetHashValue as a unique ID is probably not a good idea, but I guess you mean something slightly different there.
Second, simply truncating to 32 bits would indeed produce inferior hashes for memhash-produced numbers, just like taking the lower bits did in the previous 32-bit incarnation, because fast hashing algorithms tend to accumulate entropy in the highest bits. But exactly for this reason we have had (for quite a long time) the FoldHash function, which IMO seems ideal for your scenario. It takes a hash_t and produces a dword, while bringing entropy back to the lowest bits. (BTW, look at the implementation; I think it is one of my more clever ideas...)
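As a rough illustration of the folding idea (not the actual U++ FoldHash code, which should be checked in the Core sources), one way to bring high-bit entropy down into a 32-bit result is to multiply by an odd 64-bit constant, reverse the byte order, and then truncate. The names `fold_hash_sketch`, `truncate_hash`, `byte_swap64` and the constant `kMixConstant` are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mixing constant, NOT taken from the U++ sources.
// Any odd multiplier keeps the 64-bit multiplication a bijection.
static const std::uint64_t kMixConstant = 0x9E3779B97F4A7C15ULL;

// Reverse the byte order of a 64-bit value (portable, no intrinsics).
inline std::uint64_t byte_swap64(std::uint64_t v)
{
    std::uint64_t r = 0;
    for (int i = 0; i < 8; ++i) {
        r = (r << 8) | (v & 0xFF);
        v >>= 8;
    }
    return r;
}

// Sketch of folding: the multiplication pushes entropy toward the high
// bits, the byte swap moves those high bytes into the low 32 bits, and
// the cast keeps only that low half.
inline std::uint32_t fold_hash_sketch(std::uint64_t h)
{
    return static_cast<std::uint32_t>(byte_swap64(h * kMixConstant));
}

// Plain truncation, for comparison: everything above bit 31 is discarded.
inline std::uint32_t truncate_hash(std::uint64_t h)
{
    return static_cast<std::uint32_t>(h);
}
```

With this sketch, two 64-bit hashes that differ only in their top bits collide under `truncate_hash` but still map to different values under `fold_hash_sketch`, which is the property that matters when the hash carries most of its entropy in the upper half.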
Mirek
Re: Changes in hashing [message #54179 is a reply to message #54178]
Fri, 05 June 2020 13:02
Oblivion
Quote: first of all, using GetHashValue as a unique ID is probably not a good idea, but I guess you mean something slightly different there.
Yeah, the IDs are not "really" meant to be unique. Bad wording. They are used in cache management, and a dword is sufficient in my case.
Quote: But exactly for this reason we have had (for quite a long time) the FoldHash function, which IMO seems ideal for your scenario. It takes a hash_t and produces a dword, while bringing entropy back to the lowest bits.
Well, I did not know that, because, you know, lack of documentation... But this is good news. I'll test with FoldHash and look into the code ASAP.
Quote: (BTW, look at the implementation; I think it is one of my more clever ideas...)
You have a lot of clever ideas. That is the main reason why I prefer U++.
Thank you!
Best regards,
Oblivion