Software optimization resources [message #30511] |
Thu, 06 January 2011 04:03 |
Novo
http://agner.org/optimize/
This web site contains five manuals that describe everything you need to know about optimizing code for the x86 and x86-64 families of microprocessors, including optimization advice for C++ and assembly language, details about the microarchitecture and instruction timings of Intel, AMD and VIA processors, and details about different compilers and calling conventions.
If you liked Bit Twiddling Hacks, you might like these manuals.
Regards,
Novo
[Updated on: Thu, 06 January 2011 06:22]
Re: Software optimization resources [message #30523 is a reply to message #30511] |
Thu, 06 January 2011 13:33 |
Thanks Novo, it looks quite interesting.
The assembler library especially caught my attention. It replaces some common C functions (mem{cpy,move,set}, str{cat,cpy,len,cmp}) with assembler versions that use advanced instruction sets. Out of curiosity I ran the benchmark code from the NTL/STL comparison page. But to my great disappointment, there was no recognizable gain in speed. If someone is interested enough to try to figure out why, I would be very interested.
But anyway, the manuals look good.
Best regards,
Honza
Re: Software optimization resources [message #30535 is a reply to message #30530] |
Fri, 07 January 2011 11:01 |
Didier wrote on Thu, 06 January 2011 22:52 |
Hi Novo,
I haven't tried the samples you are talking about, but for special assembler functions (SSE or whatever) to work, the memory has to be aligned on 4 bytes or 64 bytes or 128 bytes, or maybe more.
The alignment depends on the assembler instructions used.
If the memory is not aligned correctly, you get either bad results or just poor execution timings.
Maybe this is what happens in your case.
|
Hi Didier,
I believe you are replying to my post, even though I'm not Novo.
The asmlib functions take alignment into consideration. Internally they first handle the unaligned part using classic instructions and then process the rest using SSE or whatever else is available.
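To make the "unaligned part first, then the rest" idea concrete, here is a minimal portable sketch of a fill routine split into head, body and tail. It is not asmlib's code: `fill_sketch`, `FillStats` and the 16-byte block size are names and values assumed for illustration, and a fixed-size `std::memcpy` stands in for the SSE store a real implementation would use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical illustration, not part of asmlib.
constexpr std::size_t kBlock = 16;  // assumed block size (one SSE register)

struct FillStats {
    std::size_t head;    // bytes written one by one before the first aligned address
    std::size_t blocks;  // number of whole 16-byte blocks written
    std::size_t tail;    // bytes written one by one after the last whole block
};

inline FillStats fill_sketch(void* dst, unsigned char value, std::size_t n) {
    unsigned char* p = static_cast<unsigned char*>(dst);
    FillStats stats{0, 0, 0};

    // Head: plain byte stores until p sits on a 16-byte boundary (or n runs out).
    const std::size_t misalign = reinterpret_cast<std::uintptr_t>(p) % kBlock;
    std::size_t head = misalign ? kBlock - misalign : 0;
    if (head > n) head = n;
    for (std::size_t i = 0; i < head; ++i) *p++ = value;
    n -= head;
    stats.head = head;

    // Body: p is aligned now, so whole blocks can be written at once.
    // A real implementation would use aligned vector stores here.
    unsigned char pattern[kBlock];
    std::memset(pattern, value, kBlock);
    while (n >= kBlock) {
        assert(reinterpret_cast<std::uintptr_t>(p) % kBlock == 0);
        std::memcpy(p, pattern, kBlock);
        p += kBlock;
        n -= kBlock;
        ++stats.blocks;
    }

    // Tail: whatever is left is shorter than one block.
    stats.tail = n;
    for (std::size_t i = 0; i < n; ++i) *p++ = value;
    return stats;
}
```

Calling `fill_sketch(buf + 3, 0xAB, 40)` on a 16-byte-aligned `buf` would write 13 head bytes, one 16-byte block and 11 tail bytes, so the caller never has to align the pointer itself.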
After some more thinking, I believe the real reason there was no noticeable change was a badly chosen benchmark. The majority of the time was probably spent in functions other than memory and string handling. I will try again with better constructed test code.
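This reasoning is essentially Amdahl's law: if a fraction p of the run time is spent in the replaced functions and they get s times faster, the whole program speeds up by 1 / ((1 - p) + p / s). A small sketch, where `overall_speedup` and `seconds_for` are hypothetical helper names and the numbers in the note below are made up for illustration, not measured:

```cpp
#include <cassert>
#include <chrono>

// Amdahl's law: p = fraction of run time spent in the optimized functions,
// s = speedup factor of those functions alone.
inline double overall_speedup(double p, double s) {
    assert(p >= 0.0 && p <= 1.0 && s > 0.0);
    return 1.0 / ((1.0 - p) + p / s);
}

// Hypothetical timing helper: runs work() reps times and returns the
// elapsed wall-clock time in seconds. The caller has to make sure the
// compiler cannot optimize the work away (for example by using its result).
template <class F>
double seconds_for(F&& work, int reps) {
    const auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) work();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}
```

For example, `overall_speedup(0.05, 2.0)` is about 1.026: if only 5% of the benchmark's time goes to memcpy and friends, doubling their speed changes the total by under 3%, which is easily lost in measurement noise. Timing the memory and string calls on their own with something like `seconds_for` would show whether the library helps at all.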
Honza