About recent memset optimization [message #60314]
Fri, 17 November 2023 21:24
Tom1
Hi,
I noticed that BufferPainter::ClearOp() has slowed down greatly due to recent memset changes.
Time to clear a UHD/4K screen-sized ImageBuffer to white:
Revision 17045 : 740 us
Current rev. ST : 1990 us
Current rev. MT : 1880 us
It turned out that these changes in Mem.cpp cause the issue:
#if 0 // streaming does not seem to be benefical anymore *** HERE ***
#ifdef CPU_SSE2
	if(len >= 1024*1024 && 0) { // for really huge data, bypass the cache *** HERE *** && 0
		auto Set4S = [&](int at) { data.Stream(t + at); };
		while(len >= 64) {
			Set4S(0*16); Set4S(1*16); Set4S(2*16); Set4S(3*16);
			t += 64;
			len -= 64;
		}
		_mm_sfence();
		e = t - 1;
	}
#endif
#endif *** HERE ***
So both the "&& 0" and the "#if 0" disable the streaming path, and that causes the loss of speed.
I have "Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz 3.60 GHz" here. I do not know whether the effect is similar on other platforms, but on this machine streaming with ST is clearly the way to go for fast ImageBuffer clears.
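For anyone who wants to try the effect outside U++, here is a minimal standalone sketch of the same idea: filling a large pixel buffer with non-temporal (cache-bypassing) SSE2 stores, 64 bytes per iteration, followed by a store fence. This is not the Mem.cpp code. The function name StreamFill32 and its signature are hypothetical, and it assumes an x86 target with SSE2 and a destination aligned to at least 4 bytes.

```cpp
#include <emmintrin.h> // SSE2 intrinsics: _mm_set1_epi32, _mm_stream_si128, _mm_sfence
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of U++): fill `count` 32-bit pixels at `dst`
// with `value`, using streaming stores for the 16-byte-aligned bulk.
// Assumes `dst` is at least 4-byte aligned, as any valid uint32_t pointer is.
void StreamFill32(std::uint32_t *dst, std::uint32_t value, std::size_t count)
{
	// Scalar head: advance until dst is 16-byte aligned,
	// because _mm_stream_si128 requires an aligned address.
	while(count > 0 && (reinterpret_cast<std::uintptr_t>(dst) & 15) != 0) {
		*dst++ = value;
		count--;
	}

	const __m128i v = _mm_set1_epi32(static_cast<int>(value));

	// Bulk: four 16-byte streaming stores (64 bytes, 16 pixels) per iteration.
	while(count >= 16) {
		__m128i *p = reinterpret_cast<__m128i *>(dst);
		_mm_stream_si128(p + 0, v);
		_mm_stream_si128(p + 1, v);
		_mm_stream_si128(p + 2, v);
		_mm_stream_si128(p + 3, v);
		dst += 16;
		count -= 16;
	}

	// Make the streaming stores globally visible before any later stores.
	_mm_sfence();

	// Scalar tail: remaining pixels that do not fill a 64-byte block.
	while(count > 0) {
		*dst++ = value;
		count--;
	}
}
```

Calling it on a std::vector<std::uint32_t> of 3840 * 2160 elements with value 0xFFFFFFFF corresponds to the UHD clear-to-white case above. Whether it beats ordinary cached stores will depend on the CPU and on whether the buffer is read again soon afterwards, so it is worth timing both variants on the target machine.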
Best regards,
Tom