About recent memset optimization [message #60314] Fri, 17 November 2023 21:24
Tom1
Hi,

I noticed that BufferPainter::ClearOp() has slowed down greatly due to recent memset changes.

Time to clear a UHD/4K screen-sized ImageBuffer to white:

Revision 17045 : 740 us
Current rev. ST : 1990 us
Current rev. MT : 1880 us

It turned out that these changes in Mem.cpp (marked with *** HERE ***) cause the issue:
#if 0 // streaming does not seem to be benefical anymore *** HERE ***
#ifdef CPU_SSE2
	if(len >= 1024*1024 && 0) { // for really huge data, bypass the cache  *** HERE *** && 0
		auto Set4S = [&](int at) { data.Stream(t + at); };
		while(len >= 64) {
			Set4S(0*16); Set4S(1*16); Set4S(2*16); Set4S(3*16);
			t += 64;
			len -= 64;
		}
		_mm_sfence();
		e = t - 1;
	}
#endif
#endif // *** HERE ***

So, both the "&& 0" and the "#if 0" disable the streaming path, and that causes the loss of speed.

I have an "Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz 3.60 GHz" here. I do not know whether the effect is similar on other platforms, but on this machine streaming with ST is clearly the way to go for fast ImageBuffer clears.
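
For anyone who wants to try this outside U++, here is a minimal standalone sketch of the streaming idea, not the actual Mem.cpp code. FillPixels and SKETCH_HAS_SSE2 are hypothetical names, the 1 MB threshold is simply copied from the snippet above, and the destination is assumed to be a buffer of 32-bit pixels. Large fills use non-temporal SSE2 stores that bypass the cache; everything else (and any non-SSE2 build) falls back to a plain loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h> // SSE2 intrinsics; also pulls in _mm_sfence
#define SKETCH_HAS_SSE2 1
#endif

// Hypothetical helper, not part of U++: fill `count` 32-bit pixels with `value`.
void FillPixels(std::uint32_t *t, std::uint32_t value, std::size_t count)
{
    // Assumption: the destination is at least 4-byte aligned.
    assert(reinterpret_cast<std::uintptr_t>(t) % sizeof(std::uint32_t) == 0);
#ifdef SKETCH_HAS_SSE2
    if(count * sizeof(std::uint32_t) >= 1024 * 1024) { // huge fill: bypass the cache
        // Streaming stores need a 16-byte aligned address, so write the
        // first few pixels with ordinary stores until we reach one.
        while(count && (reinterpret_cast<std::uintptr_t>(t) & 15)) {
            *t++ = value;
            count--;
        }
        const __m128i v = _mm_set1_epi32(static_cast<int>(value));
        while(count >= 16) { // 16 pixels = 64 bytes = four streaming stores
            __m128i *p = reinterpret_cast<__m128i *>(t);
            _mm_stream_si128(p + 0, v);
            _mm_stream_si128(p + 1, v);
            _mm_stream_si128(p + 2, v);
            _mm_stream_si128(p + 3, v);
            t += 16;
            count -= 16;
        }
        _mm_sfence(); // make the streaming stores visible before returning
    }
#endif
    // Small fills, the tail of a large fill, and non-SSE2 builds end up here.
    for(; count; --count)
        *t++ = value;
}
```

Calling FillPixels(pixels, 0xFFFFFFFF, width * height) on a UHD-sized buffer takes the streaming branch. Whether that is actually faster than ordinary stores depends on the CPU and memory system, so it should be measured on the target machine.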

Best regards,

Tom
