Re: BufferPainter::Clear() optimization [message #53977 is a reply to message #53751] Tue, 19 May 2020 00:02
mirek
What about this:

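never_inline
void HugeFill(dword *t, dword c, int len)
{
	// Streaming (non-temporal) stores, intended for very large fills that should not pollute the cache
	__m128i val4 = _mm_set1_epi32(*(int*)&c);
	auto Set4S = [&](int at) { _mm_stream_si128((__m128i *)(t + at), val4); };
	while(len > 0 && ((uintptr_t)t & 15)) { // scalar stores until t is 16-byte aligned, as _mm_stream_si128 requires
		*t++ = c;
		len--;
	}
	while(len >= 16) { // 16 dwords (64 bytes) per iteration
		Set4S(0);
		Set4S(4);
		Set4S(8);
		Set4S(12);
		t += 16;
		len -= 16;
	}
	while(len-- > 0) // remaining 0..15 dwords
		*t++ = c;
	_mm_sfence(); // order the streaming stores before any stores that follow
}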

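void Fill6(dword *t, dword c, int len)
{
	if(len >= 4) {
		__m128i val4 = _mm_set1_epi32(*(int*)&c);
		auto Set4 = [&](int at) { _mm_storeu_si128((__m128i *)(t + at), val4); };
		if(len > 4*1024*1024 / 4) { // more than 1M dwords (4 MB): switch to streaming stores
			HugeFill(t, c, len);
			return;
		}
		while(len >= 16) { // 16 dwords per iteration, unaligned stores
			Set4(0);
			Set4(4);
			Set4(8);
			Set4(12);
			t += 16;
			len -= 16;
		}
		// The loop only subtracts multiples of 16, so the low 4 bits of len still describe the remainder
		if(len & 8) {
			Set4(0);
			Set4(4);
			t += 8;
		}
		if(len & 4) {
			Set4(0);
			t += 4;
		}
	}
	// Last 1..3 dwords without branching on the count: the indices are
	// (0, 0, 0), (0, 1, 0) or (0, 1, 2) for len & 3 == 1, 2, 3
	if(len & 3)
		t[0] = t[(len & 2) >> 1] = t[(len & 2) & ((len & 1) << 1)] = c;
}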
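
The last line of Fill6 is the least obvious part, so here is a minimal sketch that isolates it and checks that it writes exactly len & 3 elements and nothing past them. FillTail and TailIsExact are hypothetical names used only for this illustration, and uint32_t stands in for U++ dword; none of this is part of the posted code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the post): the tail store of Fill6 on its own.
// For len & 3 == 1 the three indices are 0, 0, 0; for 2 they are 0, 1, 0;
// for 3 they are 0, 1, 2. Redundant stores land on elements already written.
inline void FillTail(uint32_t *t, uint32_t c, int len)
{
	if(len & 3)
		t[0] = t[(len & 2) >> 1] = t[(len & 2) & ((len & 1) << 1)] = c;
}

// Hypothetical check: fills a 4-element buffer with a sentinel, runs FillTail,
// and returns true only if the first (len & 3) elements hold the color and
// the rest still hold the sentinel.
inline bool TailIsExact(int len)
{
	const uint32_t sentinel = 0xDEADBEEFu;
	const uint32_t color = 0x11223344u;
	uint32_t buf[4] = { sentinel, sentinel, sentinel, sentinel };
	FillTail(buf, color, len);
	int n = len & 3;
	for(int i = 0; i < 4; i++)
		if(buf[i] != (i < n ? color : sentinel))
			return false;
	return true;
}
```

Calling TailIsExact for len = 0, 1, 2 and 3 covers every case the tail can see, since only the two low bits of len are used.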

[Updated on: Tue, 19 May 2020 09:01]
