Optimized memcmp for x86 [message #14308] Fri, 22 February 2008 11:07
mirek
Well, this code seems to run about 20% faster than GCC's intrinsic memcmp on x86-64:

#ifdef COMPILER_GCC
inline dword _byteswap_ulong(dword x)
{
	asm("bswap %0" : "=r" (x) : "0" (x));
	return x;
}

inline uint64 _byteswap_uint64(uint64 x)
{
	asm("bswap %0" : "=r" (x) : "0" (x));
	return x;
}

inline word _byteswap_ushort(word x)
{
	// "Q" limits the operand to a/b/c/d, the only registers with a high-byte (%h) form
	__asm__("xchgb %b0,%h0" : "=Q" (x) :  "0" (x));
	return x;
}
#endif

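int MemCmp(const char *a, const char *b, size_t len)
{
	// The fast path needs both pointers dword-aligned; otherwise fall back.
	if(((size_t)a & 3) | ((size_t)b & 3))
		return memcmp(a, b, len);
	const dword *x = (const dword *)a;
	const dword *y = (const dword *)b;
	const dword *e = x + (len >> 2);
	while(x < e) {
		if(*x != *y) {
			// Byte-swap so that integer order matches byte order (little-endian),
			// then compare instead of subtracting: an unsigned difference cast
			// to int has the wrong sign when the values differ by 2^31 or more.
			dword p = _byteswap_ulong(*x), q = _byteswap_ulong(*y);
			return p < q ? -1 : 1;
		}
		x++;
		y++;
	}
	// Tail of 0..3 bytes
	const byte *s = (const byte *)x;
	const byte *t = (const byte *)y;
	if(len & 2) {
		if(*(const word *)s != *(const word *)t)
			return int(_byteswap_ushort(*(const word *)s)) - int(_byteswap_ushort(*(const word *)t));
		s += 2; // skip the word only when the 2-byte tail was actually present
		t += 2;
	}
	if(len & 1)
		return int(*s) - int(*t);
	return 0;
}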



(Obviously, the speedup only applies when both areas are dword-aligned, but that happens a lot...).

Mirek
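
For anyone who wants to try the idea without inline asm or pointer casts, here is a minimal sketch in standard C++. It is not the code from the post above: MemCmpSketch, LoadBE32, Sign and CheckAgainstMemcmp are made-up names, the word loads go through memcpy so no alignment check is needed, and the byte order of a differing word is rebuilt with shifts instead of bswap. Whether it keeps the speed advantage is untested; it only illustrates the word-at-a-time comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the post): read 4 bytes as a big-endian
// 32-bit value, so that integer order matches memcmp's byte order.
static inline std::uint32_t LoadBE32(const unsigned char *p)
{
	return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
	       (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}

// Sketch of a word-at-a-time memcmp. Returns <0, 0 or >0 like memcmp.
int MemCmpSketch(const void *a, const void *b, std::size_t len)
{
	const unsigned char *x = static_cast<const unsigned char *>(a);
	const unsigned char *y = static_cast<const unsigned char *>(b);
	// Main loop: test four bytes at once for equality.
	for(; len >= 4; x += 4, y += 4, len -= 4) {
		std::uint32_t u, v;
		std::memcpy(&u, x, 4); // alignment-safe load
		std::memcpy(&v, y, 4);
		if(u != v) {
			// Words differ: order them by their bytes, first byte most significant.
			std::uint32_t p = LoadBE32(x), q = LoadBE32(y);
			return p < q ? -1 : 1;
		}
	}
	// Tail: 0 to 3 remaining bytes, compared as unsigned char.
	for(; len > 0; x++, y++, len--)
		if(*x != *y)
			return *x < *y ? -1 : 1;
	return 0;
}

// Reduces a comparison result to -1, 0 or 1.
static int Sign(int v) { return (v > 0) - (v < 0); }

// Asserts that the sketch agrees in sign with memcmp for one input pair.
static void CheckAgainstMemcmp(const char *a, const char *b, std::size_t len)
{
	assert(Sign(MemCmpSketch(a, b, len)) == Sign(std::memcmp(a, b, len)));
}
```

As with memcmp, only the sign of the result is meaningful. A call such as CheckAgainstMemcmp("abcdefg", "abcdefh", 7) passes because both functions report the first buffer as smaller.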
 