Re: SSE2(/AVX) and alignment issues [message #30962 is a reply to message #30961] Sun, 30 January 2011 11:51
mirek
tojocky wrote on Sun, 30 January 2011 05:34

mirek wrote on Sat, 29 January 2011 21:29

tojocky wrote on Sat, 29 January 2011 03:23

mirek wrote on Sat, 29 January 2011 01:03



This is not a question. The question is whether _regular_ 'new' should return 16-byte aligned values or not. (And later, with AVX, 32, then maybe in 4 more years 64 etc...)

As long as we agree that allocating SSE2 stuff with 'new' is not a regular thing, we are at option 2..

Mirek




Mirek,

Can you give us an example of the "new" implementation and the "allocator" implementation?



struct AvxSomething {
  __m256 x;
};

Array<AvxSomething> foo;

foo.Add(new AvxSomething); // not supported in option2. Actually, not even supported by any compiler today

foo.Add<AvxSomething>(); // supported in both options


Option2 could also support e.g.:

foo = New<AvxSomething>();
foo = new (Aligned<AvxSomething>) AvxSomething;
foo = NEW(AvxSomething);

Delete(foo);


(The crucial problem is that we need to know the type in new/delete).

Quote:


Why don't you agree with the 'new' implementation (option 1)?



Because it wastes memory. It means that the size of every single block allocated by 'new' must be a multiple of 16 bytes for SSE2.

Then 32 bytes for AVX, and AVX is supposed to grow in width, so it can easily be 64 bytes etc. So even if you request 24 bytes from new, the block would take up 32 bytes (if we are 32-byte aligned), and 8 of them are wasted.

Moreover, the U++ allocator today greatly benefits from the fact that the alignment requirement is only 8 bytes. It would be possible to overcome this, but only at the price of quite a lot of wasted memory (or speed).
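
To put numbers on the waste argument above, here is a minimal sketch. It assumes an allocator that simply rounds every block size up to a multiple of the alignment; RoundUp and Wasted are hypothetical helper names for illustration, not part of U++.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: the block size an allocator would have to reserve
// if every block size must be a multiple of 'alignment'.
constexpr std::size_t RoundUp(std::size_t size, std::size_t alignment)
{
    return (size + alignment - 1) / alignment * alignment;
}

// Hypothetical helper: bytes lost to padding for one request.
constexpr std::size_t Wasted(std::size_t size, std::size_t alignment)
{
    return RoundUp(size, alignment) - size;
}

// A 24-byte request under the alignments discussed in this thread.
static_assert(Wasted(24, 8) == 0, "8-byte alignment: no padding");
static_assert(Wasted(24, 16) == 8, "SSE2: block grows to 32 bytes");
static_assert(Wasted(24, 32) == 8, "AVX: block grows to 32 bytes");
static_assert(Wasted(24, 64) == 40, "64-byte alignment: block grows to 64 bytes");
```

With 8-byte alignment a 24-byte request costs nothing extra, while at 64 bytes the same request loses 40 bytes to padding.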

Still undecided. But if I consider that the issue does not stop at 32 bytes...

Mirek



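What about implementing something like this?
#include <malloc.h> // _aligned_malloc (MSVC-specific)

void* operator new( size_t size, size_t alignment ){
     return _aligned_malloc( size, alignment ); // must be released with _aligned_free, not regular 'delete'
}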

and in code:
AlignedData* pData = new( 16 ) AlignedData;

or
AlignedData* pData = new( 32 ) AlignedData;
or
AlignedData* pData = new( 64 ) AlignedData;

or
AlignedData* pData = new( 128 ) AlignedData;

?


The user needs to know when they are using SSE2/3/4 data.



This is still "option2".

I guess that the key difference is that

a) you cannot use regular 'new' for objects with special alignment
b) you cannot use regular 'delete' for objects with special alignment....

If you wanted a) and b) to work, you would definitely need ALL allocations to be aligned to the highest possible alignment value.

That said, I guess that my suggestion of

New<FooClass>()

is somewhat superior, as it detects the alignment automatically...
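
A minimal sketch of how such a New<T>() could pick up the alignment from the type, so the caller never passes 16/32/64 by hand. This is an assumption-laden illustration, not the U++ implementation: it relies on alignof and on the C++17 aligned forms of operator new/delete, which did not exist when this thread was written, and the Wide struct is a hypothetical stand-in for an AVX type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <utility>

// Hypothetical New<T>: the alignment comes from alignof(T), not from the caller.
template <class T, class... Args>
T *New(Args&&... args)
{
    void *mem = ::operator new(sizeof(T), std::align_val_t(alignof(T)));
    try {
        return ::new (mem) T(std::forward<Args>(args)...);
    }
    catch(...) {
        // Constructor threw: give the raw memory back, then rethrow.
        ::operator delete(mem, std::align_val_t(alignof(T)));
        throw;
    }
}

// Matching Delete<T>: needs the static type to know the alignment again.
template <class T>
void Delete(T *ptr)
{
    if(!ptr)
        return;
    ptr->~T();
    ::operator delete(ptr, std::align_val_t(alignof(T)));
}

// Hypothetical stand-in for a type holding an AVX register.
struct alignas(32) Wide {
    float v[8];
};
```

Usage would look like `Wide *w = New<Wide>(); ... Delete(w);`. Both calls see the type, which is exactly the information that regular 'new'/'delete' lacked in this discussion.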

Quote:


How do you want to implement the second method?


Well, that part is actually pretty simple :) If alignment > 8, just allocate more memory (add alignment + sizeof(void *)), align the result, and put a void * just before the aligned block to point back to the originally allocated block.
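
The scheme described above could be sketched roughly as follows. AlignedAlloc and AlignedFree are hypothetical names, and std::malloc stands in for whatever the underlying allocator would be; this is an illustration under those assumptions, not actual U++ code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: returns 'size' bytes aligned to 'alignment'
// (which must be a power of two), or nullptr on failure.
void *AlignedAlloc(std::size_t size, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    // Payload + worst-case padding + room for the back pointer.
    void *raw = std::malloc(size + alignment + sizeof(void *));
    if(!raw)
        return nullptr;
    // Leave space for the back pointer, then round up to the alignment.
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void *);
    std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    std::uintptr_t aligned = (start + mask) & ~mask;
    // Store the original pointer in the slot just before the aligned block.
    reinterpret_cast<void **>(aligned)[-1] = raw;
    return reinterpret_cast<void *>(aligned);
}

// Hypothetical counterpart: reads the back pointer and frees the real block.
void AlignedFree(void *ptr)
{
    if(ptr)
        std::free(static_cast<void **>(ptr)[-1]);
}
```

The price is alignment + sizeof(void *) extra bytes per block, but it is paid only by blocks that actually ask for special alignment; regular allocations stay at 8 bytes.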

Mirek
 