
Re: Painter refactored/optimized [message #50559 is a reply to message #50558]

Thu, 15 November 2018 11:43

Tom1
Messages: 1212
Registered: March 2007

Senior Contributor

Hi,

The difference is so large that it makes me wonder if ST allocates/resets any rasterizers at all on the fly?

Could the number of rendering threads be pre-selected and a sufficient number of rasterizers be pre-allocated for MT, so that there would be no extra allocation/reset penalty when re-using the same BufferPainter, as was just introduced by BufferPainter::Create?
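The pre-allocation idea in this question can be illustrated with a minimal sketch in standard C++. It is not how BufferPainter is implemented; `Scratch`, `ScratchPool` and `RenderFrames` are hypothetical names standing in for per-thread rasterizers and a render loop, and the worker count is assumed to be fixed up front.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-thread rasterizer: it owns a scratch
// buffer that is expensive to allocate and initialize.
struct Scratch {
    std::vector<int> cells;
    explicit Scratch(std::size_t n) : cells(n, 0) {}
};

// Allocates one Scratch per worker once, then only hands out references.
class ScratchPool {
public:
    ScratchPool(std::size_t workers, std::size_t cells_per_worker) {
        assert(cells_per_worker > 0);
        pool_.reserve(workers);
        for (std::size_t i = 0; i < workers; ++i)
            pool_.emplace_back(cells_per_worker);
        allocations_ = workers;
    }

    // Return the scratch object that belongs to the given worker index.
    Scratch& ForWorker(std::size_t worker) {
        assert(worker < pool_.size());
        return pool_[worker];
    }

    std::size_t Workers() const { return pool_.size(); }
    std::size_t Allocations() const { return allocations_; }

private:
    std::vector<Scratch> pool_;
    std::size_t allocations_ = 0;
};

// Simulate `frames` renders; in every frame each worker reuses its own
// scratch memory instead of allocating a new one. Returns the number of
// (frame, worker) passes performed.
inline std::size_t RenderFrames(ScratchPool& pool, int frames) {
    std::size_t passes = 0;
    for (int f = 0; f < frames; ++f) {
        for (std::size_t w = 0; w < pool.Workers(); ++w) {
            pool.ForWorker(w).cells[0] = f;  // pretend to rasterize
            ++passes;
        }
    }
    return passes;
}
```

With a pool built as `ScratchPool pool(4, 1024)`, any number of `RenderFrames` calls leaves `pool.Allocations()` at 4, which is the property the question is asking for.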

Best regards,

Tom


Re: Painter refactored/optimized [message #50560 is a reply to message #50559]

Thu, 15 November 2018 11:55

mirek
Messages: 13975
Registered: November 2005

Ultimate Member

Tom1 wrote on Thu, 15 November 2018 11:43

Hi,

The difference is so large that it makes me wonder if ST allocates/resets any rasterizers at all on the fly?

The difference is that ST has just one rasterizer :)

Quote:

Could the number of rendering threads be pre-selected and a sufficient number of rasterizers be pre-allocated for MT, so that there would be no extra allocation/reset penalty when re-using the same BufferPainter, as was just introduced by BufferPainter::Create?

Perhaps, but let me try those optimizations I have in mind first...

(Note that while the ST/MT ratio is horrible, it is still under a millisecond for both MT and ST... I guess that if you included that Clear in the timing, the difference would be much smaller.)

Mirek


Re: Painter refactored/optimized [message #50561 is a reply to message #50560]

Thu, 15 November 2018 12:14

Tom1
Messages: 1212
Registered: March 2007

Senior Contributor

Hi,

You say <ms??? ... you mean below one millisecond for MT??? I get something like 16 ms for MT and 300 us for ST...

What exactly are your readings?

I bet your hardware is Superb! Mine is Core i7 4790K @ 4 GHz (4C/8T). Windows 10 Professional 64 bit. Compiled with MSBT17x64.

Do you have anything this old to test with?

Best regards,

Tom

[Updated on: Thu, 15 November 2018 12:18]


Re: Painter refactored/optimized [message #50562 is a reply to message #50561]

Thu, 15 November 2018 12:33

mirek
Messages: 13975
Registered: November 2005

Ultimate Member

Tom1 wrote on Thu, 15 November 2018 12:14

Hi,

You say <ms??? ... you mean below one millisecond for MT??? I get something like 16 ms for MT and 300 us for ST... What exactly are your readings?

I bet your hardware is Superb! Mine is Core i7 4790K @ 4 GHz (4C/8T). Windows 10 Professional 64 bit. Compiled with MSBT17x64.

Do you have anything this old to test with?

Best regards,

Tom

Nope, that is just a difference in testing, sorry. I have adapted it to my development package (which is benchmarks/LionBenchmark). There I test by repeatedly doing the paint, with the same BufferPainter, until 1 second has elapsed, then compute the time per render from the number of renders achieved.

It is sort of similar to having a single global BufferPainter.

My numbers with your example are about the same for ST and half for MT - at least, those 8 cores show up :)

Now if I insert some benchmarking code, it is obvious that those 8 ms in MT are spent allocating / initializing memory...
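The repeat-until-one-second measurement described in this post can be sketched roughly as follows. This is an illustration in standard C++, not the code of benchmarks/LionBenchmark; `BenchResult` and `RunForBudget` are hypothetical names, and `work` stands for one paint call on an already created painter.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>

struct BenchResult {
    std::int64_t iterations;  // number of completed runs
    double usecs_per_run;     // average wall time per run, in microseconds
};

// Run `work` repeatedly until at least `budget` of wall time has elapsed,
// then derive the per-run time from the number of runs completed.
inline BenchResult RunForBudget(const std::function<void()>& work,
                                std::chrono::milliseconds budget) {
    assert(work);  // an empty std::function cannot be called
    using Clock = std::chrono::steady_clock;
    const auto start = Clock::now();
    std::int64_t n = 0;
    Clock::duration elapsed{};
    do {
        work();
        ++n;
        elapsed = Clock::now() - start;
    } while (elapsed < budget);
    const double us =
        std::chrono::duration<double, std::micro>(elapsed).count();
    return {n, us / static_cast<double>(n)};
}
```

Because the setup cost is paid once outside the loop, a measurement like this amortizes allocation and initialization over many renders, which would explain why it reports much smaller per-render times than timing a single Paint call on a freshly created painter.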

Mirek


Re: Painter refactored/optimized [message #50563 is a reply to message #50562]

Thu, 15 November 2018 12:40

mirek
Messages: 13975
Registered: November 2005

Ultimate Member

OK, I have just found that I had accidentally deleted that precious initialized memory in Create. The fixed version is now in the trunk. Changing your example to use a global BufferPainter now shows some pretty significant gains:

#include <CtrlLib/CtrlLib.h>
#include <Painter/Painter.h>

using namespace Upp;

class PainterBench : public TopWindow {
public:
	Painting p;
	FileSel fs;
	BufferPainter bpainter;
	
	void Open(){
		if(fs.ExecuteOpen("Select a painting to view")){
			p.Clear();
			p.Serialize(FileIn(fs.Get()));
		}
	}


	virtual bool Key(dword key, int count){
		Refresh();
		switch(key){
			case K_CTRL_O:
				Open();
				return true;
		}
		return false;
	}
	
	typedef PainterBench CLASSNAME;

	PainterBench(){
		Sizeable();

		p.Serialize(FileIn("C:/xxx/PainteTest/SomeRocks.painting"));
	}
		
	virtual void Paint(Draw &draw){
		int64 STtiming=0;
		int64 MTtiming=0;
		
		ImageBuffer ib(GetSize());
		{
			// Multithreaded pass: reuse the member BufferPainter on the new buffer
			bpainter.Create(ib);
			bpainter.Co(true);
			bpainter.PreClipDashed();
			bpainter.Clear(White());
			bpainter.EvenOdd();
			
			// Only the Paint call is timed; Create, Clear and Finish are excluded
			int64 t0=usecs();
			bpainter.Paint(p);
			int64 t1=usecs();
			MTtiming=t1-t0;

			bpainter.Finish();
		}
		{
			// Single-threaded pass with the same BufferPainter instance
			bpainter.Create(ib);
			bpainter.Co(false);
			bpainter.PreClipDashed();
			bpainter.Clear(White());
			bpainter.EvenOdd();
			
			int64 t0=usecs();
			bpainter.Paint(p);
			int64 t1=usecs();
			STtiming=t1-t0;

			bpainter.Finish();
		}
		
		SetSurface(draw,Rect(ib.GetSize()),ib,ib.GetSize(),Point(0,0));
		
		// Gain above 1 means MT was faster; the 0.1 offset avoids division by zero
		double gain=(double)STtiming/(double)(0.1+MTtiming);
		Title(Format("Rendering MT took %lld us, ST took %lld us, MT gain is %.2f",MTtiming,STtiming,gain));
	}
};

GUI_APP_MAIN
{
	PainterBench().Run();
}

[Updated on: Thu, 15 November 2018 12:41]


Re: Painter refactored/optimized [message #50564 is a reply to message #50563]

Thu, 15 November 2018 13:07

Tom1
Messages: 1212
Registered: March 2007

Senior Contributor

Hi!

Yes, it indeed does. But even better: for the first time in MT Painter history, my real application rendering vector maps now shows consistent and significant MT/ST rendering speed gains of about 2.5x on average with real data! :)

Thank you Mirek very much! You really Rock!

Best regards,

Tom


Re: Painter refactored/optimized [message #50565 is a reply to message #50564]

Thu, 15 November 2018 13:23

Tom1
Messages: 1212
Registered: March 2007

Senior Contributor

Hi,

One minor issue: When in Paint with a global BufferPainter and only calling bufferpainter.Create(ib); the rendered area does not change to the current ib size. (E.g. after maximizing the window, the bufferpainter will only render on the initial ib area, leaving the rest white.) I need to additionally call bufferpainter.Co(true or false); to get the bufferpainter to work on the current ib size.

This is not a problem for me, but maybe it would be more appropriate to handle the resizing in Create somehow.
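The behaviour suggested here, where Create itself picks up the new buffer size, could look roughly like the sketch below. It is a simplified illustration in standard C++ and not the actual BufferPainter fix; `Size2` and `ReusableRenderer` are hypothetical names. The point is only that cached dimensions are refreshed on every Create, while scratch memory is kept and grown only when needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Size2 {
    int cx;
    int cy;
};

// Hypothetical reusable renderer that is re-targeted on every frame.
class ReusableRenderer {
public:
    // Re-target the renderer. The cached size is always refreshed;
    // scratch memory is reallocated only when the target has more rows
    // than any target seen before.
    void Create(Size2 target) {
        assert(target.cx >= 0 && target.cy >= 0);
        size_ = target;
        const std::size_t rows = static_cast<std::size_t>(target.cy);
        if (rows > row_scratch_.size()) {
            row_scratch_.resize(rows);
            ++reallocations_;
        }
    }

    // Number of scanlines the next render would cover.
    int RenderedRows() const { return size_.cy; }

    // How many times the scratch memory had to grow.
    int Reallocations() const { return reallocations_; }

private:
    Size2 size_{0, 0};
    std::vector<int> row_scratch_;
    int reallocations_ = 0;
};
```

In this sketch, going from a 640x480 target to a 1920x1080 one updates `RenderedRows()` immediately, and shrinking back to 800x600 afterwards reuses the existing scratch memory without another reallocation.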

Best regards,

Tom

[Updated on: Thu, 15 November 2018 13:26]


Re: Painter refactored/optimized [message #50566 is a reply to message #50565]

Thu, 15 November 2018 13:33

mirek
Messages: 13975
Registered: November 2005

Ultimate Member

Tom1 wrote on Thu, 15 November 2018 13:23

Hi,

One minor issue: When in Paint with a global BufferPainter and only calling bufferpainter.Create(ib); the rendered area does not change to the current ib size. (E.g. after maximizing the window, the bufferpainter will only render on the initial ib area, leaving the rest white.) I need to additionally call bufferpainter.Co(true or false); to get the bufferpainter to work on the current ib size.

This is not a problem for me, but maybe it would be more appropriate to handle the resizing in Create somehow.

Best regards,

Tom

Oops, that's a bug. Fixed in trunk, hopefully...


Re: Painter refactored/optimized [message #50567 is a reply to message #50566]

Thu, 15 November 2018 13:44

Tom1
Messages: 1212
Registered: March 2007

Senior Contributor

And Yes! It works!

Thanks and best regards,

Tom


Re: Painter refactored/optimized [message #50569 is a reply to message #50567]

Fri, 16 November 2018 10:23

Tom1
Messages: 1212
Registered: March 2007

Senior Contributor

Hi Mirek,

While on the subject, I decided to do some testing of the thread count for MT Painter. What I found was interesting: My typical map renders in roughly 250 ms with ST and 100 ms with the default 10-thread MT. (On my hardware CPU_Cores() returns 8 and CoWork initializes a thread pool of 10 threads.)

So I tampered a little bit with CoWork.cpp, trying with different thread counts:

int CoWork::GetPoolSize()
{
	int n = GetPool().threads.GetCount();
//	return n ? n : CPU_Cores() + 2;
	return n ? n : 4;
}

CoWork::Pool::Pool()
{
	ASSERT(!IsWorker());

//	InitThreads(CPU_Cores() + 2);
	InitThreads(4);

	free = NULL;
	for(int i = 0; i < SCHEDULED_MAX; i++)
		Free(slot[i]);
	
	quit = false;
}

In this test I ended up with four threads, which yields about the same performance as 10 threads. When dropping to three threads or below, the MT gain started to fade away.

I think the optimal thread count for CoWork depends on the job's balance of CPU load and memory bandwidth. Also, the CPU and memory bus design changes this balance. As the new CPUs tend to offer a lot of cores (and concurrent threads), a simple or well optimized algorithm will easily saturate the memory channels with a reasonably small subset of cores being used. I'm not sure though, if there is much point in reducing threads (and therefore freeing cores for other tasks), if the memory bus will remain saturated anyway.
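A thread-count experiment of this kind can also be reproduced outside of CoWork. The sketch below uses plain `std::thread` on a deliberately memory-bound job (summing a large array); `ParallelSum` and `TimeSumUsecs` are hypothetical helper names, and the workload is an assumption chosen for illustration, not the map rendering measured above.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <thread>
#include <vector>

// Sum `data` using `threads` worker threads, each handling one
// contiguous slice of the input.
inline std::uint64_t ParallelSum(const std::vector<std::uint32_t>& data,
                                 unsigned threads) {
    assert(threads > 0);
    std::vector<std::uint64_t> partial(threads, 0);
    std::vector<std::thread> pool;
    pool.reserve(threads);
    const std::size_t chunk = (data.size() + threads - 1) / threads;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&, t] {
            const std::size_t begin = std::min(data.size(), t * chunk);
            const std::size_t end = std::min(data.size(), begin + chunk);
            std::uint64_t s = 0;
            for (std::size_t i = begin; i < end; ++i)
                s += data[i];
            partial[t] = s;  // each thread writes only its own slot
        });
    }
    for (auto& th : pool)
        th.join();
    return std::accumulate(partial.begin(), partial.end(), std::uint64_t{0});
}

// Wall time of one ParallelSum call, in microseconds.
inline double TimeSumUsecs(const std::vector<std::uint32_t>& data,
                           unsigned threads) {
    const auto t0 = std::chrono::steady_clock::now();
    volatile std::uint64_t sink = ParallelSum(data, threads);
    (void)sink;  // keep the call from being optimized away
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::micro>(t1 - t0).count();
}
```

Calling `TimeSumUsecs` for thread counts from 1 up to the pool default and comparing the results would show where adding threads stops helping on a given machine. Where that point lies depends on the hardware and the workload, so no particular numbers are implied here.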

Best regards,

Tom


Re: Painter refactored/optimized [message #50570 is a reply to message #50569]

Fri, 16 November 2018 11:20

mirek
Messages: 13975
Registered: November 2005

Ultimate Member

IMO, that is to be expected, as it is really a 4C CPU...

The default of 10 threads for CoWork takes into account that perhaps some threads will do blocking operations (e.g. files). And in some workloads, hyperthreading has benefits.

That said, it is true that it would be nice to detect that threads are "wasted", but I am not sure how to do that...

Mirek


Re: Painter refactored/optimized [message #50571 is a reply to message #50570]

Fri, 16 November 2018 12:57

Tom1
Messages: 1212
Registered: March 2007

Senior Contributor

Hi,

IMO your default "CPU logical cores + 2" is a well-considered compromise to keep the CPU working full time without wasting many resources. No worries.

Best regards,

Tom
