I am happy to hear that you are interested in OpenCL.
By the way, what about NTL with OpenCL? Can they work together?
I do not know yet.
What I do know (because I have been looking into it recently) is that NTL would not work with SSE2. The problem is that SSE2 has memory alignment requirements (16 bytes for aligned loads and stores) that the basic U++ memory allocator is unable to satisfy.
I am, however, thinking about fixing this issue. It may be impossible for Array containers, but it should be possible for Vector.
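
To illustrate the alignment requirement, here is a minimal sketch of one common way to get 16-byte aligned blocks on top of an allocator that does not guarantee them. This is not how U++ or NTL do it; AlignedAlloc16, AlignedFree16 and IsAligned16 are hypothetical names made up for this example, and plain malloc stands in for the underlying allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper (not part of U++ or NTL): returns a block of at least
// `size` bytes whose address is a multiple of 16, or nullptr on failure.
void *AlignedAlloc16(std::size_t size)
{
    const std::size_t align = 16;
    // Over-allocate: room for alignment slack plus the saved original pointer.
    void *raw = std::malloc(size + align + sizeof(void *));
    if(!raw)
        return nullptr;
    // Leave space for one pointer, then round up to the next multiple of 16.
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void *);
    std::uintptr_t aligned = (start + align - 1) & ~static_cast<std::uintptr_t>(align - 1);
    assert((aligned & (align - 1)) == 0);
    // Remember what malloc returned, just below the aligned block.
    reinterpret_cast<void **>(aligned)[-1] = raw;
    return reinterpret_cast<void *>(aligned);
}

// Releases a block obtained from AlignedAlloc16.
void AlignedFree16(void *p)
{
    if(p)
        std::free(reinterpret_cast<void **>(p)[-1]);
}

// True if the address is suitable for aligned SSE2 loads and stores.
bool IsAligned16(const void *p)
{
    return (reinterpret_cast<std::uintptr_t>(p) & 15) == 0;
}
```

The cost is up to 16 bytes of slack plus one stored pointer per block, which is why doing this inside the allocator itself would be preferable to wrapping every allocation.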