Get the O'Reilly book Pthreads Programming (Nichols, Buttlar, and Farrell). It's the best introduction to the ideas and primitives (threads, mutexes, condition variables) that you'll need when thinking about parallel programs.
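To give a flavor of the kind of primitives that book covers, here is a minimal sketch (mine, not taken from the book) of several threads incrementing a shared counter under a mutex. The names `shared_t`, `worker`, and `count_in_parallel` are made up for illustration; only the `pthread_*` calls are the real POSIX API.

```c
#include <pthread.h>
#include <stdlib.h>

/* Shared state: a counter plus the mutex that guards it. */
typedef struct {
    pthread_mutex_t lock;
    long count;
    long iters;   /* increments each thread performs */
} shared_t;

/* Thread body: bump the shared counter, taking the lock each time. */
static void *worker(void *arg) {
    shared_t *s = arg;
    for (long i = 0; i < s->iters; i++) {
        pthread_mutex_lock(&s->lock);
        s->count++;                      /* critical section */
        pthread_mutex_unlock(&s->lock);
    }
    return NULL;
}

/* Hypothetical helper: run nthreads workers and return the final count,
 * or -1 if setup or thread creation fails. */
long count_in_parallel(int nthreads, long iters) {
    shared_t s = { .count = 0, .iters = iters };
    if (nthreads <= 0) return -1;
    if (pthread_mutex_init(&s.lock, NULL) != 0) return -1;

    pthread_t *tids = malloc(sizeof *tids * (size_t)nthreads);
    if (!tids) {
        pthread_mutex_destroy(&s.lock);
        return -1;
    }

    int started = 0;
    for (; started < nthreads; started++) {
        if (pthread_create(&tids[started], NULL, worker, &s) != 0)
            break;
    }

    /* Join every thread that did start before touching the result. */
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    free(tids);
    pthread_mutex_destroy(&s.lock);
    return started == nthreads ? s.count : -1;
}
```

Calling `count_in_parallel(4, 100000)` from your own `main` should give 400000; drop the lock/unlock pair and the result will usually come up short, which is exactly the sort of race the book teaches you to spot. On most systems you'd compile with `-pthread`.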
Also, I would recommend OpenCL over CUDA if you're worried about vendor lock-in, since CUDA only runs on NVIDIA hardware. OpenCL has some lowest-common-denominator funk to it, but it's a lot more portable.