There is no passthrough in LXC, because it is containerization, not virtualization. The host and the container share the same kernel and the same kernel modules.
Yes, I installed the NVIDIA drivers on both the host system and the container. The CUDA programs ran on the host but not inside the container, even after I created the proper device nodes in /dev in the container.
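For anyone comparing the two environments, here is a minimal sketch for checking the device nodes. The node names in NODES are an assumption (the usual NVIDIA names); adjust them to whatever exists in /dev on your host. Run it on the host and in the container and compare the output. A node created by hand in the container should be a character device with the same major and minor numbers as on the host.

```python
import os
import stat

# Assumed node names; adjust to the nodes actually present on the host.
NODES = ["/dev/nvidiactl", "/dev/nvidia0", "/dev/nvidia-uvm"]


def describe_node(path):
    """Return basic facts about a device node, or the error if it is missing."""
    try:
        st = os.stat(path)
    except OSError as exc:
        return {"path": path, "exists": False, "error": exc.strerror}
    return {
        "path": path,
        "exists": True,
        "is_char": stat.S_ISCHR(st.st_mode),     # character device?
        "major": os.major(st.st_rdev),           # must match the host
        "minor": os.minor(st.st_rdev),           # must match the host
        "mode": oct(stat.S_IMODE(st.st_mode)),   # permission bits
    }


if __name__ == "__main__":
    for node in NODES:
        print(describe_node(node))
```

If the numbers match and the programs still fail, the problem is probably not the nodes themselves.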
My guess would be that the user-space driver uses an additional mechanism to communicate with the kernel module. I don't think it hauls large data buffers via device files. I would try strace to see where it fails.
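A rough sketch of how the strace output could be filtered, assuming a trace captured with something like `strace -f -o trace.log ./your_cuda_program` (the program name and log file name are placeholders). The script only lists system calls that returned -1, together with the errno name. The regular expression is my assumption about the usual strace line layout and may need tweaking for your strace version.

```python
import re
import sys

# Matches a failed syscall line in strace output, for example:
#   openat(AT_FDCWD, "/dev/nvidiactl", O_RDWR) = -1 ENOENT (No such file or directory)
# An optional "[pid N]" or bare PID prefix (as produced with -f) is skipped.
FAILED = re.compile(
    r'^(?:\[pid\s+\d+\]\s+|\d+\s+)?'
    r'(?P<call>\w+)\((?P<args>.*)\)\s+=\s+-1\s+(?P<errno>E\w+)'
)


def failed_calls(lines):
    """Return (syscall, errno, args) for every line that reports a failure."""
    found = []
    for line in lines:
        m = FAILED.match(line.strip())
        if m:
            found.append((m.group("call"), m.group("errno"), m.group("args")))
    return found


if __name__ == "__main__":
    # Usage: python failed_calls.py trace.log
    with open(sys.argv[1]) as fh:
        for call, errno_name, args in failed_calls(fh):
            print(f"{call}: {errno_name}  ({args})")
```

Failures on open, openat, ioctl, or mmap that involve the nvidia nodes would be the first place to look. Many failed calls in a trace are harmless (library search paths, for instance), so compare against a trace from the host, where the program works.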