Calls to Coq and Z3 will be very slow and not competitive with GPU compute.
RE Coq/Z3 vs GPUs, I think about them as different kinds of things, and thus wouldn't compare directly. A solver like z3 encapsulates search tricks for tasks where brute force approaches would effectively run forever. Their tricks are symptomatically necessary -- effective GPU acceleration requires finding a clean data parallel formulation, and if a more naive algorithm is picked for that GPU friendliness, but at the expense of inflating the workload asymptotically, the GPU implementation would have high utilization, and high speedups vs a CPU version... but would still be (much) worse than Z3. Instead of comparing them as inherently different approaches, the question is if the practically relevant core of Z3 can be GPU accelerated while preserving its overall optimization strategies.
Which takes us back to the broader point of this thread of GPUs for synthesis... LLMs are successfully GPU accelerating program synthesis of more than just regex. It's amazing that half of github code now uses them! Closer to this paper, folks are making inroads on GPU accelerating SMT solvers that the more powerful (classical) program synthesizers use, e.g., https://github.com/muhos/ParaFROST .
With prompt engineering being so easy, and naive search/refinement over them so easy too, not comparing via benchmarks, vs brush-off related work section, to these general solvers seems strange.