If the RNG is truly random, then it must have a non-zero probability of failing any test for randomness that uses a finite sample of the RNG's output. For instance, the RNG could theoretically return a sequence of all the same number. You can't "fix" this.
Sure, but there is a big difference between flaky and non-zero. A 1-in-a-million chance of a false failure would probably be acceptable; a 1-in-a-hundred chance, not so much.
You can often make the failure probability extremely low, like 1e-30. At that point the server running the test is more likely to break than the test is to fail by chance.
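One way to get a number like that is sketched below. It is only an illustration, not the only approach: it assumes the samples are independent and bounded in [0, 1], and it uses Hoeffding's inequality to pick a tolerance on the sample mean so that a correct RNG fails the check with probability at most `failure_prob`. The function names `hoeffding_tolerance` and `mean_is_plausible` are made up for this example.

```python
import math
import random


def hoeffding_tolerance(n_samples: int, failure_prob: float) -> float:
    """Tolerance t such that P(|mean - E[mean]| >= t) <= failure_prob.

    Comes from Hoeffding's inequality for independent samples in [0, 1]:
    P(|mean - E[mean]| >= t) <= 2 * exp(-2 * n * t**2).
    """
    return math.sqrt(math.log(2.0 / failure_prob) / (2.0 * n_samples))


def mean_is_plausible(samples, expected_mean, failure_prob=1e-30):
    """Return True if the sample mean is within the Hoeffding tolerance."""
    tol = hoeffding_tolerance(len(samples), failure_prob)
    observed = sum(samples) / len(samples)
    return abs(observed - expected_mean) <= tol


# Hypothetical usage: check that uniform floats in [0, 1) average near 0.5.
rng = random.Random(12345)
samples = [rng.random() for _ in range(100_000)]
print(mean_is_plausible(samples, 0.5))
```

The trade-off is that a lower false-failure probability means a wider tolerance, so the test catches only larger biases unless you increase the sample count.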