I'm too lazy to run the exact numbers right now, but with "4 GB, 96% chance, three days" as the hypothesis, I think you'll find that an experimental result of "8 GB, zero errors observed, 14 days" is highly statistically significant.
Edit: rough back-of-the-napkin estimate: you're seeing no events across roughly 10x the trials (2x the number of bits and ~5x the number of days). If the hypothesis is true, your experimental result has probability (1 - 0.96)^10, which is about 10^-14. Conclusion: the hypothesis is false.
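A quick sketch of that arithmetic in Python, taking the ~10 equivalent trials from the rough scaling above rather than an exact count:

    # If a 4 GB machine has a 96% chance of at least one bit flip per
    # three-day window, what's the probability of seeing zero flips
    # across the equivalent of ~10 such windows?
    # (8 GB for 14 days ~= 2x the bits * ~4.7x the days ~= 10 trials)

    p_flip_per_trial = 0.96      # hypothesized P(>=1 flip | 4 GB, 3 days)
    n_equivalent_trials = 10     # 2 * (14 / 3), rounded

    p_no_flips = (1 - p_flip_per_trial) ** n_equivalent_trials
    print(f"P(no flips | hypothesis) ~= {p_no_flips:.2e}")  # ~1.0e-14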
There are a lot of variables that go into RAM errors, including manufacturing quality and the condition of the RAM, the DIMM, the DIMM slot, the motherboard generally, the power supply, the wiring, and the temperature of all of those. Google was known for cost cutting in its servers, especially early on, so I wouldn't be surprised if some of that resulted in a higher bit-flip rate than you'd see in commercially available servers. Things like running bare motherboards supported only at the edges cause excess strain and can affect the resistance and capacitance of traces on the board (and in extreme cases, break the traces).
No, it doesn't. You're assuming an even distribution of errors, which is very much not the case.
Google found that the average number of errors is in that range, but they also found that only about a third of their servers saw any errors at all in a given year.
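A toy model of what that concentration does to the statistics (the numbers here are made up for illustration, not Google's actual figures): if a third of machines carry all the errors, the fleet-wide average can look high even though most individual machines are clean, so one machine seeing zero errors tells you much less than the average suggests.

    import random

    # Made-up numbers: one third of machines account for all errors
    # (~9,000/year each); the other two thirds see none.
    random.seed(0)
    n_machines = 10_000
    errors_per_machine = [
        9_000 if random.random() < 1 / 3 else 0
        for _ in range(n_machines)
    ]

    mean_errors = sum(errors_per_machine) / n_machines
    zero_fraction = errors_per_machine.count(0) / n_machines
    print(f"fleet mean errors/year:  {mean_errors:,.0f}")   # ~3,000
    print(f"machines with no errors: {zero_fraction:.0%}")  # ~67%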