- The test jig is probably pristine, so no hundreds of hours of telemetry data clogging up the internal storage.
- The test jig might be on ethernet whereas a lot of users would be using wifi.
- The test jig probably targets specific A -> B upgrades rather than testing progressive upgrade across every version that's ever existed.
- The test jig can't cover every permutation of config options.
- The test jig probably only does a bare minimal smoke test after the install, so if the problem takes a bit to kick in, it might not show up.
Not to say that it's certainly any of these, but all are possible contributors. In the coming days it'll become clearer what particular pattern the affected devices follows, and/or clever people with JTAG dongles will reverse engineer the problem and spill the beans.