Have any of your places used a service such as Saucelabs or Browserstack or rolled their own similar inhouse, or seen such as https://percy.io/how-it-works (random example; not affiliated or recommending this)?
I am hope I was not being too rude about it, not my intent, mostly surprising to me because a service like Browserstack is a decade and a half old already and the concept predates that.
My first software job out of college was actually a QA Automation / SDET position, wrote an automated framework with Ruby + Selenium + Browserstack which did take screenshots of the app, but the app loaded dynamic content and there were frequent feature adjustments so no two screenshots were ever identical.
All other jobs I've had since then have been writing smart contracts for Ethereum apps - 100% backend, (I hate having to deal with frontend) so all our tests were just units & coverage & what have you.
I suppose if your environment holds constant and your features don't change frontend structure or behavior (eg refactors), then this is what you should expect.
Though, do note that this only works because my app is based on a tick/game-loop system without callbacks; if this was the standard game-development pattern of callbacks & message handling (especially w/ React / JS) to invoke events, it wouldn't work, because the timing would be slightly different each time, and an enemy would be a few pixels to the left/right of its position in the past run.
If you want to go further down this direction there are all kinds of cool things you can do. There are ways to like XOR bitmaps so pixels which aren't identical show up as white and the rest are black, and the like; if you're working with something else you can look into perceptual hashing although that's a lot more computationally expensive.
Oh! And edge detection! Canny edge detectors are cheap and deterministic and wonderful for all manner of this storm.