Any code change that touched the UI, even just some CSS, would cause a screenshot comparison failure on certain steps in the test. If the change was what we expected, we overwrote the old screenshot with the new one.
It isn't exactly the same as the TDD process because sometimes you write the code first (e.g. CSS), eyeball it, and then rebuild the screenshot if it looks correct.
I'd say it's close enough though.
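A minimal sketch of that compare-or-overwrite loop, assuming the screenshot is available as raw bytes. The names `check_screenshot` and `mismatch_ratio`, and the `update` and `tolerance` parameters, are made up for illustration and don't come from any particular tool. Real tools compare decoded pixels rather than raw bytes.

```python
from pathlib import Path


def mismatch_ratio(a: bytes, b: bytes) -> float:
    """Fraction of byte positions that differ; 1.0 if the sizes differ."""
    if len(a) != len(b):
        return 1.0
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def check_screenshot(
    actual: bytes,
    baseline_path: Path,
    update: bool = False,
    tolerance: float = 0.0,
) -> bool:
    """Compare a fresh screenshot against the stored baseline.

    With update=True (or when no baseline exists yet, an assumption of
    this sketch), the fresh screenshot is written as the new baseline.
    """
    if update or not baseline_path.exists():
        baseline_path.write_bytes(actual)
        return True
    return mismatch_ratio(actual, baseline_path.read_bytes()) <= tolerance
```

The test step fails when `check_screenshot` returns `False`. If the new rendering is what you wanted, rerun with `update=True` to overwrite the baseline. A non-zero `tolerance` is one way to absorb small rendering noise.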
I won't pretend it worked perfectly. The screenshot comparison algorithms were sometimes flaky, UIs can be surprisingly non-deterministic, and you need a process to pull the database and update screenshots accordingly. However, it's the approach I'd prefer to take in the future (I haven't done UIs for about 3 years now).
I also wasn't religious about covering every single scenario with tests, but I could have been. The company moved fast and sometimes they just wanted quick, not reliable.
Or perhaps you've worked with some real TDD zealots. That doesn't sound like fun.
The folks I've worked with use these as guiding recommendations, not binding dictates.
For some of the UI stuff you mentioned elsewhere, I've seen a stronger focus on testing not just business logic but view logic as well (where possible), though generally not to the degree of testing the rendered output of the UI. Maybe that's a thing somewhere, but I haven't personally seen it.
But yeah, if you don't have one of those tools, or it's super unreliable, or it's only available in a language you can't use, then you can't do this.
I don't really consider this to be bending the rules of TDD. It's more like next-gen TDD IMO.