> I think if the bar is to consider it not a replacement for knowledge work as long as there is a human in the loop.
That's where I put it personally, because of humans' limited amount of useful focus during a work day.
Anything that requires human attention will take some of that resource, and I don't think models' rate of improvement will be fast enough to overcome that in the near future. Reviewing an output takes about the same amount of time whether it is 99%, 99.9%, or 99.99% correct, so the output needs to be correct enough not to need review before any knowledge work is replaced.