0 points · meheleventyone · 1y ago · 0 comments

It's telling IMO that they only want people's opinions based on our notoriously faulty memories rather than sitting comparable situations next to one another in the game and simulation then analyzing them. Several things jump out watching the example video.

3 comments · 1 top-level

GaggiX · 1y ago · 2 in thread

>rather than sitting comparable situations next to one another in the game and simulation then analyzing them.

That's literally how the human rating was set up, if you read the paper.

meheleventyone · OP · 1y ago

I think you misunderstand me. I don't mean a snap evaluation, deciding between two very short competing videos, which is what the participants were doing. I mean doing an actual analysis of how well the simulation matches the ground truth of the game.

What I'd posit is that it's not actually a very good replication of the game, but is very good at replicating short clips that almost look like the game, and that the short time horizons are deliberately chosen because the authors know the model lacks coherence beyond that.

GaggiX · 1y ago

>I mean doing an actual analysis of how well the simulation matches the ground truth of the game.

Do you mean the PSNR and LPIPS metrics used in the paper?
