There are no material channels or wireframe. It’s a volumetric 3D representation, like a picture made up of color blobs.
This method uses AI to generate unseen structure, so with relatively few images you can still represent a real scene with some level of fidelity. It will never need dynamic lights or animation, because the point is just to look as close as possible to a still image. Splats do that FAR better and more efficiently than you ever could with dynamic lighting, triangulated models, and visual effects.
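To make the "picture made up of color blobs" idea concrete, here is a toy sketch of splatting: the scene is just a set of Gaussians (position, depth, size, color, opacity) alpha-composited front-to-back onto an image. This is a simplified illustration, not the real pipeline: actual Gaussian splatting uses full anisotropic 3D covariances, view-dependent color via spherical harmonics, and a tiled GPU rasterizer, and every name here (`render_splats`, the dict keys) is made up for the example.

```python
# Toy "splat" renderer: composite fuzzy Gaussian color blobs onto an image.
# Assumed/illustrative only -- real Gaussian splatting is far more involved.
import numpy as np

def render_splats(splats, width, height):
    """splats: list of dicts with keys x, y, depth, radius, color, opacity."""
    image = np.zeros((height, width, 3))
    transmittance = np.ones((height, width))  # how much light still gets through
    ys, xs = np.mgrid[0:height, 0:width]
    # Composite front-to-back: nearer blobs occlude farther ones.
    for s in sorted(splats, key=lambda s: s["depth"]):
        # Isotropic Gaussian falloff around the splat centre.
        d2 = (xs - s["x"]) ** 2 + (ys - s["y"]) ** 2
        alpha = s["opacity"] * np.exp(-d2 / (2 * s["radius"] ** 2))
        image += (transmittance * alpha)[..., None] * np.array(s["color"], float)
        transmittance *= 1.0 - alpha
    return image

# A red blob in front of a blue blob; each dominates near its own centre.
splats = [
    {"x": 8, "y": 8, "depth": 1.0, "radius": 3.0, "color": (1, 0, 0), "opacity": 0.9},
    {"x": 12, "y": 8, "depth": 2.0, "radius": 3.0, "color": (0, 0, 1), "opacity": 0.9},
]
img = render_splats(splats, 16, 16)
```

Training a splat scene amounts to optimizing those blob parameters until renders like this match the input photos, which is why no materials, meshes, or lights ever enter the picture.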
Plus, scaling dynamic lighting up has always been the Big Bad of computer graphics, and precomputation will always give us an amazing heuristic to use against it. Everything else basically tends towards not mattering: we can only absorb a finite number of details, but we live in a world with virtually infinite lights.
I bet in a couple of years it'll be standard for estate agents to show 3D views like this on their websites, for architects to turn quick paintovers of existing sites into 3D models, for Street View to improve, and so on. Anywhere you want a quick 3D view of a space based on a few photos taken on a smartphone, and where 100% accuracy isn't important.
For things like games, it still follows the existing photogrammetry workflow (with all of its problems), but it might reduce the number of photos needed to create a point cloud.