EDIT: Looks like a full list of controls is in the readme: https://github.com/antimatter15/splat#controls
The original idea was to be able to navigate around with just the arrow keys (conceptually, turning yourself around in place and walking forward and back).
If you integrate this with ThreeJS you'd have a lot of control options for free!
Whilst you're here, I have a question for you: it seems like you don't render real Gaussians (I see sharp edges in many cases). Is this a bug on my side, or an optimization made to run fast? I created an issue if you prefer to discuss there: https://github.com/antimatter15/splat/issues/2
Like all the other implementations I have seen so far, this one makes the same mistake when projecting the ellipsoids in a perspective: first the covariance is calculated in 3D and then projected to 2D [1]. This approach only works for parallel / orthographic projections; applying it to perspective projections leads to incorrect results, because a perspective projection has three additional effects:
- Parallax movement (the view plane moving parallel to the ellipsoid) changes the shape of the projected ellipse. E.g. a sphere only appears circular at the center of the view; as it moves toward the edges it is stretched into an ellipse. I believe this effect is manually counterbalanced by this matrix [2].
- Rotating an ellipsoid can change the position it appears at; in other words, it creates additional translation. This effect is zero if the ellipsoid has one of its three axes pointing straight at the view (parallel to the normal of the view plane). But if it is rotated 45°, the tip of the ellipsoid closer to the view plane appears larger in perspective while the far end appears smaller. Together, this slightly shifts the apparent center away from the projected center of the ellipsoid.
- Conic sections can be not only ellipses but also parabolas and hyperbolas. This is an edge case, however, that only occurs when the ellipsoid intersects the view plane, and it can probably be ignored since such ellipsoids would be clipped away anyway.
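The first effect is easy to verify numerically. A quick sketch of my own (not code from any of the implementations): sample points on an off-axis sphere, perspective-divide them onto the image plane, and compare the projected extents in the radial and tangential directions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Unit sphere centred off-axis at (3, 0, 5), camera at the origin,
# image plane at z = 1 (simple pinhole model).
v = rng.normal(size=(100_000, 3))
pts = v / np.linalg.norm(v, axis=1, keepdims=True) + np.array([3.0, 0.0, 5.0])

u = pts[:, 0] / pts[:, 2]  # perspective divide
w = pts[:, 1] / pts[:, 2]

width_u = u.max() - u.min()  # radial direction (towards the image centre)
width_w = w.max() - w.min()  # tangential direction

# The off-axis sphere projects to an ellipse stretched radially,
# so width_u comes out noticeably larger than width_w.
```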
None of the implementations I have seen so far account for the last two effects. What would be correct instead? Do not calculate the 3D covariance. Instead, calculate the bounding cone around the ellipsoid with its vertex at the camera position (the perspective origin), then intersect that cone with the view plane; the resulting conic section is guaranteed to be the correct contour of the perspective projection of the ellipsoid.
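That construction has a compact form in homogeneous coordinates (a sketch of my own, assuming a pinhole camera at the origin; `ellipsoid_outline` is a hypothetical helper, not part of any splat implementation): represent the ellipsoid as a 4x4 quadric Q; the dual of its silhouette conic is P Q⁻¹ Pᵀ, which is exact under perspective:

```python
import numpy as np

def ellipsoid_outline(Q, P):
    """Exact conic outline of the quadric Q seen through pinhole matrix P."""
    C_dual = P @ np.linalg.inv(Q) @ P.T  # dual conic of the silhouette cone
    return np.linalg.inv(C_dual)         # outline points u satisfy u^T C u = 0

# Unit sphere centred at (0, 0, 5): x^2 + y^2 + (z - 5)^2 - 1 = 0
Q = np.array([[1.0, 0.0, 0.0,  0.0],
              [0.0, 1.0, 0.0,  0.0],
              [0.0, 0.0, 1.0, -5.0],
              [0.0, 0.0, -5.0, 24.0]])
P = np.hstack([np.eye(3), np.zeros((3, 1))])  # camera at origin, plane z = 1

C = ellipsoid_outline(Q, P)
C = C / -C[2, 2]             # normalise: outline is C[0,0]*x^2 + C[1,1]*y^2 = 1
r = 1.0 / np.sqrt(C[0, 0])   # an on-axis sphere projects to a circle
# Tangent-cone geometry predicts r = tan(asin(1/5)) = 1/sqrt(24).
```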
[0]: https://github.com/graphdeco-inria/gaussian-splatting [1]: https://github.com/antimatter15/splat/blob/3695c57e8828fedc2... [2]: https://github.com/antimatter15/splat/blob/3695c57e8828fedc2...
[0] https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/3d_...
Or, viewed alternatively, they're approximating the projection by assuming the entire Gaussian sits at a fixed depth, which I suppose works if it is far enough away.
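The fixed-depth approximation above can be written down directly (my own sketch of the EWA-style linearisation, not the repo's actual code): the perspective map is replaced by its Jacobian at the Gaussian's mean, so the whole splat is treated as if it sat at one depth:

```python
import numpy as np

def project_cov_linearised(mean, cov3d):
    """2D image-space covariance via the Jacobian of (x, y, z) -> (x/z, y/z)."""
    x, y, z = mean
    J = np.array([[1 / z, 0.0, -x / z**2],   # d(x/z) / d(x, y, z)
                  [0.0, 1 / z, -y / z**2]])  # d(y/z) / d(x, y, z)
    return J @ cov3d @ J.T

# Isotropic Gaussian on the optical axis at depth 5: the linearisation
# just scales the covariance by 1/z^2, i.e. I/25 in the image plane.
cov2d = project_cov_linearised(np.array([0.0, 0.0, 5.0]), np.eye(3))
```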
A projective transformation of a Gaussian seems somewhat annoying, though I assume someone has done it before. It should be possible with projective coordinates, but the final projection to Cartesian coordinates is tricky.
For what it's worth, projecting the contour is also wrong: the whole density changes, which in turn affects the contours.
Basically it's a semi-dense point cloud [1], but instead of a point there is a blob that has been coloured, angled, and scaled to match the input pictures. This means they are optimised to be viewed from a certain distance.
Think of it like a 3D vector drawing: if you zoom in too much, or pull one part away, it all starts to look a bit funky.
[1]https://www.researchgate.net/publication/326621750/figure/fi...
The industry seems to be moving away from this with things like PBR (physically based rendering) and ray / path tracing, which enable far better dynamic lighting.
Also, they are extremely space-inefficient at the moment. A scene that would take a good traditional rendering engine a few dozen GB would instead take TBs. That might improve in the future with more optimization, though.
One exception to the above, where gaussian splatting might be interesting to see is procedural / generated content (possibly even animated). Especially for volumetric effects which currently use particle systems, like smoke, fire, clouds, flowing water, etc.
It's been around for ages, but it was never used because if you have a million points in a point cloud, you'd need to artistically manipulate a million points.
It's like 3D hair: it's pretty simple in principle, just render a billion hairs, but in practice it's hard to make it look good.
Here we tell a machine learning model to adjust the angle, colour, shape, and size of a million primitives (i.e. squares, circles, triangles, etc.) so that the result looks like the photos we provide.
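As a toy illustration of that fitting loop (a 1D sketch of my own, nothing like the actual 3DGS training code): gradient-descend the mean, width, and amplitude of a single Gaussian "splat" until it reproduces a target profile standing in for the photos:

```python
import numpy as np

x = np.linspace(-3, 3, 200)
# The "photo" we want to match: a Gaussian bump with known parameters.
target = 0.8 * np.exp(-0.5 * ((x - 0.7) / 0.4) ** 2)

mu, s, a = 0.0, 1.0, 1.0  # initial splat: wrong position, width, amplitude
lr = 0.05
for _ in range(3000):
    g = np.exp(-0.5 * ((x - mu) / s) ** 2)
    r = a * g - target  # residual against the target image
    # Analytic gradients of the mean-squared-error loss w.r.t. each parameter.
    d_a = 2 * np.mean(r * g)
    d_mu = 2 * np.mean(r * a * g * (x - mu) / s**2)
    d_s = 2 * np.mean(r * a * g * (x - mu) ** 2 / s**3)
    a, mu, s = a - lr * d_a, mu - lr * d_mu, s - lr * d_s
# After training, (mu, s, a) have drifted to roughly (0.7, 0.4, 0.8).
```

The real thing does the same kind of gradient descent, only over millions of anisotropic 3D Gaussians rendered through a differentiable rasterizer.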
See the reflections here: https://www.youtube.com/watch?v=mD0oBE9LJTQ
This is also pretty good, but more subtle: https://www.youtube.com/watch?v=tJTbEoxxj0U
Lots of artefacts though, especially if I move the camera.
If you make it work within ThreeJS, you're going to leave a trace in the history of 3D on the web with that stuff!
late 90s bedroom me is shaking his head.