How much public money, time, and careers have been wasted chasing something that is already known not to work?
It seems better than current work both overall and per parameter, on both relative and absolute depth measurements.
Is there any research people are aware of that provides sub-mm level models? For 3D modeling purposes? Or is "classic" photogrammetry still the best option there?
We had a whole workshop on various monitoring technologies, and the take-home on the video tools was that having highly trained grad students and/or techs watch and analyze the video is extremely slow and expensive.
I haven't worked with video in a while now, but I wonder if any labs are doing more automated identification these days. It feels like the kind of problem that is probably completely solvable if the right tech gets applied.
I work at an industrial plant, and we have been able to measure a lot of things simply by analyzing the pixels in the video. For example, in one application we have a camera pointed down at a conveyor belt. The conveyor belt is one color, and objects on the belt are a distinctly different color.
- We just count how many pixels in a given frame are a specific color/brightness. Then you can easily work out how much of the conveyor belt has material on it in any given frame.
So if you are trying to work out which sections of a video have fish in them, you could count how many pixels differ from the normal background color.
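The pixel-counting idea above can be sketched in a few lines of NumPy. This is a toy illustration with a synthetic frame and a made-up tolerance value, not the plant's actual pipeline; on a real camera feed you'd tune the tolerance for lighting and sensor noise.

```python
import numpy as np

# Hypothetical setup: a 100x100 RGB frame where the "belt" is a uniform
# gray and an object on it is a distinctly different color.
belt_color = np.array([90, 90, 90], dtype=np.uint8)
frame = np.tile(belt_color, (100, 100, 1))
frame[20:40, 30:70] = [200, 120, 40]  # object covering part of the belt

# Count pixels whose color is far from the belt color. The tolerance is
# an assumed knob you would tune for real lighting conditions.
tolerance = 30
diff = np.abs(frame.astype(int) - belt_color.astype(int)).max(axis=2)
object_pixels = int((diff > tolerance).sum())
coverage = object_pixels / (frame.shape[0] * frame.shape[1])
print(f"{object_pixels} object pixels, {coverage:.1%} of the belt covered")
```

The same per-frame count, applied over time, is what lets you flag which sections of a video contain something other than background.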
Did they have depth maps for all 62 million images or not?
Any FSD startup that put their money on LiDAR is even more screwed now.
Computer vision has 1-2 of those three, and I don't think we are near an AGI for self driving yet. Driving is IMO, an AGI level task.
Does your dataset have a crocodile in it? Does your monocular depth model get fooled by a billboard that's just a photo?
This is actually a pretty clever example. I tried a few billboards on the online demo, and since these are regression models that effectively output the mean of the plausible depths, the model sometimes seems perplexed: it doesn't know whether to output something completely flat or something with real depth, so it outputs something in between.
How well would a monocular system cope with headlights moving toward it at night? How about in rain, snow, or fog?
I'm not saying LiDAR is the only way, but I don't see a reason to use this as a solution.
I'm not saying this isn't valuable. I used to work in the 3D/metaverse space, and having depth from a single photo, and being able to recreate a 3D scene from it, is very valuable, and is the future.