> I also think people overhyped lidar because they don't understand it

Speaking as a person who understands it extremely well and who has an advanced degree in computer vision: I'm sure that internet randos did, but I promise the people who actually know the failure modes of the different modalities did not. I don't really expect you to take my word for it, but maybe this will spark an interest in investigating the failure scenarios of 3D reconstruction using cameras in computer vision. Just know that Google is an absolute top-tier juggernaut in the CV/ML/AI research world, and their use of lidar isn't born of ignorance.
> less sensor collision
This isn't a real problem for anyone doing a good job. A sensor is either good or bad for a given scenario. Feeding in more sensors just gives you gradations of accuracy instead of binary accuracy, and having gradations of accuracy is an unambiguously good thing. When you only have one sensor, you have no way to know, in the moment, whether it's feeding you an optical illusion; that's what it means for something to be an optical illusion. But when you have multiple sensors of different modalities, disagreement between them is meaningful information about which one to trust, because you can contextually characterize the failure scenarios of each.
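To make "gradations of accuracy" concrete, here's a minimal Python sketch of one standard fusion technique, inverse-variance weighting. The `fuse_depth` helper, the sensor names, and the numbers are all hypothetical; this is not a claim about how any particular company fuses its sensors, just an illustration of why a second modality turns a blind spot into a weighted disagreement.

```python
import numpy as np

def fuse_depth(estimates, variances):
    """Inverse-variance weighted fusion of per-sensor depth estimates.

    estimates: depth readings (meters) for the same point, one per modality.
    variances: context-dependent error variance for each reading
               (e.g. a camera's variance grows in glare, a lidar's in heavy rain).
    Returns the fused depth and its variance.
    """
    w = 1.0 / np.asarray(variances, dtype=float)
    fused = np.sum(w * np.asarray(estimates, dtype=float)) / np.sum(w)
    fused_var = 1.0 / np.sum(w)
    return fused, fused_var

# Camera says 42 m but is staring into glare (high variance);
# lidar says 35 m with a clean return (low variance).
depth, var = fuse_depth([42.0, 35.0], [9.0, 0.25])
print(f"fused depth ~ {depth:.1f} m, variance ~ {var:.2f}")
```

With one sensor you get a single number and no idea how wrong it is; with two you get a fused estimate plus an uncertainty that reflects which sensor is currently in its failure regime.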
> It's not magic, it performs poorly in inclination weather and can have issues with resolution over range and data processing (although lidar does do a lot of things well).
Inclement, not inclination. And I hate to be the bearer of bad news, but cameras also do poorly in inclement weather and have issues with resolution over range, and the mitigations are the same for both (superresolution, temporal blending, alternate wavelengths, stereo correspondence, etc.).
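For a sense of why resolution over range bites cameras too, here's a toy sketch of the rectified-stereo depth relation Z = f·B/d. The focal length and baseline are made-up values for illustration, not any real rig:

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Pinhole rectified-stereo relation: Z = f * B / d.

    disparity_px: horizontal pixel shift of a point between left and right images.
    focal_px:     focal length in pixels.
    baseline_m:   distance between the two cameras in meters.
    Depth error grows roughly with Z^2 / (f * B), so a fixed one-pixel
    matching error costs far more accuracy at range than up close.
    """
    d = np.asarray(disparity_px, dtype=float)
    return np.where(d > 0, focal_px * baseline_m / d, np.inf)

# Hypothetical rig: 1000 px focal length, 12 cm baseline.
print(depth_from_disparity([40.0, 4.0, 1.0], focal_px=1000.0, baseline_m=0.12))
# -> [  3.  30. 120.]  same pixel error, wildly different depth error
```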
Tesla people always say (said?) things like "Well, humans only drive with their eyes, so cars should be able to as well," but that isn't an accurate comparison between what humans have and what Teslas have. Humans have many more sensor modalities than Tesla's cameras provide. Teslas have single-view, fixed-focus cameras that, for much of the FOV, can only reconstruct structure from shape assumptions (object detection and classification) and inter-frame changes (optical flow) coupled with sensation of the vehicle's own motion. That's all they get. It's not bad at all, especially coupled with advanced machine learning, but you have more than that, coupled with even more advanced machine learning.

When you as a human drive, in addition to everything Teslas have (you have those cues too), you also have binocular stereopsis cues, lens accommodation (autofocus) and convergence cues, vehicle-independent motion parallax cues, and the ability to manipulate shade cover so you don't get blinded. Are all those extra cues necessary for every scenario? No, obviously not. Do they help? Yes. Try driving with only one eye open and without moving your body or head at all. You can absolutely do it, but you won't be as good as you are with both eyes open and free movement.
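To illustrate what "inter-frame changes coupled with sensation of the vehicle's own motion" buys a single camera, here's a toy motion-parallax sketch. It assumes a pure sideways camera translation and a static scene, both simplifications I'm introducing for illustration; the numbers are hypothetical:

```python
import numpy as np

def depth_from_parallax(flow_px, focal_px, lateral_motion_m):
    """Toy monocular depth from motion parallax.

    Assumes the camera translated sideways by lateral_motion_m between two
    frames (known from the vehicle's own motion). For that special case the
    horizontal optical flow of a static point is u = f * T / Z, so
    Z = f * T / u -- the same geometry as stereo, with the "baseline"
    supplied by the car's movement instead of a second camera.
    """
    u = np.asarray(flow_px, dtype=float)
    return np.where(np.abs(u) > 0, focal_px * lateral_motion_m / np.abs(u), np.inf)

# Hypothetical: 1000 px focal length, car slid 0.5 m sideways between frames.
print(depth_from_parallax([25.0, 5.0], focal_px=1000.0, lateral_motion_m=0.5))
# -> [ 20. 100.]  independently moving objects break the static-scene assumption
```

The same geometry is why you can still drive with one eye closed: moving your head or body supplies the baseline, which is exactly the vehicle-independent motion parallax cue mentioned above.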