> There are trade offs because while you gain different views of the environment from different sensors, the fusion becomes more complicated and has to be sorted out in software reliably in real-time.
If you're against having multiple sensors, though, the rational conclusion would be to have just one sensor. But Tesla would be the first to tell you that one of the advantages their cars have over human drivers is that they already have multiple cameras looking at the scene.
You already have a sensor fusion problem. Certainly, more sensors add some complexity to the problem. However, if one sensor is uncertain about what it is seeing, having other sensors, particularly ones with different modalities that might not have problems in the same circumstances, makes it a lot easier to reliably get to a good answer in real time. Sure, in unique circumstances you could have increased confusion, but you're far more likely to have increased clarity.
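
To put a rough number on the "increased clarity" point, here is a minimal sketch of inverse-variance weighting, one textbook way to combine independent measurements. It is not how any particular car does fusion. The `fuse` helper, the sensor names, and all the numbers are made up for illustration, and it assumes each sensor's error is unbiased and independent of the others.

```python
def fuse(estimates):
    """Combine independent (value, variance) estimates of the same quantity.

    Each estimate is weighted by 1 / variance, so a confident sensor
    counts for more than an uncertain one.
    """
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    # The fused variance is never larger than the best single sensor's.
    return value, 1.0 / total


# Hypothetical readings: distance to an obstacle in metres, with variance.
camera = (52.0, 16.0)  # camera is unsure, e.g. in glare
radar = (50.0, 1.0)    # radar is unaffected by glare

fused_value, fused_var = fuse([camera, radar])
print(f"fused distance: {fused_value:.2f} m, variance: {fused_var:.2f}")
```

The fused variance comes out below that of either sensor alone, which is the whole argument in one line. The catch is the independence assumption: if both sensors fail in the same circumstance, the extra sensor buys you much less, which is why different modalities matter.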