I'll also add that, in my personal experience, while some sensor combinations work best together (like IMU/GNSS), we usually used either lidar or camera, not both. Part of the reason was that combining them started requiring a lot more overhead and cross-sensor experts, and it took away from the actual problems we were trying to solve. While I suppose one could argue this is a cost issue (just hire more engineers!), I do think there's value in simplifying your tech stack whenever possible. The fewer independent parts you have, the faster you can move and the more people can become experts in one thing.
Again, Waymo's lead suggests this logic might be wrong, but I think there is a solid engineering defense for moving towards just computer vision. Cameras are by far the best sensor, and there are tangible benefits other than just cost.