The fact that (most) humans manage to drive around safely and successfully on current roads proves that the information needed exists in pixel space (not just the current image, but, say, current + history). We don't yet have stacks that can successfully extract everything needed from this information, but I don't think Dr. Karpathy ever claimed that we do.
(I am not a principal engineer but a mere PhD student who argues daily with people about how RGB information is underappreciated and underutilized.)