Signage with error detection and correction would be able to correct, or at least detect, the error, and it's also possible to encrypt or, rather, sign the signage. None of that is terribly feasible with human-readable signage.
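To make "correct or at least detect" concrete, here's a toy sketch of a Hamming(7,4) code, which fixes any single flipped bit in a 7-bit block. This is only an illustration: real machine-readable codes such as QR use Reed-Solomon rather than Hamming, and the function names here are made up for the example.

```python
def hamming74_encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] as 7 bits [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(code):
    """Return (data_bits, error_position); error_position is 0 if no error was seen."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s3  # syndrome is the 1-based index of the flipped bit
    if pos:
        c[pos - 1] ^= 1  # repair the single-bit error
    return [c[2], c[4], c[5], c[6]], pos


codeword = hamming74_encode([1, 0, 1, 1])
damaged = list(codeword)
damaged[4] ^= 1  # simulate a smudge flipping one bit
recovered, where = hamming74_decode(damaged)
```

After this runs, `recovered` is the original `[1, 0, 1, 1]` and `where` is 5, the position of the flipped bit. Two or more flips in one block are beyond what this small code can fix, which is why real codes use something stronger.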
A human can take a crowbar to a railroad track. A human can drop a brick from an overpass. But modifying a signed and error-corrected fiducial is gonna be pretty tough.
Because it means that modifying the sign is not useful (unlike, say, modifying a speed limit sign). Also, the fiducial can encode its orientation or position, perhaps in relation to other nearby signs; to save space, this could be a hash or checksum of the position. The vehicle would then know there's a mismatch, and could mark the sign as suspect or unreliable, if the sign was in any other spot or orientation.
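A rough sketch of that idea, under some loud assumptions: the payload layout, `SECRET_KEY`, `CELL_DEG`, and all function names are hypothetical, orientation is left out for brevity, and HMAC-SHA256 stands in for a real signature. HMAC is symmetric, so any vehicle holding the key could forge signs; an actual deployment would want a public-key signature instead.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-not-for-real-use"  # hypothetical shared key (see caveat above)
CELL_DEG = 0.0001  # grid size in degrees; roughly 11 m of latitude


def cell(lat, lon):
    """Quantize a position to a grid cell so small GPS noise does not matter."""
    return round(lat / CELL_DEG), round(lon / CELL_DEG)


def position_digest(cell_xy):
    """Short truncated hash of the grid cell, to save space on the sign."""
    return hashlib.sha256(f"{cell_xy[0]},{cell_xy[1]}".encode()).hexdigest()[:8]


def _tag(body):
    return hmac.new(SECRET_KEY, body.encode(), hashlib.sha256).hexdigest()[:16]


def make_payload(message, lat, lon):
    """Build 'message|position-digest|tag' for a sign installed at (lat, lon)."""
    body = f"{message}|{position_digest(cell(lat, lon))}"
    return f"{body}|{_tag(body)}"


def check_payload(payload, est_lat, est_lon):
    """Return 'ok', 'forged' (bad tag), or 'moved' (valid tag, wrong place)."""
    try:
        message, digest, tag = payload.rsplit("|", 2)
    except ValueError:
        return "forged"
    expected = _tag(f"{message}|{digest}")
    if not hmac.compare_digest(tag.encode(), expected.encode()):
        return "forged"
    # Accept the vehicle's own cell plus its 8 neighbours to absorb position error.
    cx, cy = cell(est_lat, est_lon)
    nearby = {position_digest((cx + dx, cy + dy))
              for dx in (-1, 0, 1) for dy in (-1, 0, 1)}
    return "ok" if digest in nearby else "moved"


sign = make_payload("Forest St. Northbound CLOSED", 42.3601, -71.0589)
```

With this, a genuine sign that has been carried a kilometre down the road still has a valid tag but comes back as "moved", and a sign whose text has been edited comes back as "forged". Either way the vehicle can treat it as suspect and fall back on its other sensors.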
There are other solutions to these. But the same problem you describe occurs if someone moves a human-readable sign, except that there's no way to checksum it.
I think fiducials are not a panacea. They are just one additional data source in what needs to be a robust sensor-fusion approach. But they make a whole bunch of stuff in machine vision a LOT easier to solve. Machine-learning approaches have the same problems but with less opportunity to address them, less robustness, and more overhead.
...something like, "Forest St. Northbound CLOSED (lat,lon) <encrypted checksum>"?
https://www.extremetech.com/extreme/306346-researchers-tape-...
Human-readable signage is potentially much easier to attack than machine-readable signage.
The physical world has a very different threat model than the Internet.