Putting aside any actually truthful captions, how do I know that "image of X" is actually an image of X?
Reading some of the Bellingcat investigations, and seeing how much time goes into each one, doesn't bode well.
I guess you could go the TinEye route and index/hash the entire web's worth of rich media, then spot discrepancies (listed as X here, but Y there), but that seems horrendous in compute/bandwidth/storage terms.
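To make the "listed as X here, but Y there" idea concrete, here is a rough sketch. It is not how TinEye works (I don't know their internals); it is a toy average hash over already-decoded grayscale pixel grids, with a linear scan for near-duplicates. The names `average_hash`, `CaptionIndex`, and `max_distance` are made up for illustration, and the sample pixel values and site names are invented.

```python
def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


class CaptionIndex:
    """Hypothetical index that remembers (hash, caption, source) for every image seen."""

    def __init__(self, max_distance=2):
        # Assumption: hashes within this many bits count as the same image.
        self.max_distance = max_distance
        self.entries = []

    def add(self, pixels, caption, source):
        """Store the image and return earlier near-duplicates that carry a different caption."""
        h = average_hash(pixels)
        conflicts = [
            (old_caption, old_source)
            for old_hash, old_caption, old_source in self.entries
            if hamming(h, old_hash) <= self.max_distance and old_caption != caption
        ]
        self.entries.append((h, caption, source))
        return conflicts


# Invented 4x4 grayscale "image" and a slightly brightened copy of it.
original = [
    [10, 200, 30, 220],
    [15, 210, 25, 230],
    [12, 190, 35, 215],
    [11, 205, 28, 225],
]
brightened = [[p + 3 for p in row] for row in original]
inverted = [[255 - p for p in row] for row in original]  # visually unrelated

index = CaptionIndex()
first = index.add(original, "riot in city A", "site-one.example")
second = index.add(brightened, "riot in city B", "site-two.example")
third = index.add(inverted, "flood in city C", "site-three.example")
```

Here `second` comes back non-empty because the same picture shows up under a different caption, while `first` and `third` are empty. The linear scan is exactly the part that gets horrendous at web scale; a real system would need a proper nearest-neighbour index plus the crawling and decoding this sketch skips.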
Yes, but the usefulness of being able to automate that identification in near-real-time to debunk the firehose of falsehoods we get from everywhere would be astronomical.
Anyone reading would have a huge edge in both being more accurately grounded in reality and being able to identify the biggest/hottest disinformation streams.
No, it's not. This is done because stories with images perform better, and obtaining photos (& licenses) for every event is not always possible.
If I read a story about a riot and the included picture is from a different but similar urban disaster scene that shows buildings on fire and windows broken, I come away from the article expecting the actual scene to include fire damage and broken glass -- but that isn't necessarily the case.
This happened constantly with the reporting around the BLM social unrest.
Articles sell better with additional sensationalism, but when the narrative being espoused doesn't conform to reality, it is a distortion, regardless of the motivating factor.
Indeed, a major incentive towards inaccuracy in journalism is the pursuit of impact.