Small-time players like GE routinely fail to correctly sign industrial control software; the odds of people recording video paying enough attention to get it right, and of the meme crowd bothering to check even if they did, seem vanishingly small without a lot of educational effort.
We are starting to see adoption of software supply chains with SBOMs, albeit imperfectly. We are starting to see increased adoption of things like DMARC in the email space to better authenticate the originator of an email. Both are highly imperfect systems ... but you can start kludging something together ... and if the incentives are there, I think you can build out more of a workable system.
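For a flavor of how that kludge looks on the email side, here is a minimal sketch of looking up a sender domain's published DMARC policy. It assumes the third-party dnspython package, and the domain is just an example; real verification also involves SPF/DKIM alignment checks that are omitted here:

```python
# Minimal sketch, assuming the third-party "dnspython" package: fetch the
# DMARC policy a domain publishes in DNS. Real DMARC enforcement also checks
# SPF/DKIM alignment, which is omitted here.
import dns.resolver

def get_dmarc_policy(domain: str) -> str | None:
    try:
        answers = dns.resolver.resolve(f"_dmarc.{domain}", "TXT")
    except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
        return None  # domain publishes no DMARC record
    for rdata in answers:
        txt = b"".join(rdata.strings).decode()
        if txt.startswith("v=DMARC1"):
            return txt  # e.g. "v=DMARC1; p=reject; rua=mailto:..."
    return None

print(get_dmarc_policy("example.com"))
```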
It's not the cryptography that's the problem. It's: who do you trust with the signing keys? The list inherently has to include every camera maker, despite that industry generally not having a great security culture, as well as every camera's country of origin, and every country with a security service capable of infiltrating some other country's camera maker. Which is probably all of them.
Worse, the keys have to be in the camera. Every camera. Break one of any model and you can forge images with it. Break one of any model and publish the break and you call into question every image from every camera of that type.
Then, even if a camera hasn't given up its keys, someone can use it to take a picture of a picture.
None of this requires a cryptographic break of public key signatures.
I've wanted to build a product in this space ever since I heard about deepfakes: a mix of Keybase-style identity, appropriate file hashing, and hash generation for subsets/sections of video. Maybe it needs to be a protocol, maybe a product; I'm not sure, but the need seems apparent to me.
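As a rough sketch of the "hash generation for sections of video" part, assuming a simple fixed-size segmentation and the third-party cryptography package (all names and sizes here are my own illustration, not a spec): split the file into segments, hash each, and sign the concatenated hash list so individual sections can later be checked on their own.

```python
# Hypothetical sketch: sign a video as a list of per-segment hashes so that
# sections can be verified independently. Assumes the third-party
# "cryptography" package; the segment size is an arbitrary illustrative choice.
import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

SEGMENT_SIZE = 1 << 20  # 1 MiB per segment

def segment_hashes(path: str) -> list[bytes]:
    hashes = []
    with open(path, "rb") as f:
        while chunk := f.read(SEGMENT_SIZE):
            hashes.append(hashlib.sha256(chunk).digest())
    return hashes

def sign_manifest(path: str, key: Ed25519PrivateKey) -> bytes:
    # Sign the flat "manifest" of segment hashes.
    return key.sign(b"".join(segment_hashes(path)))

key = Ed25519PrivateKey.generate()
sig = sign_manifest("clip.mp4", key)  # "clip.mp4" is a placeholder file
# A verifier holding the public key recomputes the manifest and checks:
key.public_key().verify(sig, b"".join(segment_hashes("clip.mp4")))
```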
For example, HDCP is a DRM scheme where Intel convinces (or legally requires) every manufacturer of HDMI output devices (e.g. set-top boxes, Blu-ray players) in the world to encrypt certain video streams.
Then, Intel requires manufacturers of HDMI input devices (e.g. TVs) to purchase a license key that can decrypt those video streams. This license agreement also requires the manufacturer to design their device such that the device key cannot be easily discovered and the video content cannot be easily copied.
Then, Intel gets media companies to include some extra metadata in video media like Blu-ray discs. This metadata can contain revoked device keys, so that if a TV manufacturer violates the terms of the license agreement (e.g. leaks their key, or sells a device that makes copies of video content), that manufacturer's TVs won't be able to play new content that starts including their key in the revocation list.
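The revocation mechanic itself is conceptually tiny; something like this toy sketch, where the media's metadata carries a list of revoked device key IDs that a compliant player consults (the real HDCP key exchange is far more involved, and these names are illustrative, not the actual data structures):

```python
# Toy sketch of the revocation idea: media ships a revocation list, and a
# compliant player refuses to decrypt for revoked device keys.
revoked_on_disc = {"device-key-0042", "device-key-1337"}  # from media metadata

def may_play(device_key_id: str, revocation_list: set[str]) -> bool:
    return device_key_id not in revocation_list

print(may_play("device-key-0042", revoked_on_disc))  # False: key was revoked
print(may_play("device-key-9001", revoked_on_disc))  # True
```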
Of course, Intel's HDCP master key was either leaked or reverse-engineered, so anyone can generate their own valid device keys. Intel will probably sue you if you do this, I guess.
(though if you have lots of time/effort/money you could still extract the key)
The guarantee would be that the image is either:

1) Authentic and only lightly edited with image manipulation software (e.g., cropped, color balanced, or text placed over top of the image), or

2) Produced on a phone that has had to go through hardware hacks.
Note that the guarantee in (1) wouldn't prevent someone from taking a photo of a TV screen. When I asked that original question, I had quite a few more details about how the certification might be done, how the credentials would be hosted, and how the results would be shown on a website.
Anyway, just asking this question was met with a storm of negative responses. I counted two dozen messages that were either neutral (asking for clarification) or else outright hostile before the first hesitantly positive message. My favorite hostile response was that allowing people to certify images as real would steal people's rights. I didn't follow the logic, but the guy who made the argument was really into it.
There were lots of comments about how using AI would be a better solution, some commenting on how Canon already did it (and messed up gloriously), others stating they didn't have faith in hardware... it makes a fella never want to ask a question again.
In the end, I got an expert to speculate that the technology currently exists, and has existed for 5-10 years, to do this with a modern smartphone. However, unless a high-level engineer or executive argues that providing this feature will somehow be a competitive advantage, there is no appetite to provide this kind of feature.
My guess is that it's because this would be impossible to implement without adding DRM to the smartphone and/or locking Open Source image editors out of the attestation process. You would need to prevent access to the software, firmware, etc.; otherwise the device could be virtualized or the program recompiled to circumvent the signature.
And for obvious reasons there's going to be pushback against adding that kind of DRM to smartphones. The tech does likely exist; this sounds to me like just normal attestation. It would likely hook into something like the Play Integrity API, though a lot of people already hate the Play Integrity API.
It's not the tech that's the problem. It is, as you say, that people are hesitant to do it because it would require locking down the phone's software stack in a way that many developers and user advocates understand to be anti-user, and contrary to users' rights to control their own devices and load their own software and/or firmware onto them.
I could maybe see an argument for introducing some kind of signature on the raw camera input in firmware, before it ever reaches the user at all; mostly because devs seem to have given up the fight over custom firmware on phones in general. But if you're talking about the phone signing the image after light editing like a crop has happened, at that point you're talking about moving the signature into user-space code. I'm sure the devs could have explained that problem better to you, but it's not at all surprising to me that you got a hostile response to the suggestion, because I don't see how it would be possible without locking down user-space code.
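To illustrate why the user-space step is the sticking point: a firmware signature over the raw capture bytes verifies fine, but fails the instant anything, even a crop, touches the image. A toy sketch, assuming the third-party cryptography package (everything here is illustrative):

```python
# Toy sketch: a device key signs raw capture bytes at capture time; any
# user-space edit, even a crop, invalidates the signature. Assumes the
# third-party "cryptography" package.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

device_key = Ed25519PrivateKey.generate()  # stand-in for a key burned into firmware
raw_image = b"\x00\x01\x02\x03..."         # stand-in for raw sensor data
signature = device_key.sign(raw_image)     # produced in firmware at capture

public_key = device_key.public_key()
public_key.verify(signature, raw_image)    # untouched capture verifies fine

cropped = raw_image[4:]                    # any edit after capture...
try:
    public_key.verify(signature, cropped)  # ...and the signature no longer holds
except InvalidSignature:
    print("edited image no longer verifies")
```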
Perhaps something like this is what your hostile responder was thinking of.
If you're talking about someone on Twitter or Facebook putting a photo in your feed claiming a human photographed BLM throwing bricks through a window: don't trust shit being posted on Facebook or Twitter no matter what, probably. But even there, unless the profile was hacked, you either trust the person who owns it or you don't. Nothing would prevent them from signing a forgery of reality that they legitimately forged themselves. Even with device-level keys, what are you trying to prove? You can pay actors to throw bricks through windows.
I guess the concern is that this doesn't scale as well as asking Midjourney to do it, but I wonder to what extent that is even true. With 8 billion people on the planet and counting, and a whole lot of them doing this shit, the limited input bandwidth of human sense organs means there is some maximum saturation of bullshit a person can be exposed to, and a lot of people have already hit it. Having the Internet host even more of it doesn't mean they'll grow bigger eyes and a faster brain that can actually ingest more bullshit than they already do.
The Washington Post might not trust its photographers completely either (journalists making stuff up happens[0]), so they too might want proof the photos they're getting are real.
[0]: https://www.nytimes.com/2003/05/11/us/correcting-the-record-...
If I see a photo on twitter claiming to come from the Washington Post, it might not be.
If I see a photo in my Facebook feed of a rioter, did it come from the poster, or are they just reposting something else? Did that repost come from a news source I trust, like the WP in this case, or from some Reddit post, maybe edited or synthetically generated?
> Maybe they're showing you something made by AI that isn't real, but as the owner of their own signing key, nothing would prevent them from signing an AI-generated image
That's right. This only helps narrow down the source; you still need to decide if you trust the originator. But I think a lot of the problems we've seen with social media disinformation come from the wide dispersion of content that falsely claims to come from a reputable source.
Unless you have reason to trust Bill himself, you can't trust that he actually took the photo, or that it isn't AI-generated. Although knowing that Bill isn't tech-savvy enough to do those things might be enough.
There are industry initiatives around this already such as CAI https://en.m.wikipedia.org/wiki/Content_Authenticity_Initiat...
My take is that proving authenticity might not be something we can do with any degree of accuracy in a general sense. So if that is infeasible, then we need _some_ kind of mitigation. Something like CAI allows us to make an assessment about how much trust to give an informational source, probably taking into account multiple factors (known exploits in the source device, reputation of the originator, and what claims are attached in the metadata). This might allow me to accept that a given video originated from a local TV station rather than a TikToker's edit, but I still need to assess whether that station used genAI or has been compromised or whatever else. But that seems a much narrower reputational problem, and one that will also be contextual.
It’s a direct (and open source) implementation of public key cryptography into the LLM logit distribution.
The paraphrasing model/beam search needs work - feel free to pitch in :)
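I haven't dug into this particular implementation, but for readers who want the general flavor of logit-level watermarking, here is a toy sketch of the common keyed "green list" approach (in the spirit of the Kirchenbauer et al. scheme; not necessarily what this project does): a secret key plus the previous token seeds a PRF that selects half the vocabulary, and those tokens' logits get a small boost that a key holder can later test for statistically.

```python
# Toy sketch of keyed logit biasing, NOT necessarily this project's scheme:
# a secret key and the previous token seed a PRF that selects a "green" half
# of the vocabulary, whose logits are boosted before sampling.
import hashlib
import numpy as np

SECRET_KEY = b"example-watermark-key"  # illustrative
VOCAB_SIZE = 50_000
BIAS = 2.0  # logit boost for green-listed tokens

def green_mask(prev_token: int) -> np.ndarray:
    seed = hashlib.sha256(SECRET_KEY + prev_token.to_bytes(4, "big")).digest()
    rng = np.random.default_rng(int.from_bytes(seed[:8], "big"))
    return rng.random(VOCAB_SIZE) < 0.5  # keyed half of the vocabulary

def watermark_logits(logits: np.ndarray, prev_token: int) -> np.ndarray:
    return logits + BIAS * green_mask(prev_token)

# Detection: the key holder recomputes green_mask at each position and tests
# whether green tokens occur significantly more often than half the time.
```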
Would signing content with a cryptographically consistent encoding of this field be workable?
Similarly, I suspect watermarking LLM output is probably unworkable. The output of a smart model could be de-watermarked by fine-tuning a dumb open source model on the initial output, and then regenerating the original output token by token, selecting alternate words whenever multiple completions have close probabilities and are semantically equivalent. It would be a bit tedious to perfectly dial in, but I suspect it could be done.
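A toy sketch of that attack, with a stub standing in for the fine-tuned open model (a real attempt would query actual next-token probabilities; the margin and probability table below are made up for illustration):

```python
# Toy sketch of the de-watermarking idea: walk the watermarked text token by
# token and swap in an alternative whenever the (stubbed) paraphrase model
# rates it nearly as likely as the original.
TIE_MARGIN = 0.05  # swap when the runner-up is within this probability gap

def alternatives(context: list[str], token: str) -> dict[str, float]:
    # Stub for a dumb model fine-tuned on the watermarked output.
    table = {"large": {"large": 0.41, "big": 0.39, "huge": 0.20}}
    return table.get(token, {token: 1.0})

def dewatermark(tokens: list[str]) -> list[str]:
    out: list[str] = []
    for tok in tokens:
        probs = alternatives(out, tok)
        others = {t: p for t, p in probs.items() if t != tok}
        if others:
            alt, p_alt = max(others.items(), key=lambda kv: kv[1])
            # A near-tie alternative scrambles any statistical tag.
            if probs.get(tok, 0.0) - p_alt < TIE_MARGIN:
                tok = alt
        out.append(tok)
    return out

print(dewatermark(["a", "large", "model"]))  # ['a', 'big', 'model']
```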
And then ultimately, short text selections can have a lot of meaning with very little entropy to uniquely tag (e.g., covfefe).
Curious if Scott Aaronson solved this challenge...
Current LLMs have stylistic quirks imprinted on them by RLHF (ChatGPT's endless "it should be noted" and "it is important to remember that" verbiage is a good example), but they learned those from human writing.
That will be much harder to evade, but also pretty hard to implement.
I guess we will end up in the middle ground, where any non-signed image could be AI-generated, but for most day-to-day use that's OK.
If you want something to be deemed legit (gov press release, newspaper photo, etc.), then just sign it. Very similar to what we do for web traffic (HTTPS).
If you create a digital record and then sign it, that signature is only an attestation of any claim you make, not evidence of that claim. That is the problem with relying on technology to establish trust: the moment you attach an economic benefit to a technology, you incentivize people to circumvent it, or to leverage it to commit fraud.
If AI eventually generates, say, 10k-by-10k images, I can resize to 2.001k by 1.999k or similar, and I just don't get how any subtle signal in the pixels can persist through that.
Maybe you could do something at the compositional level, but that seems restrictive to the output. Maybe something like larger regions' average color balance? But you wouldn't be able to fit many bits in there, especially when you need to avoid triggering it accidentally.
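As a toy experiment with that region-average idea (pure numpy; everything here is made up for illustration): nudge each large band's mean brightness to encode a bit, then "resize" hard via block averaging and see if the bits still read. The coarse signal survives, but only at a handful of bits per image:

```python
# Toy sketch: hide one bit per horizontal band by nudging the band's mean
# brightness, then check the bits survive a heavy downscale (block averaging
# stands in for a real resize). Detection naively compares each band to the
# global mean, so it assumes the bits are a mix of 0s and 1s.
import numpy as np

def embed(img: np.ndarray, bits: list[int], strength: float = 4.0) -> np.ndarray:
    out = img.astype(float).copy()
    band = img.shape[0] // len(bits)
    for i, b in enumerate(bits):  # one horizontal band per bit
        out[i * band:(i + 1) * band] += strength if b else -strength
    return np.clip(out, 0, 255)

def downscale(img: np.ndarray, factor: int) -> np.ndarray:
    h, w = (s // factor * factor for s in img.shape)
    return img[:h, :w].reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def extract(img: np.ndarray, n_bits: int) -> list[int]:
    band, base = img.shape[0] // n_bits, img.mean()
    return [int(img[i * band:(i + 1) * band].mean() > base) for i in range(n_bits)]

rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(1000, 1000)).astype(float)
shrunk = downscale(embed(original, [1, 0, 1, 1]), 7)  # down to ~142x142
print(extract(shrunk, 4))  # [1, 0, 1, 1]: the coarse signal survives
```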
Also: here are some play money markets for whether this will work:
https://manifold.markets/Ernie/midjourney-images-can-be-effe...
https://manifold.markets/Ernie/openai-images-have-a-useful-a...
It needs to publicly fail first to manufacture consent for full surveillance of every human interaction with any computer. Nobody would ever want that otherwise.
At the moment the internet is awash with bullshit images. It's imperative that news outlets are held to a high enough standard to actually prove their provenance.
You don't trust some bloke off Facebook asserting that something is true; it's the same for images.