With that in mind, does this scheme offer any advantage over the much simpler setup of a user sending an inference request:
- directly to an inference provider (no API router middleman)
- that accepts anonymous crypto payments (I believe such things exist)
- using a VPN to mask their IP?
I'm not sure I understand what you mean by inference provider here. Once decrypted, the inference workload is not shipped off the compute node to, e.g., OpenAI; it runs directly on the compute machine, on open source models loaded there. Those machines cryptographically attest to the software they are running, which ultimately proves that no software is logging sensitive info off the machine and that the machine is locked down, with no SSH access.
This is how Apple's PCC does it as well: clients of the system will not even send requests to compute nodes that aren't making these promises, and you can audit the code running on those compute machines to check that they aren't doing anything nefarious.
The privacy guarantee we are making here is that no one, not even people operating the inference hardware, can see your prompts.
You need to be careful with these claims IMO. I am not involved directly in CoCo so my understanding lacks nuance but after https://tee.fail I came to understand that basically there's no HW that actually considers physical attacks in scope for their threat model?
The Ars Technica coverage of that publication has some pretty yikes contrasts between quotes from people making claims like yours, and the actual reality of the hardware features.
https://arstechnica.com/security/2025/10/new-physical-attack...
My current understanding of the guarantees here is:
- even if you completely pwn the inference operator, steal all root keys etc, you can't steal their customers' data as a remote attacker
- as a small cabal of arbitrarily privileged employees of the operator, you can't steal the customers' data without a very high risk of getting caught
- BUT, if the operator systematically conspires to steal the customers' data, they can. If the state wants the data and is willing to spend money on getting it, it's theirs.
This is actually part of why we think it's so important to have the non-targetability part of the security stack as well, so that even if someone were to physically compromise some machines at a cloud provider, there would be no way for them to reliably route a target's requests to those machines.
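To put rough numbers on the non-targetability point, here is a minimal sketch. It assumes the client's compute nodes are chosen uniformly at random without replacement, which is my assumption for illustration and not something stated in this thread; the function name and parameters are hypothetical.

```python
from math import comb


def p_hits_compromised(total: int, compromised: int, picked: int) -> float:
    """Chance that at least one of `picked` nodes is compromised.

    Assumes nodes are drawn uniformly at random without replacement
    from `total` nodes, of which `compromised` are attacker-controlled.
    This selection model is an assumption, not the OpenPCC spec.
    """
    if not (0 <= compromised <= total and 0 <= picked <= total):
        raise ValueError("need 0 <= compromised, picked <= total")
    # Complement of "every picked node is clean".
    all_clean = comb(total - compromised, picked) / comb(total, picked)
    return 1.0 - all_clean
```

Under that model, an attacker who physically compromises a handful of machines in a large fleet sees any given request only with small probability, and cannot raise that probability for one specific target.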
Xbox, PlayStation, and some smartphone activation locks.
Of course, you may note those products have certain things in common...
https://www.nvidia.com/en-us/data-center/solutions/confident...
https://developer.nvidia.com/blog/protecting-sensitive-data-...
that cannot be met, period. Your assumptions around physical protections are invalid or at least incorrect. It works for Apple (well enough) because of the high trust we place in their own physical controls, and their market incentive to protect that at all costs.
> This is how Apple's PCC does it as well [...] and you can audit the code running on those compute machines to check that they aren't doing anything nefarious.
Just based on my recollection (and I'm not going to take a fresh look to validate what I'm saying here): with PCC, no, you can't actually do that. With PCC you do get an attestation, but there isn't actually a "confidential compute" aspect where that attestation (that you can trust) proves that is what is running. You have to trust Apple at that lowest layer of the "attestation trust chain".
I feel like with your bold misunderstandings you are really believing your own hype. Apple can do that, sure, but a new challenger cannot. And I mean your web page doesn't even have an "about us" section.
From a brief glance at the white paper it looks like they are using TEE, which would mean that the root of trust is the hardware chip vendor (e.g. Intel). Then, it is possible for confidentiality guarantees to work if you can trust the vendor of the software that is running. That's the whole purpose of TEE.
We don't _quite_ have the funding to build out our own custom OS to match that level of attestation, so we settled for attesting to a hash of every file on the booted VM instead.
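For readers wondering what "a hash of every file" could look like in practice, here is a minimal sketch. It is not the project's actual measurement code; the function names and the way per-file digests are folded into one value are assumptions for illustration.

```python
import hashlib
import os


def hash_file(path: str) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_manifest(root: str) -> dict[str, str]:
    """Map each file under root (by relative path) to its digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = hash_file(full)
    return manifest


def manifest_digest(manifest: dict[str, str]) -> str:
    """Fold the manifest into one value, visiting paths in sorted order."""
    digest = hashlib.sha256()
    for rel in sorted(manifest):
        digest.update(rel.encode() + b"\0" + manifest[rel].encode() + b"\n")
    return digest.hexdigest()
```

Changing, adding, or removing any file changes the final digest, so a single value can stand in for the whole filesystem when it is compared against a published release.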
Despite recent news of vulnerabilities, I do think that hardware-root-of-trust will eventually be a great tool for verifiable security.
A couple follow-up questions:
1. For the ComputeNode to be verifiable by the client, does this require that the operator makes all source code running on the machine publicly available?
2. After a client validates a ComputeNode's attestation bundle and sends an encrypted prompt, is the client guaranteed that only the ComputeNode running in its attested state can decrypt the prompt? Section 2.5.5 of the whitepaper mentions expiring old attestation bundles, so I wonder if this is to protect against a malicious operator presenting an attestation bundle that doesn't match what's actually running on the ComputeNode.
1. The mechanics of the protocol are that a client will check that the software attested to has been released on a transparency log. dm-verity is what enforces that the hashes of the booted filesystem on the compute node match what was built and so those hashes are what are put on the transparency log, with a link to the deployed image that matches them. The point of the transparency log is that anyone could then go inspect the code related to that release to confirm that it isn't maliciously logging. So if you don't publish the code for your compute nodes then the fact of it being on the log isn't really useful.
So I think the answer is yes, to be compliant with OpenPCC you would need to publish the code for your compute nodes, though the client can't actually technically check that for you.
2. Absolutely yes. The client encrypts its prompt to a public key specific to a single compute node (well, technically it will encrypt the prompt N times for N specific compute nodes), where the private half of that key is only resident in the vTPM; the machine itself has no access to it. If the machine were swapped or rebooted for another one, it would be impossible for that computer to decrypt the prompt. The fact that the private key is in the vTPM is part of the attestation bundle, so you can't fake it.
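The client-side checks described in these two answers can be sketched roughly as follows. This is a simplified illustration, not the OpenPCC client: `AttestationBundle`, its fields, `acceptable_nodes`, and the one-hour default are all hypothetical, and the sketch deliberately omits verifying the bundle's signature chain back to the hardware root of trust, which a real client must do first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttestationBundle:
    node_id: str       # hypothetical identifier for a compute node
    measurement: str   # hex digest the node attests to (hypothetical field)
    issued_at: float   # Unix time at which the bundle was produced


def acceptable_nodes(
    bundles: list[AttestationBundle],
    released: set[str],
    now: float,
    max_age_s: float = 3600.0,
) -> list[str]:
    """Return ids of nodes whose bundle is fresh and matches a logged release.

    `released` stands in for the set of measurements found on the
    transparency log; signature verification is assumed done elsewhere.
    """
    ok = []
    for bundle in bundles:
        fresh = 0 <= now - bundle.issued_at <= max_age_s
        if fresh and bundle.measurement in released:
            ok.append(bundle.node_id)
    return ok
```

Only nodes returned by a check like this would then be sent a copy of the prompt, each copy encrypted to that node's own attested public key.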
Folks may underestimate the difficulty of providing compute that the provider “cannot”* access to reveal even at gunpoint.
BYOK does cover most of it, but oh look, you brought me and my code your key, thanks… Apple's approach, and certain other systems such as AWS's Nitro Enclaves, aim at this last step of the problem:
- https://security.apple.com/documentation/private-cloud-compu...
- https://aws.amazon.com/confidential-computing/
NCC Group verified AWS's approach and found:
1. There is no mechanism for a cloud service provider employee to log in to the underlying host.
2. No administrative API can access customer content on the underlying host.
3. There is no mechanism for a cloud service provider employee to access customer content stored on instance storage and encrypted EBS volumes.
4. There is no mechanism for a cloud service provider employee to access encrypted data transmitted over the network.
5. Access to administrative APIs always requires authentication and authorization.
6. Access to administrative APIs is always logged.
7. Hosts can only run tested and signed software that is deployed by an authenticated and authorized deployment service. No cloud service provider employee can deploy code directly onto hosts.
- https://aws.amazon.com/blogs/compute/aws-nitro-system-gets-i...
Points 1 and 2 are more unusual than 3 - 7.
Folks who enjoy taking things apart to understand them can hack at Apple's here:
https://security.apple.com/blog/pcc-security-research/
* Except by, say, withdrawing the system (see Apple in UK) so users have to use something less secure, observably changing the system, or other transparency trippers.
Are you telling me customer services can't reset a customer's forgotten console login password?
I.e.: If the security/privacy guarantees really are as advertised, then ipso facto someone could store child porn in the system and the provider couldn't detect this.
Then by extension, any truly private system is exposing themselves to significant business, legal, and moral risk of being tarred and feathered along with the pedos that used their system.
It's a real issue, and has come up regularly with blockchain-based data storage. If you make it "censorship proof", then by definition you can't scrub it of illegal data!
Similarly, if cloud providers allow truly private data hosting, then they're exposing themselves to the risk of hosting data that is being stored with that level of privacy guarantees precisely because it is so very, very illegal.
(Or substitute: Stolen state secrets that will have the government come down on you like a ton of bricks. Stolen intellectual property. Blackmail information on humourless billionaires. Illegal gambling sites. Nuclear weapons designs. So on, and so forth.)
But both of these services exist, and have existed for hundreds of years, and don’t require service providers to go snooping through their customers’ possessions or communications.
But what they would be storing in this case is not illegal content. Straight up. Encrypted bits without a key are meaningless.
There is nothing stopping a criminal from uploading illegal content to Google drive as an encrypted blob. There's nothing Google can do about it, and there is no legal repercussion (to my knowledge) of holding such a blob.
Instead, iCloud, Google Drive, and similar all rely on being able to hash content post-upload for exactly that reason.
You might not know what change was made, or have any prior warning of the change. But you will be able to detect it happening. Which means an operator only gets to play that card once, after which nobody will trust them again.
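A toy illustration of why a change is detectable even when its content isn't known in advance: an observer who remembers the head of an append-only log can tell whether history was later rewritten. This is a simplified hash chain of my own, not the actual log format; real transparency logs typically use Merkle trees, and the function names here are hypothetical.

```python
import hashlib


def chain_head(entries: list[bytes]) -> str:
    """Hash-chain the entries in order and return the final head as hex."""
    head = b"\x00" * 32
    for entry in entries:
        head = hashlib.sha256(head + hashlib.sha256(entry).digest()).digest()
    return head.hex()


def consistent(seen_count: int, seen_head: str, log: list[bytes]) -> bool:
    """True if `log` still begins with the history the observer saw earlier."""
    if len(log) < seen_count:
        return False  # entries were dropped
    return chain_head(log[:seen_count]) == seen_head
```

An operator can append a new release, and everyone will see it; what they can't do is quietly swap or remove an earlier one without anyone who kept an old head noticing.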
It's even harder to do this plus the hard requirement of giving the NSA access.
Or alternatively, give the user a verifiable guarantee that nobody has access.
With the caveat that it's not clear what precisely is illegal about these payments and to what level it's illegal. It might be that a business isn't allowed to have any at all, or isn't allowed to use them for business, or can use them for business but can't exchange them for normal currency, or can do all that but has to check their customer's passport and fill out reams of paperwork.
https://bitcoinblog.de/2025/05/05/eu-to-ban-trading-of-priva...
Service: https://www.privatemode.ai/ Code: https://github.com/edgelesssys/privatemode-public
It's better than nothing, I guess...
But if you placed the server at the NSA, and said "there is something on here that you really want, it's currently powered on and connected to the network, and the user is accessing it via ssh", it seems relatively straightforward for them to intercept and access.
2) These attacks are actually worse than what I am pretty sure you are assuming (and so where I started my response), as you actually just need one hacked server and then you can simulate working servers on other hardware that isn't hacked by either stealing an attested key or stealing the attestation key itself. You often wouldn't even then need to have the hacked server anymore.
We are working on a challenge that is somewhat like a homomorphic encryption problem, and I'm wondering if OpenPCC could help in some way:
When developing websites/apps, developers generally use logs to debug production issues. However, with wearables, logs can be a privacy issue: imagine some AR glasses logging visual data (like someone's face). Would OpenPCC help to extract/clean/anonymize this sort of data for developers to help with their debugging?
If it's possible to anonymize on the wearable, that would be simpler.
The challenge is what does the anonymizer "do" to be perfect?
As an aside, IMO homomorphic encryption (still) isn't ready...
[1] https://techcommunity.microsoft.com/blog/azureconfidentialco...
Edit: reminds me of federated learning and FlowerLLM (training only AFAIR, not inference). Like... yes, nice, I ALWAYS applaud any way to disentangle from proprietary software and walled gardens... but what for? What actual usage?
Edit on that too : makes me think of OpenAI Whisper as a service via /e/OS and supposedly anonymous proxying (by mixing), namely running STT remotely. That would be an actual potential usage... but IMHO that's low end enough to be run locally. So I'm still looking for an application here.
- useful thing (according to someone's specific requirements; maybe hallucinations are OK, maybe not) that
- needs privacy (for example, generating code that will be open source probably does not need that)
- can't be run locally
- can be trusted to actually process data the way it says it does
> Gimme an actual example instead of downvoting, help me learn.
Basically you asked a bunch of people on a privacy minded forum, why should they be allowed to encrypt their data? What are you (they) hiding!? Are you a spammer???
Apple is beloved for their stance on privacy, and you basically called everyone who thinks that's more than marketing, a spammer. And before you start arguing no you didn't, it doesn't matter that you didn't, what matters is that that's how your comment made people feel. You can say they're the stupid ones because that's not what you wrote, but if you're genuinely asking for feedback about the downvotes, there you are.
You seriously can't imagine any reason to want to use an LLM privately other than to use it to write spam bots and to spam people? At the very least expand your scope past spamming to, like, also using it to write ransomware.
The proprietary models that can't be run locally are SOTA and local models, even if they can come close, simply aren't what people want.
I specifically like the privacy aspect (even though, honestly, I think most people in this forum claim they do and yet rely on BigTech, with which they share their data, so IMHO most people on HN are not as demanding as you describe). The question is precisely about what to run.
FWIW I specifically keep a page on self hosting AI (you can check if you want to see if it's real at https://fabien.benetou.fr/Content/SelfHostingArtificialIntel... ) so again, the privacy aspect is crucial to me.
The question is ... what to actually run. What can't be run locally that would be useful. What model for which tasks. For example coding (which I don't think models are good enough for) typically would NOT need this because one wouldn't share any PII over there, hopefully would even instead publish the resulting code as open source.
So my provocation about spammers is... because they often ARE actual users of LLMs. They use LLMs for their language capabilities, namely to craft a message that is always slightly different yet conveys (roughly) the same meaning (the scam) to avoid detection. A random person using an LLM, though, might NOT be OK with hallucinations when they use it for their own private journal or chat.
So... what for?
Edit: I did use Apple for years, as recommended by someone at Mozilla, and I moved away from them earlier this year precisely because even though they are better than others, e.g. Google, IMHO it's just not good enough for me. No intermediary is better than one with closed source.
I realize this is just bad branding by Apple but it's still hella confusing.