> The content covered by the court order is stored separately in a secure system. It’s protected under legal hold, meaning it can’t be accessed or used for purposes other than meeting legal obligations.
> Only a small, audited OpenAI legal and security team would be able to access this data as necessary to comply with our legal obligations.
So, by OpenAI's own admission, they are taking abundant and presumably effective steps to protect user privacy here? In the unlikely event that this data did somehow leak, I'd personally be blaming OpenAI, not the NYT.
Some of the other language in this post, like repeatedly calling the lawsuit "baseless", really makes this just read like an unconvincing attempt at a spin piece. Nothing to see here.
And per their own terms they likely only delete messages "when they want to" given the big catch-alls. "What happens when you delete a chat? -> It is scheduled for permanent deletion from OpenAI's systems within 30 days, unless: It has already been de-identified and disassociated from your account"[1]
[0] https://techcrunch.com/2024/11/22/openai-accidentally-delete...
[1] https://help.openai.com/en/articles/8809935-how-to-delete-an...
Then again, I’m starting to think OpenAI is gathering a cult-leader-like following, where any negative comment results in devoted followers, or those with something to gain, immediately jumping to its defense no matter how flimsy the ground.
It can be both. It clearly spins the lawsuit - it doesn't present the NYT's side at all.
I am not an Open AI stan, but this needs to be responded to.
The first principle of information security is that all systems can be compromised and the only way to secure data is to not retain it.
This is like saying "well I know they didn't want to go skydiving but we forced them to go skydiving and they died because they had a stroke mid-air; it's their fault they died."
Anyone who makes promises about data security is at best incompetent and at worst dishonest.
Shouldn't that be "at best dishonest and at worst incompetent"?
I mean, would you rather be a competent person telling a lie or an incompetent person believing you're competent?
I don't think the Judge is equipped to handle this case if they don't understand how their order jeopardizes the privacy of millions of users worldwide who don't even care about NYT's content or bypassing their paywalls.
Whether or not you care is not relevant, and that is usually the case for customers. If a drug company resold an expensive cancer drug without IP, you might say 'their order jeopardizes the health of millions of users worldwide who don't even care about Drug Co's IP.'
If the NYT is right - I can only guess - then you are benefitting from the NYT IP. Why should you get that without their consent and for free - because you don't care?
> (jeopardizes)
... is a strong word. I don't see much risk - the NYT isn't going to de-anonymize users and report on them, or sell the data (which probably would be illegal). They want to see if their content is being used.
In theory it is possible to apply (it's mentioned in multiple places in the documentation), but in practice requests are simply ignored. I get that approval needs to be given, and that there are barriers to entry. But it seems to me they mention zero data retention only for marketing purposes.
We have applied multiple times and have yet to receive ANY response. Reading through the forums this seems very common.
Why is approval necessary, and what specific barriers (before the latest ruling) prevent privacy and no logging from being the default?
OpenAI’s assurances have long been met with skepticism by many, with the assumption that inputs are retained, analyzed, and potentially shared. For those concerned with genuine privacy, local LLMs remain essential.
Product development?
Right but the problem they're having is that the request is ignored.
https://openai.com/en-GB/policies/row-privacy-policy/
1. You can request it but there is no promise the request will be granted.
Defaults matter. Silicon Valley's defaults are not designed for privacy. They are designed for profit. OpenAI's default is retention. Outputs are saved by default.
It is difficult to take seriously the arguments in their memo in support of their objection to the preservation order. OpenAI already preserves outputs by default.
What's the betting that they just write it on the website and never actually implemented it?
After all, since the NYT has a very limited corpus of information, and supposedly people are generating infringing content using their APIs, said hashes can be used to compare whether such content has been generated.
I'd rather have them store nothing, but given the overly broad court order I think this may be the best middle ground. Of course, I haven't read the lawsuit documents and don't know if NYT is requesting far more, or alleging some indirect form of infringement which would invalidate my proposal.
[1] https://ssdeep-project.github.io/ssdeep/index.html
[2] https://joshleeb.com/posts/content-defined-chunking.html
For example, the judge seems to have asked if it would be possible to segregate data that the users wanted deleted from other data, but OpenAI has failed to answer. Not just denied the request, but simply ignored it.
I think it's quite likely that OpenAI has taken the PR route instead of seriously engaging with any way to constructively honor the request for retention of data.
Maybe I'm alone, but a pinkie promise from Sam Altman gives me no assurances about my data. It's about as reassuring as a singing telegram from Mark Zuckerberg dancing to a song about how secure WhatsApp is.
It's well-established that the American IC, primarily NSA, collects a lot of metadata about internet traffic. There are some justifications for this and it's less bad in the age of ubiquitous TLS, but it generally sucks. However, legal protections against directly spying on the actual decrypted content of Americans are at least in theory stronger.
Snowden's leaks mentioned the NSA tapping inter-DC links of Google and Yahoo; if they had to tap links, I doubt there's a ton of voluntary cooperation.
I'd also point out that trying to parse the unabridged prodigious output of the SlopGenerator9000 is a really hard task unless you also use LLMs to do it.
On the contrary.
>Maybe I'm alone, but a pinkie-promise from Sam Altman does not confer any assurances about my data to me.
I think you're being unduly paranoid. /s
https://www.theverge.com/2024/6/13/24178079/openai-board-pau...
https://www.wsj.com/tech/ai/the-real-story-behind-sam-altman...
Of course it's out of self-serving interests, but I find it hard to disagree with OpenAI on this one.
Third-party privacy and relevance is a constant point of contention in discovery. Exhibit A: this article.
(1) With limited, well-scoped exclusions for lawyers, medical records, etc.
https://harvardlawreview.org/blog/2024/04/nyt-v-openai-the-t...
In other words, they want everyone to be forced to follow the same rules they were forced to follow 20 years ago.
I wonder if the laws and legal procedures are written considering this general assumption that a party to a lawsuit will naturally lie if it is in their interest. And then I read articles and comments about a "trust based society"...
OpenAI slams court order to save all ChatGPT logs, including deleted chats - https://news.ycombinator.com/item?id=44185913 - June 2025 (878 comments)
Imagine how much worse it is for your LLM chat history to leak.
It's even worse than your private comms with humans because it's a raw look at how you are when you think you're alone, untempered by social expectations.
Why would a customer expect this not to be private? How can one even know how it could be used against them, when they don’t even know what’s being collected or gleaned from collected data?
I am following these issues closely, as I am terrified that my “assistant” will some day prevent me from obtaining employment, insurance, medical care, etc. And I’m just a non-law-breaking normie.
A current day example would be TX state authorities using third party social/ad data to identify potentially pregnant women along with ALPR data purchased from a third party to identify any who attempt to have an out of state abortion, so they can be prosecuted. Whatever you think about that law, it is terrifying that a shift in it could find arbitrary digital signals being used against you in this way.
It's that it's like watching how someone might treat a slave when they think they're alone. And how you might talk down to or up to something that looks like another person. And how pathetic you might act when it's not doing what you want. And what level of questions you outsource to an LLM. And what things you refuse to do yourself. And how petty the tasks might be, like workshopping a stupid twitter comment before you post it. And how you copied that long text from your distraught girlfriend and asked it for some response ideas. etc. etc. etc.
At the very least, I'd wager that it reveals that bit of true helpless patheticness inherent in all of us that we try so hard to hide.
Show me your LLM chat history and I will learn a lot about your personality. Nothing else compares.
To be fair the song was intense.
You make it sound like they're mad at you for no reason at all. How unreasonable of them when confronted with such honorable folks as yourselves!
The technology anarchists in this thread need perspective. This is fundamentally a case about the legality of this product. In the extreme case, this will render the whole product category of "llm trained on copyrighted content" illegal. In that case, you will have been part of a copyright infringement on a truly massive scale. The users of these tools do NOT deserve privacy in the light of the crimes alleged.
You do not get to claim to protect the privacy of the customers of your illegal venture.
Within "settings"? Is this referring to the dark pattern of providing users with a toggle "Improve model for everyone" that doesn't actually do anything? Instead users must submit a request manually on a hard to discover off-app portal, but this dark pattern has deceived them into think they don't need to look for it.
> OpenAI must process your request solely for the purpose of fulfilling it and not store your request or any responses it provides unless required under applicable laws. OpenAI also must not use your request to improve or train its models.
— https://www.apple.com/legal/privacy/data/en/chatgpt-extensio...
I wonder if we’ll end up seeing Apple dragged into this lawsuit. I’m sure after telling their users it’s private, they won’t be happy about everything getting logged, even if they do have that caveat in there about complying with laws.
The ZDR APIs are not and will not be logged. The linked page is clear about that.
It's just realism. Protect your private data yourself; relying on companies or governments to do it for you is, as the saying goes, letting a tiger devour you up to the neck and then asking it to stop at the head.
No you don't. You charge extra for privacy and list it as a feature on your enterprise plan. Not even paying Pro customers get "privacy". Also, you refuse to delete personal data included in your models and training data following numerous data protection requests.
It says here:
> If you are on a ChatGPT Plus, ChatGPT Pro or ChatGPT Free plan on a personal workspace, data sharing is enabled for you by default, however, you can opt out of using the data for training.
Enterprise is just opt out by default...
https://help.openai.com/en/articles/8983130-what-if-i-want-t...
And whether and how they use your data for their own purposes isn't touched by that either.
> We are taking steps to comply at this time because we must follow the law, but The New York Times’ demand does not align with our privacy standards. That is why we’re challenging it.
That's a lot of words to say "yes, we are violating GDPR".
> Any judgment of a court or tribunal and any decision of an administrative authority of a third country requiring a controller or processor to transfer or disclose personal data may only be recognised or enforceable in any manner if based on an international agreement, such as a mutual legal assistance treaty, in force between the requesting third country and the Union or a Member State, without prejudice to other grounds for transfer pursuant to this Chapter.
So if, and only if, an agreement between the US and the EU allows it explicitly, it is legal. Otherwise it is not.
There's decades of legal disputes in some European countries on whether it's even legitimate for the government to mandate your ISP or phone company to collect metadata on you for after-the-fact law enforcement searches.
Looking at the actual data seems much more invasive than that and, in my (non-legally trained) estimate doesn't seem like it would stand a chance at least in higher courts.
Privacy mode (enforced across all seats):

- OpenAI: Zero-data-retention (approved)
- Anthropic: Zero-data-retention (approved)
- Google Vertex AI: Zero-data-retention (approved)
- xAI Grok: Zero-data-retention (approved)
did this just open another can of worms?
So nothing?
Do we know if the court order covers these?
I'm excited that the law is going to push for local models.
> This does not impact API customers who are using Zero Data Retention endpoints under our ZDR amendment.
If you don’t retain that data you’re destroying evidence for the case.
It’s not like the data is going to be given to anyone; it’s only going to be used for limited legal purposes for the lawsuit (as OpenAI confirms in this article).
And honestly, OpenAI should have just not used copyrighted data illegally and they would have never had this problem. I saw NYT’s filing and it had very compelling evidence that you could get ChatGPT to distribute verbatim copyrighted text from the Times without citation.
> It’s not like the data is going to be given to anyone; it’s only going to be used for limited legal purposes for the lawsuit (as OpenAI confirms in this article).
Nobody other than both parties to the case, their lawyers, the court, and whatever case file storage system they use. In my view, that's already way too much given the amount and value of this data.
I don't believe you would be considered to be violating the GDPR if you are complying with another court order, because you are presumably making a best effort to comply with the GDPR besides that court order.
You're saying it's unreasonable to store data somewhere for a pending court case? Conceptually you're saying that you can't preserve data for trials because the filing cabinets might see the information. That's ridiculous, if that was true then it would be impossible to perform discovery and get anything done in court.
Imagine a lawsuit against Signal that claimed some nefarious activity, harmful to the plaintiff, was occurring broadly in chats. The plaintiff can claim, like NYT, that it might be necessary to examine private chats in the future to make a determination about some aspect of the lawsuit, and the judge can then order Signal to find a way to retain all chats for potential review.
However you feel about OpenAI, this is not a good precedent for user privacy and security.
The court isn't saying "preserve this data forever and ever and compromise everyone's privacy," they're saying "preserve this data for the purposes of this court while we perform an investigation."
IMO, the NYT has a very good argument here that the only way to determine the scope of the copyright infringement is to analyze requests and responses made by every single customer. Like I said in my original comment, the remedies for copyright infringement are on a per-infringement basis. E.g., every time someone on LimeWire downloads Song 2 by Blur from your PC, you've committed one instance of copyright infringement. My interpretation is that NYT wants the court to find out how many times customers have received ChatGPT responses that include verbatim New York Times content.
The whole premise of the lawsuit is that they didn't do anything unlawful, so saying "just do what the NYT wanted you to do" isn't interesting.
The NYT made an argument to a judge about what they think is going on and how they think the copyright infringement is taking place and harming them. In their filings and hearings they present the reasoning and evidence they have that leads them to believe that a violation is occurring. The court makes a judgment on whether or not to order OpenAI to preserve and disclose information relevant to the case to the court.
It's not "just do what NYT wanted you to do," it's "do what the court orders you to do based on a lawsuit brought by a plaintiff and argued to the court."
I suggest you read the court filing: https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec20...
I'm confused, how does this not affect Enterprise or Edu? They clearly possess the data, so what makes them different legally?
> When we appeared before the Magistrate Judge on May 27, the Court clarified that ChatGPT Enterprise is excluded from preservation.
How much could the NYT back catalog be worth? Just buy it, ask the Saudis.
If a company is subject to a US court order that violates EU law, the company could face legal consequences in the EU for non-compliance with EU law.
The GDPR mandates specific consent and legal bases for processing data, including sharing it.
Assuming it is legal to share it for legal purposes, one can't sufficiently anonymize the data. It needs to be accompanied by user data that allows requests to download it and for it to be deleted.
I wonder what the fine would be if they just delete it per user agreement.
I also wonder: could one, in the US, legally promise customers that they may delete their data, then choose to keep it indefinitely and share it with others?
The ruling and situation aside, to what degree is it possible to enforce something like this, and what are the penalties? Even in GDPR and other data protection cases, it seems super hard to enforce. Directives to keep or delete data basically require system-level access, because the company can always CRUD its data whenever it wants and whatever is in its best interest. Data can be ordered produced to a court periodically and audited, which could maybe catch an individual case, I guess. There is basically no way to know without literally seizing the servers in an extreme case. Also, the consequences in most cases are a fine.
A.k.a. the cost of doing business.
Could you with a straight face argue that the NYT newspaper could be a surrogate girlfriend for you like a GPT can be? They maintain that it is obviously a transformative use and therefore not an infringement of copyright. You and I may disagree with this assertion, but you can see how they could see this as baseless, ridiculous, and frivolous when their livelihoods depend on that being the case.
Given that it's not explicitly mentioned as data not being affected, I'm assuming it is.
https://arstechnica.com/tech-policy/2025/06/openai-says-cour...
> We are taking steps to comply at this time because we must follow the law, but The New York Times’ demand does not align with our privacy standards. That is why we’re challenging it.
So basically no, lol. I wonder if we'll see the GDPR go head-to-head with Copyright Law here, that would be way more fun than OpenAI v NYT.
> The content covered by the court order is stored separately in a secure system. It’s protected under legal hold, meaning it can’t be accessed or used for purposes other than meeting legal obligations.
That's horse shit and OpenAI knows it. It means no such thing. A legal hold is just a 'preservation order'. It says absolutely nothing about other access or use.
The GDPR does not say that you can never be proven to have done something wrong in a court of law.
A legal hold requires no such thing and there would be no such requirement in it. They are perfectly free to access and use it for any reason.
So user privacy is definitely implicated.
They are being challenged because NYT believes that ChatGPT was trained with copyrighted data.
NYT naturally pushes to find a way to prove that NYT data is being reproduced in user chats, and how often.
OpenAI spins that as the NYT invading user privacy.
It’s quite transparent as to what they are doing here.
The order the judge issued is irresponsible. Maybe ChatGPT did get too cute in its discovery responses, but the remedy isn’t to trample the rights of third parties.