NYT asks for ChatGPT logs: What are the risks?

ChatGPT users may be rightfully concerned about the New York Times’ demand to share private conversations with lawyers. Yet, the use of chatbots itself is what poses major privacy risks.

Magistrate Judge Ona Wang ruled on November 7th that it was “appropriate” for OpenAI to turn over 20 million ChatGPT logs to The New York Times’ lawyers as part of the ongoing copyright infringement lawsuit.

The judge stated that OpenAI failed to explain why the privacy of ChatGPT users was not already protected, given that lawyers and experts reviewing the case material must comply with stringent measures.

In a blog post published on Thursday, OpenAI slammed the demand, claiming that it “disregards long-standing privacy protections, breaks with common-sense security practices.”

“If The Times succeeds in its demand, we will be forced to hand over the very same data we’re protecting – your data – to third parties, including The Times’ lawyers and paid consultants,” the company said.

The newspaper’s spokesperson rejected the claims, emphasizing that the court ordered OpenAI to provide a sample of chats that are anonymized by OpenAI itself under a legal protective order.

Full anonymization may be impossible

Karni Chagal-Feferkorn, an assistant professor at the University of South Florida, says it may seem inconsistent for OpenAI to argue that disclosing user conversations to the NYT poses privacy risks, given that OpenAI itself retains and uses such conversations for model training.

Yet, she says OpenAI’s argument is not without merit, because even when personally identifiable information (PII) is removed, sensitive details can sometimes be inferred from context, making full anonymization more challenging than it appears to be.

“If a user with a rare occupation discusses work and references local landmarks, cross-referencing those details could reveal their identity even when details such as names and addresses are removed,” Chagal-Feferkorn tells Cybernews.
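The cross-referencing risk Chagal-Feferkorn describes can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the records, the field names, and the `reidentify` helper are invented for illustration and do not reflect how OpenAI anonymizes logs or how anyone in the case would handle them. It only shows that a record with no name can still be matched to one person when its remaining details are rare enough.

```python
from collections import defaultdict

# Hypothetical "anonymized" chat records: names and addresses are removed,
# but an occupation and a mentioned landmark remain.
anonymized_logs = [
    {"log_id": "a1", "occupation": "lighthouse keeper", "landmark": "Harbor Point"},
    {"log_id": "a2", "occupation": "teacher", "landmark": "Main Street Library"},
]

# Hypothetical public directory someone could cross-reference against.
public_directory = [
    {"name": "Person A", "occupation": "lighthouse keeper", "landmark": "Harbor Point"},
    {"name": "Person B", "occupation": "teacher", "landmark": "Main Street Library"},
    {"name": "Person C", "occupation": "teacher", "landmark": "Main Street Library"},
]

# Details that are not identifying alone but can be in combination.
QUASI_IDENTIFIERS = ("occupation", "landmark")


def reidentify(logs, directory, keys=QUASI_IDENTIFIERS):
    """Return {log_id: name} for logs whose details match exactly one person."""
    index = defaultdict(list)
    for person in directory:
        index[tuple(person[k] for k in keys)].append(person["name"])

    matches = {}
    for log in logs:
        candidates = index.get(tuple(log[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match means the record is re-identified
            matches[log["log_id"]] = candidates[0]
    return matches


print(reidentify(anonymized_logs, public_directory))
```

In this toy data, the record with the rare occupation matches a single person, while the record with the common occupation matches two people and stays ambiguous.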

Even if the anonymized data cannot be used to identify ChatGPT users, sharing it still raises the question of whether it breaches users’ trust and expectations, says Erin Illman, a partner at Bradley Arant Boult Cummings LLP.

In this case, there is no clear line between what data can be obtained under “legal process” or as part of a legal dispute, and what is reasonable and relevant to provide in these circumstances.

“Ultimately, the risk is that the outcome fails to balance the privacy expectations of individuals who had no reason to believe this information would be provided to a third party when it was originally disclosed against the reasonable need to defend a business and property right to proprietary information,” Illman says.

OpenAI’s use of data may be a bigger threat

The privacy risks associated with OpenAI’s own use of user data are, in some respects, greater than those posed by NYT’s proposed access, Chagal-Feferkorn says.

OpenAI retains vast quantities of conversational data for purposes such as training and personalization, while NYT would access only a small sample of conversations with PII removed. These logs would be available only to attorneys under strict conditions.

Business Insider reported that NYT lawyers involved in the case can only review ChatGPT’s source code on a computer unconnected to the internet, which is located in a secure room where bringing electronic devices is prohibited.

The Times' lawyers reportedly can share their notes with up to five outside consultants to help them understand what the code does.

However, Chagal-Feferkorn says ChatGPT users have consented to OpenAI’s use of their conversations under its stated terms of service, but did not necessarily envision that their data would be shared with adversarial third parties in litigation.

“This creates a different privacy expectation and potentially a different ethical and legal threshold,” she says.

OpenAI said it asked the court to reject the NYT’s demand to hand over ChatGPT logs. However, even if the court favors the tech company, it doesn’t mean conversations are never shared with third parties.

OpenAI has previously admitted that it would share conversations with law enforcement and a suicide and crisis hotline in case of an emergency involving a danger of death.

Experts strongly advise against sharing confidential information with chatbots like ChatGPT, not only because conversations may be used to train models but also because of the risk of a data breach.

In 2023, payment-related information of some ChatGPT Plus subscribers was briefly exposed due to a bug in an open-source library.

Some ChatGPT users were shocked to learn that their private conversations were indexed on Google after they clicked “share” on the chatbot, Fast Company reported in August. OpenAI has since removed the functionality.

Paul Bischoff, a consumer privacy advocate at Comparitech, says almost every privacy policy makes exceptions for court-ordered access to private data. Therefore, the “longstanding protections” argument isn’t on ChatGPT's side.

He tells Cybernews, “ChatGPT records every interaction, and users should be aware of that going in.”

Why is NYT suing OpenAI?

The New York Times filed a lawsuit against OpenAI in 2023, alleging that the company unlawfully used the newspaper’s articles to train large language models (LLMs), particularly ChatGPT, violating copyright law.

Additionally, the lawsuit claims that ChatGPT can generate the NYT’s content at no cost, making readers less likely to visit their website and resulting in financial losses for The Times.

Many other news outlets in the United States and Canada followed suit, suing the tech company for allegedly using their articles to train models.

The widespread use of LLMs and an increasing reliance on Google AI Overviews have resulted in shrinking news media traffic. Advertising is the major source of revenue for these companies, so the drop in traffic directly affects their budgets.

Since then, OpenAI and Perplexity AI have launched revenue-sharing partnerships with multiple major media companies.
