The New York Times is considering suing OpenAI over IP rights


I’ve side-eyed most major newspapers in recent years, including The New York Times, for how they’ve handled reporting on everything from Orange Indictment Guy, to Black Lives Matter, to the pandemic. But a free, robust press is a must for a healthy democracy, and so while I don’t love everything they publish, I’m still glad the NYT exists. It feels as if journalism is getting attacked on all sides: declining trust in the press, the death of print, misinformation everywhere, hostility toward journalists–and now, AI. AI is another threat to journalism and an independent press because it has no guardrails and spits out false information. A recent Stanford study showed that while ChatGPT improves in some areas, it’s getting dumber over time at certain math problems and worse at answering nuanced questions. But AI might just have an Achilles heel: copyright infringement. ChatGPT and other large language models have “scraped” huge swathes of content from The New York Times (and other publications) without consent. Now, the NYT is considering suing OpenAI, the company that created ChatGPT.

The New York Times and OpenAI could end up in court.

Lawyers for the newspaper are exploring whether to sue OpenAI to protect the intellectual property rights associated with its reporting, according to two people with direct knowledge of the discussions.

For weeks, the Times and the maker of ChatGPT have been locked in tense negotiations over reaching a licensing deal in which OpenAI would pay the Times for incorporating its stories in the tech company’s AI tools, but the discussions have become so contentious that the paper is now considering legal action.

The individuals who confirmed the potential lawsuit requested anonymity because they were not authorized to speak publicly about the matter.

A lawsuit from the Times against OpenAI would set up what could be the most high-profile legal tussle yet over copyright protection in the age of generative AI.

The possible case against OpenAI: So-called large language models like ChatGPT have scraped vast parts of the internet to assemble data that inform how the chatbot responds to various inquiries. The data-mining is conducted without permission. Whether hoovering up this massive repository is legal remains an open question.

If OpenAI is found to have violated any copyrights in this process, federal law allows for the infringing articles to be destroyed at the end of the case.

In other words, if a federal judge finds that OpenAI illegally copied the Times’ articles to train its AI model, the court could order the company to destroy ChatGPT’s dataset, forcing the company to recreate it using only work that it is authorized to use.

Federal copyright law also carries stiff financial penalties, with violators facing fines up to $150,000 for each infringement “committed willfully.”

“If you’re copying millions of works, you can see how that becomes a number that becomes potentially fatal for a company,” said Daniel Gervais, the co-director of the intellectual property program at Vanderbilt University who studies generative AI. “Copyright law is a sword that’s going to hang over the heads of AI companies for several years unless they figure out how to negotiate a solution.”

[From NPR]

You know what? I hope the NYT takes OpenAI to the damn cleaners. I hope those tech bros are forced to fold, taking the entire AI industry with them. The case against them seems relatively cut-and-dried to me, but I’m not an attorney. I’m sure OpenAI will come up with some slippery defense about the AI not being capable of “willfully” copying the NYT because it’s not human. There are so many ethical problems with AI, and copyright infringement is just one of them. There’s evidence that AI is promoting eating disorders and sending people “thinspo,” which is heinous. Tech companies aren’t stopping it from doing that. It’s bad for the environment because it takes massive amounts of energy and water to run. And AI still needs people to run it, and guess what: it looks like a lot of those people, mostly based in the Global South, are not earning a living wage for continuing to train the AI. I’ve been obsessed with the movie Oppenheimer this summer (I’ve seen it three times), and while it’s obviously about nuclear weapons, its themes also map neatly onto AI. I feel that AI has the potential to be more destructive than we realize. Just because a technology is powerful, that does not mean it is good. And just because a new technology is an impressive scientific discovery, that doesn’t make it ethical–or wise–to use it.

Photo credits: Marco Lenti and Emiliano Vittoriosi on Unsplash and Matheus Bertelli on Pexels


6 Responses to “The New York Times is considering suing OpenAI over IP rights”


  1. LegallyBrunette says:

    “Just because a technology is powerful, that does not mean it is good. And just because a new technology is an impressive scientific discovery, that doesn’t make it ethical–or wise–to use it.”

    Thank you! Big tech was allowed to get so powerful, so quickly, because it was not regulated. It has been terrifying to watch.

    • Snoozer says:

      Part of the issue is that so many lawmakers are old and don’t understand even the basics about tech. Some of the questions Senate Committees ask the heads of Big Tech are mortifying. How can you legislate something you can’t even remotely grasp?

  2. Shawna says:

    I’m not an IP expert, but I have a good grasp of fair use. OpenAI could try to launch a defense based on their usage being transformational and non-consumptive. That is, their models, algorithms, and text outputs transform the base data in a way that creates something new (adds new intellectual property). The training datasets become unrecognizable and are not available to normal users. Finally, they in no way compete with the NYT (they are not news outlets, nor do they give users direct access to readable NYT files). However, the ability to ingest copyrighted content at such a large scale could be a plausible reason for a judge to re-examine fair use policies.

    That said, I’m on the side of human creators. Their work is far better than AI-generated content. I don’t want people getting used to AI’s vastly inferior and unreliable results and therefore ceasing to want or recognize excellent, original writing and art. What we need training in is critical literacy (which will let us tell the difference) and empathy (which will call for fair treatment of content producers and low-paid digital laborers in the Global South).

    • Jais says:

      Thanks for providing info on the possible defense. I’m on the side of the human creators too. It’s all so wild. I feel like I’ve been sleeping on this topic and not paying attention, and now it’s everywhere and scary af.

  3. Concern Fae says:

    I worked for a university where students were building the sorts of datasets used by OpenAI. If you are doing it purely for research, you can claim fair use on copyright issues. The problem is that businesses see the results in the papers that are published and want to replicate them. Students are hired for the work they have done and want to keep doing it. So what was legit as university research turns into something very much not OK at for-profit businesses.

    And universities can end up with financial stakes in student research and companies. All of which creates huge ethical problems.

  4. Lurker25 says:

    1) Almost every single sci-fi writer’s prediction since the dawn of time has come true, from Jules Verne predicting submarines to William Gibson predicting the internet. Those who tackle artificial intelligence inevitably predict dire consequences for humanity and the planet.

    2) The way the tech press fangirls over AI is a pathetic dereliction of journalism. It’s not “intelligence” if all you’ve done is program a machine by brute-force ingestion of so much data that it vomits out responses on cue. It says more about humans’ ability to create tech that can handle this much data; it says very little about the result. “Creativity” implies originality. Regurgitating pre-existing work is not creative. AI “mashups” only look impressive because we’ve accepted corporate promotion of hackneyed work by humans.
    A relative asked ChatGPT to write a story using my child, his pet, and our city. She was so impressed with the result. Thing is, she doesn’t live in our city. Neither does ChatGPT, which you’d think would have been programmed with enough info to know that we don’t have alleys, that the city isn’t walkable, and that it’s laughable to imagine my child’s pet’s breed attempting ChatGPT’s imaginary rescue. These and dozens of other details are ones that ANY human writer would have thought through, using the same cues that were fed into the AI.

    It’s so sad that corporations don’t value the incredible computers that already exist in their employees’ heads: the brains that manage the tasks of massaging bosses’ egos, getting through school drop-offs and traffic and getting to work on time, making dinner while answering work emails, and a thousand other difficulties involving multitasking and soft skills that AI cannot even fathom.