One failing of mine is that I spend so much time focused on how we’re nearing total destruction via climate change that I don’t give enough attention to how we’re doing so through the rapid developments in artificial intelligence. Allow me to remedy that. Earlier this year Meta (the artist formerly known as Facebook) released LLaMA, which stands for Large Language Model Meta AI. From what my Luddite brain can gather, the program is essentially Meta’s version of ChatGPT. The software is “trained” in how to respond by being given texts to absorb. Well, funny story: it seems Meta has admitted to feeding LLaMA literary works that are copyrighted, and the authors are not happy. Michael Chabon and David Henry Hwang are among a slew of writers who have filed a class action lawsuit against Meta for copyright infringement:
Pulitzer Prize-winner Michael Chabon and Tony-winning playwright David Henry Hwang are among a group of writers who filed a class action lawsuit against Meta in San Francisco federal court for having “copied and ingested” their works to train the LLaMA AI platform.
Plaintiffs, who also include authors Matthew Klam, Rachel Louise and Ayelet Waldman, are seeking class action status for the suit, which says their copyrighted books appear in the dataset that Meta has admitted to using to train LLaMA.
“Plaintiffs and Class members did not consent to the use of their copyrighted books as training materials for LLaMA,” said the group, which filed a similar suit last week against ChatGPT parent OpenAI.
Comedian Sarah Silverman sued Meta and OpenAI this summer for copyright infringement.
As AI grows, so do lawsuits by the creative community against its large language models. A large language model is an AI software program designed to produce convincingly natural text in response to user prompts. “Rather than being programmed in the traditional manner [by engineers creating reams of code], a large language model is ‘trained’ by copying massive amounts of text and extracting expressive information from it. The body of text is referred to as the training dataset,” explained the suit, filed in the U.S. District Court for the Northern District of California, where the Facebook parent is based. Meta released LLaMA in February 2023.
Plaintiffs, however, have copyrights for their books and written works “and never consented to their use as training materials for LLaMA.” The works of Chabon (Wonder Boys, The Amazing Adventures of Kavalier & Clay, The Yiddish Policemen’s Union), of Hwang (M. Butterfly, Chinglish, Yellow Face, Golden Child) and works by other plaintiffs “include copyright-management information that provides information about the copyrighted work, including the title of the work, its ISBN or copyright registration number, the name of the author, and the year of publication.”
Plaintiffs alone “have been and remain the holders of the exclusive rights under the Copyright Act of 1976… to reproduce, distribute, display, or license the reproduction, distribution, and/or display of the works identified.”
A couple of weeks ago Carina covered The New York Times considering filing a similar lawsuit against OpenAI. I don’t think I can improve upon her comments on the potential danger this technology presents, so I’m gonna try a different angle here. If AI wants to learn how to “produce convincingly natural text” and how to extract expressive language from original works, it can go to school like the rest of us. Seriously, have it enroll in public school and build a solid foundation. I’d love to see AI suffer through hours of English grammar lessons without shorting a circuit. Should it prove competent enough, the AI can then go to college, become an English major, sign up for a seminar, and find a professor to mentor its thesis.
I jest, but underneath it all I despair over our culture’s obsession with shortcuts. Everything in the description of LLaMA’s “training process” — and remember it’s being trained to be on par with, you know, us humans — sounds like a massive shortcut. Hopefully, it’s also an illegal shortcut that Zuck will have to pay for. Dearly.
That’s a very ingenious suggestion—to test the genAI through a traditional degree process! It would require English professors to be able to define our skills and requests in machine-readable ways, and most of us aren’t trained to do so. My question would be why “it” would “want” to do so; I motivate students to level up by tapping into their motives. Earning tech bros and investors money and prestige isn’t the normal motive! Sadly, learning management systems and educational technology firms are currently trying to lure university writing instructors to contribute their own teaching materials and skills—their pedagogical intellectual property!—to train new higher education AI tools. Terrifying. And, as you touch on, it’s not good for the environment because all these computation-heavy processes require ridiculous amounts of server space, heavy metals, etc.
I am so glad for this. Authors retain their copyrights. Script writers do not, so they cannot sue, even though it is their original work that is being sourced. Studios would have to sue. AI should only be able to use material that is in the public domain. I’d love for tech CEOs to sue the US government to get the time before material enters the public domain reduced, after Disney lobbied Congress for years to get it lengthened, and won! Studios have to start suing all AI, but right now they’re in bed with all the tech sociopaths—studio heads being sociopaths themselves. This is a terrifying time because we have dueling sociopaths leading the world in all sectors and in tech advancement. AI doesn’t learn; it is trained to plagiarize stealthily and efficiently and cover it up well. It should never be allowed to replace any creative art. Ever.
As someone who believes AI is just the next iteration of extractive, anti-human technology designed to prop up tech companies that were dependent on cheap money to appear successful, I’m thrilled to see this. I think copyright law is the best bet to kill this in its infancy.
I’m on the side of the writers, but they’re going to lose this lawsuit. Because the thing is, when we go to school, how do we learn? We read these books. And Chabon et al. don’t have to give their permission for an entire English class to read their writing. So unless the AI is cranking out text that is taken exactly from their works or closely paraphrased, it’s not actually plagiarizing — it’s writing in the style of.
The argument is that their copyrighted work is not approved for use by Meta for purposes of growing/strengthening its business interests. A published work of literature is intended to be read by the public at large; students reading a work that was purchased for the purpose of being read is not the same as a for-profit business using that work for its own gain. Students either borrow an owned copy of the book from the school or supply their own copy (purchased or borrowed from another paid source). It’s apples and hand grenades.
I wish them every success. It’s slimy and wrong for tech giants to try to profit off of the intellectual property of writers, and AI generated “art” is just wrong IMO.
“how do we learn? We read these books.” – Generative AI isn’t “we” in the sense you mean. The more we normalize treating code and machine processes as people, the more we’ll lose ground on our rights as humans.
Take Zuckerberg’s money. Take all of it.
Zuckerberg looks like a deep sea alien. He’s here to destroy humanity.
As a published author, I’m on the fence about AI, which in author circles is being widely bashed. I’ve used ChatGPT for brainstorming/research. I’ve also tried giving it a couple paragraphs of my own writing to mimic. What it gave back? HORRIBLE prose. It read like a 2nd grader wrote it. I immediately trashed it. The technology is not there yet. However, I do understand authors not wanting their copyrighted works data mined by FB/Meta.
Amazon did recently add a box to their upload process for books. Authors now have to state whether ANY part of their book used AI. You have to click yes or no. (AI-assisted brainstorming/research is considered a “no,” but using AI to rewrite a chapter, even if you heavily edit it? That’s a “yes.”) Brave new world here.
H, if an author did use AI in the process of writing their book, will that author also be liable to the writers whose work was used to “teach” AI how to write?
Best wishes to the authors.