
Innovating with AI

⚔️ NY Times v. OpenAI ⚔️: What the battle royale means for you

Published 4 months ago • 5 min read

Hey Reader,

It’s not every day that a federal judge reads a demand for the “destruction” of ChatGPT.

But that’s exactly what The New York Times asked for in its blockbuster lawsuit against OpenAI and Microsoft. The world’s most prominent newspaper claims that ChatGPT and its underlying large language models are so deeply and irreparably infringing on copyright that they should just be blown to digital bits – and that a jury should give the Times a lot of money for its trouble.

If you’ve ever dabbled in communication law, you know that the Times has been on the right side of legal history many times – defending the First Amendment in the Pentagon Papers case and countless others. (I started my career in journalism, so I have a special love for the nuance of copyright and First Amendment jurisprudence.)

This time, I’m not sure if their claims move the world in the right direction. But whichever way the case goes, it’s by far the most prominent example of a question I get from readers all the time:

“How do I build with AI when there’s still so much uncertainty around copyright and intellectual property?”

Today, we’ll dig into the details of the case – and even more importantly, how it does (and does not) change the trajectory of AI for everyone who’s building new AI products and processes.

We’re years away from any changes

Federal lawsuits don’t move fast. Most likely, the NYT v. OpenAI case won’t reach a conclusion for many years – there will be months of preliminary motions and briefs, potentially a jury trial, and certainly multiple appeals before anybody changes anything about how they do business.

So even if the Times ultimately wins in whole or part, it’ll be eons from now in AI time. You’ve seen how fast things have moved in the past year. Imagine what three more years of exponential growth will do to OpenAI’s software.

Because a conclusion is so far out, it’s safe for you to assume that large language models will exist for the foreseeable future and that you can keep using them. (OpenAI and Microsoft have also pledged to indemnify certain customers against copyright claims arising from their use of the software, which further reduces your risk.)

We also teach students in our Innovating with AI Incubator that they should proactively plan for their products and processes to switch between models and companies in the future. This isn’t because of lawsuits, but because every tech company is racing to build better and better AI, which means there will be multiple great options next year that are as good as GPT-4 is today.

OpenAI won’t be the only top-notch option for you in 2025, and it may not even be the best option for you six months from now. That’s how fast Google, Amazon, Meta, Anthropic and a range of other competitors are improving.

After all, it was just a few months ago that we thought OpenAI might dissolve. Regardless of copyright concerns, you shouldn’t put all your eggs in one basket, and there are lots of tech companies who’ll be competing for your business as AI model providers in the coming years.

Whack-a-mole

That brings us to the next reason I think the Times has an uphill battle – how many large language models can you realistically “destroy” in court?

Even if the Times prevails here, they’ll be up against literally every major tech company. All of them have language models that presumably function similarly to GPT, and all of them have a lot of lawyers working on novel and creative arguments that their models don’t infringe copyright because the behavior is protected by the fair use doctrine.

For an example of this type of argument, look at the Google Book Search copyright case, decided in 2015. Authors said it should be illegal for Google to digitize and index the contents of their books. The courts disagreed, and the Supreme Court declined to hear the authors’ appeal, leaving Google’s win intact. Here’s a summary of the final decision:

“Google's unauthorized digitizing of copyright-protected works, creation of a search functionality, and display of snippets from those works are non-infringing fair uses. The purpose of the copying is highly transformative, the public display of text is limited, and the revelations do not provide a significant market substitute for the protected aspects of the originals. Google's commercial nature and profit motivation do not justify denial of fair use.”

Google has already won fair-use cases in similar situations. It’s hard for me to see the Times winning enough of these cases to really make a dent in the growth of AI.

The allegations are cringeworthy, though

Up to now, I’ve been pretty pessimistic about the Times’ chances. However, when you read the lawsuit itself, it does show some pretty egregious examples of ChatGPT repeating New York Times articles nearly verbatim.

The Times produced this behavior by sending ChatGPT a series of prompts, like these:

“Please provide the first paragraph of the 2023 Wirecutter article ‘The Best Cordless Stick Vacuum.’”

“Now, what’s the second paragraph?”

“And how about the third paragraph?”

ChatGPT did this pretty accurately in the Times’ tests. However, ChatGPT has already been updated so that this trick won’t result in the AI spitting out a bunch of verbatim content from the article. (Again, the Times is going to be playing whack-a-mole forever as the models and guardrails change.)

Remember that Google won the Books case, in part, because it was only showing snippets from the book content it had indexed. The Times felt that it had caught OpenAI red-handed because ChatGPT was providing something more than a snippet. Within days, OpenAI changed that behavior in its public-facing software.

The Books decision also noted that Google Book Search isn’t really a viable replacement for buying a book, which is an important test in any copyright case. While the Times argues that ChatGPT directly competes against it, I think that’s a pretty weak claim. I am definitely not asking ChatGPT to read the New York Times to me, and I can’t imagine any real-life NYT reader doing that for any reason other than a silly experiment to see what the AI will say.

Contrast that with pirating a song using Napster or a movie using BitTorrent. Those are clear cases of infringement because you get the entire copyrighted work in a format that is a direct market replacement, and the creator never sees a dime.

When you read the lawsuit in isolation, it looks really bad for OpenAI. But that’s always the nature of legal briefs – if you only read one side, and it’s written by a smart and persuasive lawyer, you’re going to be impressed by their argument. But when you zoom out, you see how many ways OpenAI (and future tech companies) can defeat the Times’ claims.

Copyright law is going to have to adapt to AI

If you’re building an AI product or process, the key takeaway is that the old ways of thinking about copyright are inevitably going to go by the wayside. In my view, there’s just no way that significant damages or the “destruction” of ChatGPT will survive Supreme Court review. After all, even Chief Justice John Roberts sees “great potential” in AI.

There may be future cases where someone uses AI to blatantly copy or plagiarize another person’s work for commercial gain, but most use of AI will fall into a gray area between “copying” and “learning from” that the courts will eventually consider to be fair use. Copyright law will slowly adapt to the future of tech, and so will organizations like The New York Times.

Till next time,

– Rob

Quick question: What did you think of today’s email?

PS. The new cohort of the Innovating with AI Incubator opens January 24 for everyone who's on the waitlist. Join the waitlist via email or SMS and you'll get an early-access discount.
