A new chapter is beginning in the battle over artificial intelligence and the training of AI technologies: Today, The New York Times, one of the most influential media organizations in the world, is suing OpenAI and Microsoft, the partners most responsible for creating ChatGPT, a generative AI chatbot that uses large language models, trained on massive data sets, to produce text and other content in a humanlike manner.
The newspaper claims that "millions of articles published by The Times were used to train automated chatbots that now compete with the news outlet as a source of reliable information," according to a news story in The New York Times.
The story in the Times also says that the newspaper is the "first major American media organization to sue the companies...over copyright issues associated with its written works."
The suit, which was filed earlier today in Federal District Court in New York City, does not specify the monetary damages it seeks from OpenAI or Microsoft. However, it does state, "This action seeks to hold them responsible for the billions of dollars in statutory and actual damages that they owe for the unlawful copying and use of The Times’s uniquely valuable works."
The complaint notes that The New York Times had been in talks with OpenAI and Microsoft, but the disputes have yet to be resolved. The story in the Times also said that Lindsey Held, a spokeswoman for OpenAI, had stated that the company had been “moving forward constructively” in conversations with the newspaper. However, she said OpenAI was “surprised and disappointed” by the lawsuit.
When can companies legally use content for AI under the “fair use” doctrine?
One of the key questions at the heart of the dispute is whether OpenAI and Microsoft could legally use stories published by The New York Times, under the "fair use" doctrine, to train chatbots like ChatGPT.
In the complaint, the Times stated, "Publicly, Defendants insist that their conduct is protected as 'fair use' because their unlicensed use of copyrighted content to train GenAI models serves a new 'transformative' purpose. But there is nothing 'transformative' about using The Times’s content without payment to create products that substitute for The Times and steal audiences away from it."
What's surprising is that this is the second big story related to generative AI, big tech and publishing that broke this week:
Earlier this week, The New York Times published a different story on how Apple had spent the last several weeks trying to strike deals with a number of high-profile publishers for access to their material, which it would use to train Apple’s generative AI systems. Those systems could then create original content, of sorts. The Times story stated that Apple could be spending as much as $50 million to license those articles from various news organizations.
One key takeaway is the difference between seeking permission and relying on the "fair use" doctrine: In contrast to how OpenAI and Microsoft went about building ChatGPT, Apple appears to have decided to set up partnerships first, perhaps because it did not believe the "fair use" doctrine would allow it to take content off the internet to train its own generative AI models.