OpenAI and Microsoft have been named as defendants in yet another class action lawsuit over their alleged use of web scraping techniques to obtain supposedly private data to train ChatGPT and other associated artificial intelligence models.
The most recent class action suit was filed on Sep. 5 in San Francisco by a law firm representing a pair of unnamed engineers.
According to a filing registered with the United States District Court for the Northern District of California:
“This class action lawsuit arises from Defendants’ unlawful and harmful conduct in developing, marketing, and operating their AI products, including ChatGPT-3.5, ChatGPT-4.0, Dall-E, and Vall-E (the ‘Products’), which use stolen private information, including personally identifiable information, from hundreds of millions of internet users, including children of all ages, without their informed consent or knowledge.”
The lawsuit goes on to complain that OpenAI “doubled down on a strategy to secretly harvest massive amounts of personal data from the internet” after restructuring in 2019.
“Without this unprecedented theft of private and copyrighted information belonging to real people,” write the plaintiffs, “the products,” referring to ChatGPT, DALL-E and OpenAI’s other models, “would not be the multi-billion-dollar business they are today.”
According to the filing, the plaintiffs are asking the courts to award damages to themselves and to any members of the proposed classes, which could conceivably include anyone whose information was allegedly scraped.
The suit also asks the courts to order the defendants to conduct “nonrestitutionary disgorgement” of profits made as a result of the allegedly illegal scraping of data.
Scraping is the practice of using an automated bot, often called a “crawler,” to collect data from the internet. This most recent suit alleges that OpenAI and Microsoft knowingly engaged in “illegal” scraping activity.
A previous class action lawsuit making nearly identical claims against OpenAI and Microsoft was filed in the same court district on June 28. It’s unclear at this time if the court or defendants in the separate cases would consider combining the suits.
This isn’t the first time Microsoft’s been involved in a lawsuit over alleged scraping. The Redmond company issued a cease and desist order on behalf of its LinkedIn brand to data analytics company HiQ in 2019 over its admitted data scraping practices.
In that case, Microsoft and LinkedIn alleged that HiQ had violated the terms of service agreement required to log in to the LinkedIn website and thus have access to user data. Initially the circuit court ruled in favor of HiQ but, upon Microsoft’s appeals, the Supreme Court vacated the judgment.
The case was then kicked back down to the circuit court, where Microsoft found itself on the winning side. HiQ agreed to a settlement with Microsoft for an undisclosed amount and was ordered to cease its scraping activities.
Microsoft and OpenAI did not immediately respond to requests for comment.