OpenAI Accused of Deleting Key Data in Copyright Lawsuit

In their ongoing legal battle against OpenAI, attorneys for The New York Times and Daily News have accused the company of erasing potentially important evidence in the case. The publishers claim that OpenAI illegally used their content to train its AI models. Earlier this fall, OpenAI agreed to provide two virtual machines so that the plaintiffs’ legal teams and hired experts could search its training datasets for their copyrighted works. The searches were meant to help determine whether the publishers’ articles were included in OpenAI’s training data. Since November 1, the teams have spent more than 150 hours on these searches.

However, according to a filing submitted to the U.S. District Court for the Southern District of New York, OpenAI developers unintentionally deleted all of the search data stored on one of the virtual machines on November 14. OpenAI attempted to recover the data, but the recovered files had lost their folder structure and file names, so the plaintiffs could not use them to determine where their content may have been used in OpenAI’s models. The plaintiffs’ lawyers stressed that they do not believe the deletion was intentional. They contend, however, that the incident underscores the need for OpenAI to search its own datasets for potentially infringing content.