Automattic, the parent company of sites like WordPress and Tumblr, is in talks to sell content from its platforms to artificial intelligence companies such as Midjourney and OpenAI for training purposes, according to a new report from 404 Media on Tuesday. While details of the deal remain unclear, Automattic is trying to reassure users that they can opt out at any time.
404 Media reported internal conflict at Automattic because some of the content scraped for the AI companies included private content that the company was not meant to retain. To further complicate matters, advertising content that did not belong to Automattic, including ads from an old Apple Music campaign, reportedly made its way into the training data set.
Automattic’s plans were reportedly so controversial internally that one product manager even began pulling his own photos from Tumblr to ensure they weren’t used to train AI, according to 404 Media.
Generative AI has become big business since OpenAI first launched ChatGPT in late 2022, followed by many companies launching image generators driven by text prompts. The technology works by “training” on large amounts of data so that it can generate new video, images or text. But major publishers have complained, and some have even filed lawsuits, claiming that much of the data used to train these systems is either pirated or does not constitute “fair use” under the existing copyright regime.
Automattic plans to roll out a new setting as early as Wednesday that will allow users to opt out of having their content used for AI training, 404 Media reported, but it’s unclear whether the setting will be turned on or off by default for most users. WordPress competitor Squarespace rolled out a similar setting last year that lets users opt out of having their data used to train artificial intelligence.
In response to questions emailed on Tuesday, Automattic directed Gizmodo to a newly published post that more or less confirms 404 Media’s reporting while trying to sell the move to consumers as an opportunity to have “more control over the content you create.”
“Artificial intelligence is rapidly changing nearly every aspect of our world, including the way we create and consume content. At Automattic, we’ve always believed in a free and open web and personal choice. Like other tech companies, we’re paying close attention to these advancements, including how to work with AI companies in a way that respects user preferences,” the post reads.
But the lengthy statement comes across as incredibly defensive, noting that “there is no legal requirement for crawlers to follow these preferences” and implying that the company is simply following industry best practices in giving users the option to choose whether they want their content used to train AI.
“We want to provide you with tools that grant you as much control as possible, regardless of geography,” Automattic’s statement reads. It adds that because respected companies do follow these settings, they are the best available way to govern how content on the web is crawled.
“Our partners will respect all opt-out settings. We also plan to go a step further and regularly update all partners with information about new opt-outs and request that their content be removed from past sources and future training.”