Reddit Signs $60 Million Deal to Scrape Your Online Community for AI Parts: Report

Reddit reportedly signed a $60 million deal with a “large AI company” to allow its online communities to be scraped for AI training data, according to Bloomberg on Friday. The unnamed AI company will sift through millions of posts on Reddit, and train a large language model on Reddit’s threads.


Reddit is reportedly weighing an IPO with a $5 billion valuation, despite only bringing in $800 million in revenue last year. Reddit is not profitable but has a rich valuation because its online communities offer a perfect training ground for AI models. However, licensing out your user base’s thoughts and ideas is not always well received. The most popular subreddits went dark in protest last year after users took issue with the company charging for access to its application programming interface (API), a change first announced in April 2023.

Reddit’s reported deal with an “unnamed large AI company” is exactly what the platform has been looking for. Big Tech is hungry for data, and that has turned legacy news organizations, community forums, and even the University of Michigan into mere content farms. These deals, though upsetting to users, offer Reddit a path to profitability.

“The Reddit corpus of data is really valuable,” said Reddit CEO Steve Huffman to The New York Times in April. “But we don’t need to give all of that value to some of the largest companies in the world for free.”

But when Reddit started charging for API access, it didn’t just charge big companies; it also started charging small, independent researchers. This shift made it more difficult for Reddit’s moderators to manage their communities, and some argued it made for a worse experience for Reddit’s 800 million monthly active users.

“We believe that the longevity and success of this platform rest on preserving the rich ecosystem that has developed around it,” said Reddit moderators in a collective letter from last June. “The potential loss of these services due to the pricing change would significantly impact our ability to moderate efficiently, thus negatively affecting the experience for users in our communities.”

Reddit did not immediately respond to Gizmodo’s request for comment.

Apple was exploring $50 million AI deals with The New York Times, Condé Nast, and other news publishers in December. Shutterstock is also licensing its human-made content to OpenAI to train its models. Twitter, Instagram, and YouTube have also become increasingly valuable in recent years, as they’re now seen as content gold mines.

Reddit also introduced ads in recent years and made it impossible for users to opt out of seeing advertiser content in 2023. As Reddit becomes a public company, there’s a growing concern among users that management will hurt the thriving community forum it has built.

There’s also a bigger concern about how AI companies are licensing data. Content platforms are signing million-dollar licensing agreements with AI companies, but the actual people who created this content aren’t getting a thing. Meanwhile, AI threatens to replace content creators in the editorial, graphic design, and film industries.

via Gizmodo

February 20, 2024 at 09:45AM