Reddit might be signing a content deal to help train AI

The mammoth social media company’s user-generated content could be used to train AI models, according to insiders.

The company that calls itself the “front page of the internet” has reportedly signed a deal with an unnamed AI company to use its content as training material.

Bloomberg reports that “people familiar with the matter” say Reddit has already informed prospective investors of the deal ahead of its initial public offering.

“The San Francisco-based firm told prospective investors in its IPO that it had signed the deal, worth about $60 million on an annualised basis, earlier this year, the people said,” Bloomberg reports.

“Reddit’s agreement with an unnamed large AI company could be a model for future contracts of a similar nature, one of the people said.”

According to Bloomberg’s sources, while Reddit’s revenue last year was US$800 million – a 20 per cent year-on-year increase – the AI deal will “help Reddit tap into investors’ enthusiasm for the technology and boost its IPO”.

However, the deal appears to be subject to change and Reddit is declining to make any comment on the apparent partnership.

What’s in it for the AI?

With a huge pool of commentary and discussion on nearly every topic under the sun, this seems like a great deal for whatever AI company might be involved.

But it’s often not exactly unbiased information. Reddit’s relative anonymity fosters a culture of unfiltered expression, resulting in a plethora of biased, offensive, or outright false information. An AI trained on such data could internalise and perpetuate these biases, potentially leading to harmful outcomes.

Plus, Reddit’s voting system can amplify popular opinions over factual ones. Thus, the AI might prioritise regurgitating popular but incorrect notions instead of providing accurate information.

Reddit’s also home to countless niche communities, each with its own jargon, memes, and cultural nuances. Training on such diverse data might result in an AI that struggles to comprehend or communicate effectively outside of Reddit’s ecosystem.

Any AI trained on Reddit content could well be a master of casual, online conversation and an array of trending topics, or it could just lead to the worst of the site’s bad habits getting rolled into one very fractured personality.

It’s certainly a big deal; we’re just not sure if it’s a good one.

David Hollingworth

David Hollingworth has been writing about technology for over 20 years, and has worked for a range of print and online titles in his career. He is enjoying getting to grips with cyber security, especially when it lets him talk about Lego.
