Apr 19, 2023

Reddit to Introduce Pricing Tiers for API Access amid AI Training Concerns

Reddit plans to charge companies for access to its API over concerns that it is being used to train AI chatbots. The company will offer pricing tiers for different kinds of businesses, each with its own usage limits and usage rights.

Reddit, a popular platform for social news aggregation and conversation, has announced plans to charge companies for access to its API. The decision stems from concerns about businesses utilizing the API to train large language models (LLMs), particularly AI chatbots.

The company plans to offer several pricing tiers to accommodate businesses of different sizes, with each tier carrying its own usage limits and usage rights. Reddit has yet to release specific pricing details. Its extensive collection of data has long been recognized as a valuable resource for AI training.

Steve Huffman, Founder and CEO of Reddit, said in an interview with The New York Times: “The Reddit corpus of data is really valuable, but we don’t need to give all of that value to some of the largest companies in the world for free.”

Demand for AI, once a niche technology, has skyrocketed in recent years, and there is speculation that Reddit may go public soon. By capitalizing on this new revenue stream through its API, Reddit could be positioning itself for a successful initial public offering (IPO).

Reddit is not the only source of data for LLM training. Web crawlers like Common Crawl collect billions of web pages monthly and offer the raw data to AI enterprises. Raw data, consisting of large pools of online information, differs from Reddit's content, which is primarily human-generated discussion. For AI models to become more factually accurate and better emulate human behavior, they require access to both types of data.

A study by Andy Baio and Simon Willison, which analyzed 12 million of the 2.3 billion images used to train the text-to-image model Stable Diffusion, found that the model utilized images from Common Crawl. Many images scraped by Common Crawl originate from websites with user-generated content. Getty Images, a stock image service, sued Stable Diffusion creator Stability AI for alleged copyright infringement earlier this year.

Reddit's API has diverse applications beyond AI chatbot training. For instance, it is used to develop and maintain content moderation tools. To cover this use case, Reddit plans to create dedicated moderation tools in the form of iOS and Android apps. These apps are intended to remove the need for content moderators to access the API, and they will include features such as mod logs, rules management tools, and mod queue information.

As no-code and low-code platforms like AppMaster become increasingly popular, access to data from platforms like Reddit could prove invaluable in the ongoing development of AI and machine learning models. Utilizing the AppMaster platform, users can create web, mobile, and backend applications, making the development process faster and more cost-effective for both small businesses and enterprises.

With Reddit's decision to charge for API access, companies in the AI and machine learning sectors will need to reevaluate how they source data for training their LLMs.
