AI & News: Publishers’ Guide to Monetizing Content in a Changing Market
The emerging market for news content, fueled by the rapid development of artificial intelligence, is beginning to take shape, but significant challenges remain for publishers hoping to capitalize on it. A recent study tour in San Francisco, the heart of the AI ecosystem, revealed a clearer, though still uncertain, picture of the demand for news content, pricing structures, and the types of content that will command premium value. While the “Wild West” days of unchecked scraping and opaque pricing are receding, publishers must now prioritize specific investments to secure their position in this evolving landscape.
The shift comes after a period of chaotic, largely unregulated access to news content by AI developers, when widespread scraping occurred without permission or clear pricing mechanisms. Early 2026, however, signals a turning point, with functioning marketplaces beginning to emerge. This transition calls for a proactive approach from news organizations, focused on managing access, structuring data, adapting to changing demands, and tracking content usage.
Managing the Flow: Controlling Automated Access
A fundamental imperative for news publishers is to gain control over automated access to their content. The unchecked proliferation of bots and scrapers poses a significant threat to potential revenue streams. Ana Jakimovska, Head of AI strategy for Mediahuis, highlighted the scale of the problem, stating that her company blocks 100,000 bots and scrapers on its largest sites, with minimal impact on overall traffic – a reduction of only low single-digit percentages. WAN-IFRA reports on this growing challenge.
Content Delivery Networks (CDNs) like Cloudflare, Akamai, and Fastly offer services to help publishers manage traffic and restrict bot access. The traditional value exchange, in which search engines crawled sites and sent referral traffic in return, is shifting. Sam Else, Cloudflare’s Senior Director of Strategic Partnership for Media, Creators and AI, noted a significant change: ten years ago, Google returned one referral for every two crawls; now, it’s one for every five. For Perplexity, that ratio is one referral for almost 155 crawls, and for Anthropic, it’s one for over 28,000 crawls as of January 2026. Cloudflare’s Radar service provides detailed data on these trends.
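The crawl-to-referral ratio quoted above is simple arithmetic that a publisher could reproduce from its own traffic logs. The following is a minimal sketch in Python, assuming the logs have already been reduced to (operator, kind) pairs; the function name and the event format are hypothetical and are not taken from Cloudflare Radar or any CDN product.

```python
from collections import Counter


def crawl_to_referral_ratios(events):
    """Return crawls per referral for each operator.

    `events` is an iterable of (operator, kind) pairs, where kind is
    "crawl" or "referral". This input shape is an assumption made for
    illustration; real log formats differ by vendor.
    """
    crawls = Counter()
    referrals = Counter()
    for operator, kind in events:
        if kind == "crawl":
            crawls[operator] += 1
        elif kind == "referral":
            referrals[operator] += 1

    ratios = {}
    for operator, crawl_count in crawls.items():
        referral_count = referrals[operator]
        # An operator that crawls but never refers gets an infinite ratio.
        ratios[operator] = (
            crawl_count / referral_count if referral_count else float("inf")
        )
    return ratios


# Made-up sample data: 10 crawls and 2 referrals for one operator,
# 3 crawls and no referrals for another.
sample = (
    [("search-bot", "crawl")] * 10
    + [("search-bot", "referral")] * 2
    + [("ai-bot", "crawl")] * 3
)
ratios = crawl_to_referral_ratios(sample)
```

With this sample, "search-bot" comes out at five crawls per referral, the same shape as the "one for every five" figure cited for Google, while "ai-bot" has no referrals at all.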
This shift in the internet’s dynamics underscores the need for publishers to negotiate terms for content access. People Inc.’s success in securing a licensing deal with Microsoft, facilitated by using Cloudflare to control bot access, demonstrates the potential of this approach. Emerging protocols like Really Simple Licensing (RSL) and the IAB’s Content Monetisation Protocols (CoMP) offer machine-readable licensing and payment instructions, though they don’t replace bot management services. RSL can even be integrated with CDN solutions, as noted by Fastly. The RSL Collective is actively promoting this standard.
Structuring for Value: Cataloging Content for Machine Readability
Beyond controlling access, publishers must prioritize cataloging and structuring their content for machine readability. AI labs are increasingly recognizing the value of high-quality news content, but only if it’s readily accessible in a usable format. Brooke Hartley Moy, CEO and co-founder of Infactory, emphasized that “AI progress is bottlenecked by data, not models or compute.” She added that AI builders are facing a scarcity of “licensed and annotated content,” which closely resembles the work of journalism.
The demand for structured content, delivered through APIs or JSON feeds, is growing. Companies like Infactory and Protege offer services to structure content for a fee, taking a cut of the revenue generated from licensing deals. Publishers are beginning to view themselves as serving two distinct audiences: humans, who consume content in traditional formats, and machines, which require streams of data. Madhav Chinnappa, formerly of the AP, BBC, and Google, and currently at the Reuters Institute, put it bluntly: “If people don’t realise that structured data is a tablestake, they are already out of the game.”
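To make "structured content delivered through JSON feeds" concrete, here is a minimal sketch in Python of one article serialized as a machine-readable record. Every field name, including license_terms, is a hypothetical choice for illustration; the article does not describe the schema used by Infactory, Protege, or any AI lab.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ArticleRecord:
    # All fields are illustrative assumptions, not an industry schema.
    article_id: str
    headline: str
    published_at: str  # ISO 8601 timestamp as a string
    body: str
    topics: list = field(default_factory=list)
    license_terms: str = "unspecified"


def to_json_feed(records):
    """Serialize records into a single JSON document with an "items" list."""
    payload = {"items": [asdict(record) for record in records]}
    return json.dumps(payload, ensure_ascii=False, indent=2)


# Made-up example record.
feed = to_json_feed(
    [
        ArticleRecord(
            article_id="example-001",
            headline="Example headline",
            published_at="2026-01-15T09:00:00Z",
            body="Example body text.",
            topics=["finance"],
        )
    ]
)
```

The point of the sketch is the separation of audiences described above: the same story that a human reads as a page becomes, for a machine, a record with explicit metadata that can be filtered by topic or date.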
Adapting to Demand: From Training Data to Domain Specificity
The nature of demand from AI labs is also evolving. Initially, the focus was on general training data, but now there’s a growing need for specific data to fine-tune models or for domain-specific applications. This shift creates opportunities for publishers with specialized content. Factors that drive premium pricing include rarity, clear IP control, quality (broadcast-quality video commands higher returns), domain-specific expertise (finance, law, sports), and continuity of coverage.
Structuring content can increase its value by 10 to 30 times, according to Hartley Moy. Content that wasn’t originally published – such as audio, video, and images from archives – is now gaining value. The demand is also increasing for “grounding” or “inference” data – essentially, the factual basis of reporting. This trend is particularly relevant as AI companies move beyond general models and focus on lucrative verticals serving enterprise customers, such as Perplexity’s focus on professionals who require accurate information for decision-making.
Pricing Clarity: A Maturing Marketplace
Pricing for content is becoming more transparent, particularly for audio and video. Dave Davis, Chief Content Officer for Protege, noted that pricing has stabilized: swings of +/- 1000% two and a half years ago have narrowed to fluctuations of around 50%. The shift from general training to fine-tuning and specialized data has also driven up prices, with potential increases of 5 to 25 times. Protege typically takes a 35% cut of revenue from licensing deals and is increasingly offering 20-year licenses instead of perpetual ones.
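The effect of an intermediary's cut on a publisher's take is easy to work through. The sketch below, in Python, uses the 35% figure cited above as its default; the function name and the choice to work in whole minor units (such as cents) are assumptions made for illustration, and real contracts may calculate shares differently.

```python
def publisher_share(deal_value_minor_units, intermediary_cut_percent=35):
    """Return the publisher's share of a licensing deal.

    Amounts are integers in minor currency units (for example, cents)
    to avoid floating-point rounding. Any remainder from the division
    is rounded down.
    """
    if not 0 <= intermediary_cut_percent <= 100:
        raise ValueError("cut must be between 0 and 100 percent")
    return deal_value_minor_units * (100 - intermediary_cut_percent) // 100


# A hypothetical deal worth 100,000.00 currency units, expressed in cents.
share = publisher_share(10_000_000)
```

For this hypothetical deal, a 35% cut leaves the publisher with 65,000.00 of the 100,000.00 total.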
The market is evolving into a multi-layered system, encompassing how data is used (training, fine-tuning, grounding), compensation models (pay-per-crawl, pay-per-use, long-term licensing), and a hybrid approach combining bilateral deals with collective industry action. Companies like Tollbit (pay-per-crawl) and ProRata (pay-per-use) are emerging as key players. Microsoft’s Publisher Content Marketplace (PCM) and potential similar initiatives from Amazon signal a growing commitment from tech giants to establish formal marketplaces for news content. Microsoft’s announcement regarding the expansion of its PCM highlights this trend.
Collective Action: A Path to Leverage
The Danish Press Collective Management Organisation, representing 99% of Danish news publishers, provides a model for collective action. The organization has secured licensing deals with Microsoft and ProRata, demonstrating the power of unified negotiation. Similar initiatives are gaining momentum across Europe, with the launch of Spur – the Standards for Publisher Usage Rights coalition – by The Guardian, BBC, Financial Times, and Sky News. The goal of Spur is to establish shared standards and responsible licensing frameworks. As Mediahuis’ Jakimovska put it, “We need to unite. That is where I think one can win.”
The path forward for news publishers in the age of AI requires a proactive and collaborative approach. By managing access, structuring data, adapting to evolving demands, and embracing collective action, publishers can position themselves to not only survive but thrive in this rapidly changing landscape. The emerging market presents both challenges and opportunities, and those who act decisively will be best positioned to reap the rewards.
