Sustainable AI Is the Only Path for Publishers
By Erik Matlick

Online publishers, whether they know it or not, are facing another existential crisis, perhaps even greater than those of the past. Publishing is based on a simple formula: Content creation leads to audience aggregation, which leads to monetization.
The rise of generative artificial intelligence (AI) and the large language models (LLMs) that drive these tools has put all of these elements on the chopping block. Generative AI has the power to disrupt content, audience, and monetization, which forces publishers to take a close look at the technology. The biggest question publishers face is whether generative AI is an opportunity or a threat.
Cutting Costs Through AI
Some publishers are already harnessing this new technology to create content on a massive scale. Instead of a staff of expensive writers, publishers can now take press releases, hire a handful of editors, and use AI to write the content. This creates a large volume of cheap content that is then indexed by search engines and algorithms.
On the flip side, brands that relied on publishers to create and promote that content now have the power to become publishers in their own right. These brands can feed press releases and unpolished content directly into AI tools and get back a cleaner, refined product. That's one revenue stream down for publishers.
Changing User Behavior
While there are pros and cons to publishers and brands adopting AI, the bigger question is just how much consumers will make use of these new tools. Consumers may turn to generative AI tools instead of search engines or websites, getting all the information they need without ever visiting the publisher page that originated the answer.
In many ways, consumer behavior has already changed, due to search engines enhancing their results pages over the past several years. Research indicates that anywhere from 25 percent to 50 percent of searches are answered on-page without a consumer ever clicking to a publisher website. Generative AI could accelerate this change. Search limits the character counts and page previews that consumers can see, while generative AI doesn't. Even factoring in the friction of incorrect or "hallucinated" answers from AI, if users can get more detailed answers without having to click around to other pages, why wouldn't they?
Trained on What?
Large language models work only as well as the data used to train them, and that data has a massive impact on the outcomes. In other words, these models need to analyze accurate content procured from high-quality publishers to work. Research indicates that a large portion of the data used to train generative AI systems like ChatGPT comes from web content extracted by Common Crawl, a non-profit organization with a large archive of web content. These systems likely make use of a repurposed search crawler as well.
As is the case with most proprietary technology, there's no insight into how the technology weights different data sets. Given that crawlers clearly contribute to the learning, the best path forward is likely regulating the crawlers themselves.
A Broken Value Exchange, Again
Generative AI stands to disrupt the publishing industry, but it also needs the existing publishing model to survive. AI players are reliant on a healthy publisher ecosystem that generates researched, edited, verified content. The output of an LLM is dependent on the content inputs that train it. Further, the issue of bias presents problems not only with accuracy but also with commercial viability.
By forcing crawlers to identify themselves and declare their intentions on a page, publishers can decide whether they want their content ingested by the algorithms. Regulators are already looking closely at this.
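To make the idea concrete, here is a minimal sketch of how a publisher-side rule might look using the existing, voluntary robots.txt convention, checked with Python's standard library. It is an illustration under stated assumptions, not the regulated opt-in mechanism discussed here: the robots.txt contents, the publisher.example URL, and the ExampleSearchBot name are hypothetical, and "CCBot" is the user agent that Common Crawl's crawler identifies itself with. Compliance with robots.txt depends on the crawler choosing to honor it, and the file says nothing about what a crawler intends to do with the content.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve: it asks Common Crawl's
# crawler (user agent "CCBot") to stay out while allowing everyone else.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def can_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical article URL and crawler names, for illustration only.
    url = "https://publisher.example/articles/story.html"
    for agent in ("CCBot", "ExampleSearchBot"):
        print(agent, "allowed" if can_crawl(agent, url) else "blocked")
```

A rule like this only works when a crawler announces a distinct, truthful user agent, which is why mandatory self-identification matters to the argument.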
This could lead to a new revenue stream for publishers while also ensuring AI sustainability. Generative AI is a massive innovation, and publishers would be unwise to ignore it completely. But they should also be remunerated for contributing to the algorithms. If LLM and crawler regulation brings an opt-in mechanism, the AI players should compensate publishers for opting in and opening access to copyrighted content. Without this access, the AI will fail. Without revenue, publishers will fail. It's a clear win-win that ensures a sustainable future.
Learning from the Past
While generative AI is an existential threat to publishing, it's far from the first challenge publishers have faced. Publishers are already dealing with the lost revenue from ad blocking, changes to social algorithms that affect their traffic, and the aforementioned issues with enhanced search results pages. Meanwhile, programmatic buying has made it possible for advertisers to target publisher audiences at steep discounts.
Publishers are taking all these issues seriously, but in many cases, they may have waited a bit too long before banding together to advocate for solutions. It's important that the publishing industry deals with generative AI right now, so that it can protect its traffic, revenue, and content, while also forging alliances with the generative AI tools themselves. Only through sustainable AI that is transparent, consent-based, and grounded in fair value will the ecosystem survive.
The views and opinions expressed are solely those of the contributor and do not necessarily reflect the official position of the ANA or imply endorsement from the ANA.
Erik Matlick is CEO and co-founder at Bombora.