Cloudflare is helping publishers by stopping AI crawlers

As of December 2024, around 20 per cent of the web ran through Cloudflare’s network, along with Internet properties such as mobile apps, APIs, AI workflows and corporate networks

File picture of Matthew Prince, co-founder and CEO of Cloudflare. Picture: Getty Images

Mathures Paul
Published 04.07.25, 10:52 AM

Cloudflare, a major Internet infrastructure company, has announced that it will block known AI web crawlers by default to stop them from “accessing content without permission or compensation”.

“If the Internet is going to survive the age of AI, we need to give publishers the control they deserve and build a new economic model that works for everyone — creators, consumers, tomorrow’s AI founders, and the future of the web itself,” said Matthew Prince, co-founder and CEO of Cloudflare.

Why is Cloudflare important?

As of December 2024, around 20 per cent of the web ran through Cloudflare’s network, along with Internet properties such as mobile apps, APIs, AI workflows and corporate networks. It has millions of customers, including Fortune 500 companies.

How does it help if AI web crawlers are blocked?

Data is of paramount importance to AI systems. Companies like OpenAI, Google and Anthropic are building AI models that require data to train on. The most prized data comes from respected publishers. Training on such data helps AI models answer user queries accurately and generate images and videos.

Publishers and authors have pointed fingers at AI companies and have dragged them to court for using their material without payment or permission.

On the other hand, some publishers have entered licensing deals with AI companies that bring in money in exchange for allowing AI models to train on their content.

As tools like AI Mode from Google and ChatGPT from OpenAI become more popular, answers to your queries increasingly appear as summaries generated by models trained on data from different websites. At times, links to the sources are listed on one side of the screen, but most people tend not to click on them, and that cuts into publishers’ income.

For decades, search engines have indexed content and then shown relevant links to queries, allowing traffic to your website. This is changing.

The move from Cloudflare resets the relationship between website owners and AI companies, giving publishers greater control over their content.

“Original content is what makes the Internet one of the greatest inventions in the last century, and it’s essential that creators continue making it. AI crawlers have been scraping content without limits. Our goal is to put the power back in the hands of creators while still helping AI companies innovate. This is about safeguarding the future of a free and vibrant Internet with a new model that works for everyone,” said Prince.

How will the new strategy be implemented?

From now on, the owner of every new web domain that signs up to Cloudflare will be asked whether to allow AI crawlers. If you answer ‘no’, AI crawlers will not be able to go through your website and your data will not be scraped.

Think of it as something like Apple’s ‘Ask App Not to Track’ option, except that with Cloudflare it is about AI bots. Cloudflare will also allow publishers to charge AI crawlers for access using a new “pay per crawl” model.

The new policy builds on a tool Cloudflare launched in September 2024 that gave publishers the ability to block AI crawlers with a single click. Now, Cloudflare is making this the default for all websites it provides services for.

Isn’t there already a way to stop crawlers?

Crawlers are expected to obey the directions a website lists in its robots.txt file. The file states whether a bot may crawl the site, but compliance is voluntary and some AI companies have been ignoring these instructions.
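
The robots.txt check that a well-behaved crawler is expected to run before fetching a page can be sketched in Python with the standard library's urllib.robotparser module. This is only an illustration: the rules, the example.com address and the can_crawl helper are assumptions made for the example, and GPTBot appears merely as a sample crawler name. Nothing here describes how Cloudflare itself blocks bots.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: one AI crawler is refused, everyone else is allowed.
# The rules below are assumptions for this sketch, not any real site's policy.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def can_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/news/story"
    for agent in ("GPTBot", "SomeSearchBot"):
        verdict = "allowed" if can_crawl(agent, page) else "blocked"
        print(f"{agent}: {verdict}")
```

The check runs entirely on the crawler's side, so a bot that skips it faces no technical barrier. That is why blocking at the network level, as Cloudflare proposes, is a different kind of control.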

Cloudflare already has a bot verification system in which AI web crawlers need to tell websites who they work for and what they want to do. But not all AI bots are honest. According to MIT Technology Review, Cloudflare plans to draw on its experience of dealing with coordinated denial-of-service attacks from bots to stop crawlers that misrepresent themselves.
