In an industry-first move, AI bots are now blocked by default, with site owners given the option to opt in.
Internet infrastructure provider Cloudflare has announced that, starting today (1 July), it will block AI crawler bots from accessing website content by default.
Site owners will be able to decide whether to opt into allowing crawlers to access their content, while AI companies can now state the purpose of their bots – whether they’re used for training, inference, or search.
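Cloudflare has not published implementation details alongside the announcement, but the underlying idea – deny known AI crawlers by default, with an explicit, purpose-based opt-in controlled by the site owner – can be sketched in a few lines. The Python below is purely illustrative and is not Cloudflare’s mechanism: the should_block helper, the opt-in set, and the purpose labels are simplified assumptions used only to show the default-deny logic.

# Illustrative sketch only -- not Cloudflare's implementation.
# Default-deny for known AI crawlers, with a site-owner-maintained
# opt-in keyed on the crawler's declared purpose.

KNOWN_AI_CRAWLERS = {
    "GPTBot": "training",         # purpose labels are simplified examples
    "ClaudeBot": "training",
    "PerplexityBot": "search",
}

# Hypothetical site-owner policy: purposes the owner has opted in to allow.
ALLOWED_PURPOSES = {"search"}

def should_block(user_agent: str) -> bool:
    """Block by default unless the crawler's declared purpose is opted in."""
    for bot_name, purpose in KNOWN_AI_CRAWLERS.items():
        if bot_name.lower() in user_agent.lower():
            return purpose not in ALLOWED_PURPOSES
    return False  # not a recognised AI crawler; allow ordinary traffic

if __name__ == "__main__":
    print(should_block("Mozilla/5.0 (compatible; GPTBot/1.0)"))         # True: training not opted in
    print(should_block("Mozilla/5.0 (compatible; PerplexityBot/1.0)"))  # False: search opted in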
“If the internet is going to survive the age of AI, we need to give publishers the control they deserve and build a new economic model that works for everyone – creators, consumers, tomorrow’s AI founders, and the future of the web itself,” Matthew Prince, co-founder and CEO of Cloudflare, said in a 2 July statement.
“Original content is what makes the internet one of the greatest inventions in the last century, and it’s essential that creators continue making it. AI crawlers have been scraping content without limits. Our goal is to put the power back in the hands of creators, while still helping AI companies innovate. This is about safeguarding the future of a free and vibrant internet with a new model that works for everyone.”
Roger Lynch, CEO of publishing giant Condé Nast, called the move “a critical step towards creating a fair value exchange on the internet that protects creators, supports quality journalism and holds AI companies accountable”.
“Cloudflare’s innovative approach to block AI crawlers is a game changer for publishers and sets a new standard for how content is respected online. When AI companies can no longer take anything they want for free, it opens the door to sustainable innovation built on permission and partnership,” Lynch said.
Condé Nast is just one of many publishers backing the shift to a permission-based approach to AI crawling; ADWEEK, the Associated Press, BuzzFeed, and many others have also come out in support of Cloudflare’s move.
Reddit co-founder and CEO Steve Huffman said: “The whole ecosystem of creators, platforms, web users and crawlers will be better when crawling is more transparent and controlled, and Cloudflare’s efforts are a step in the right direction for everyone.”
“AI companies, search engines, researchers, and anyone else crawling sites have to be who they say they are. And any platform on the web should have a say in who is taking their content for what.”
Ray Canzanese, director of Netskope Threat Labs, compared the current state of AI development to a popular video game character.
“The rapid development of AI has seen it behaving a little like Pac-Man – consuming everything in its path. We are now finally starting to see a correction, where organisations, who are realising exactly how much value lies within their data, are regaining control over it,” Canzanese told Cyber Daily.
“Whether it be publicly viewable, such as pricing intelligence on online retailer sites, copyrighted but ungated, for instance a digital newsletter, or privately held, which can include internal documentation, source code, and more, data has value to both the company that owns it and to AI systems that may scrape from and train on it.
“Defaulting to block is important, as it allows organisations to reassert their control and ownership without having to take any explicit actions, providing some breathing room while they work out a mutually beneficial relationship with AI. We hope that this is just one of many changes we will see in the coming months to help organisations reassert ownership and control over their data.”
David Hollingworth has been writing about technology for over 20 years, and has worked for a range of print and online titles in his career. He is enjoying getting to grips with cyber security, especially when it lets him talk about Lego.