In the age of artificial intelligence, content scraping has taken on a whole new dimension. AI companies and rogue actors deploy bots to crawl websites and harvest valuable content—often without permission. If you’re a content creator, blogger, or business running a WordPress site, this could mean your original work is being used to train large language models without your consent.
The good news? If you use Cloudflare, you have powerful tools at your disposal to stop this.
This replaces the previous straightforward toggle that appeared under “Bots” in the old dashboard.
These settings are still valid and help protect against general automation:
If you’re not seeing these at all, it may be that you’re already using the updated Security → Settings → Bot traffic path.
Many AI-related crawlers identify themselves via their user agents. You can block these manually.
To create a firewall rule:
Block AI Scrapers
(http.user_agent contains "Bytespider") or
(http.user_agent contains "ChatGPT-User") or
(lower(http.user_agent) contains "openai") or
(http.user_agent contains "ClaudeBot") or
(lower(http.user_agent) contains "anthropic") or
(http.user_agent contains "Amazonbot") or
(http.user_agent contains "GPTBot") or
(http.user_agent contains "AIEngine")
You can update this list as new AI-related bots appear. Check their documentation or server logs for more identifiers.
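If the list of identifiers grows, it may be easier to generate the expression than to edit it by hand. The following is a minimal sketch, not an official Cloudflare tool: the `build_expression` helper and the `AI_BOT_TOKENS` list are hypothetical names introduced here for illustration. Because the `contains` operator is case-sensitive, the sketch lowercases both the token and the user agent (via the `lower()` function) so that capitalization differences do not cause missed matches.

```python
# Hypothetical helper: turns a list of user-agent tokens into a
# Cloudflare rule expression that can be pasted into the expression editor.
AI_BOT_TOKENS = [
    "Bytespider", "ChatGPT-User", "OpenAI", "ClaudeBot",
    "Anthropic", "Amazonbot", "GPTBot", "AIEngine",
]


def build_expression(tokens):
    """Return an expression matching any user agent that contains a token."""
    clauses = []
    seen = set()
    for token in tokens:
        normalized = token.strip().lower()
        # Skip blanks and duplicates so the rule stays short.
        if not normalized or normalized in seen:
            continue
        # Refuse characters that would need escaping inside the quoted string.
        if '"' in normalized or "\\" in normalized:
            raise ValueError(f"unsupported character in token: {token!r}")
        seen.add(normalized)
        clauses.append(f'(lower(http.user_agent) contains "{normalized}")')
    return " or\n".join(clauses)


if __name__ == "__main__":
    print(build_expression(AI_BOT_TOKENS))
```

Adding a new identifier then means appending one string to the list and pasting the regenerated expression into the rule.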
If you want to allow only known and legitimate bots (like Googlebot or Bingbot):
This allows Google to crawl your site while keeping AI scrapers out.
Some scrapers don’t identify themselves and use cloud servers (AWS, Azure, etc.) to crawl your site.
Note: IPs can change frequently. Use with care.
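Before blocking by network, it can help to check how much of your traffic actually comes from cloud ranges. The sketch below uses Python's standard `ipaddress` module to test whether an address from your logs falls inside a set of CIDR ranges. The `EXAMPLE_CLOUD_RANGES` table and the `find_provider` helper are hypothetical, and the ranges shown are reserved documentation blocks used as placeholders; you would need to substitute ranges published by the providers themselves and refresh them regularly.

```python
import ipaddress

# Placeholder ranges (reserved documentation blocks), NOT real provider ranges.
# Replace with current ranges published by each cloud provider.
EXAMPLE_CLOUD_RANGES = {
    "example-cloud-a": ["192.0.2.0/24", "2001:db8::/32"],
    "example-cloud-b": ["198.51.100.0/24"],
}


def find_provider(ip, ranges=EXAMPLE_CLOUD_RANGES):
    """Return the provider label whose range contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for provider, cidrs in ranges.items():
        for cidr in cidrs:
            network = ipaddress.ip_network(cidr)
            # Only compare addresses and networks of the same IP version.
            if addr.version == network.version and addr in network:
                return provider
    return None


if __name__ == "__main__":
    for ip in ["192.0.2.15", "203.0.113.9", "2001:db8::1"]:
        print(ip, find_provider(ip))
```

An address that returns `None` simply is not in the ranges you loaded; it says nothing about whether the visitor is a bot.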
robots.txt File to Disallow Scraping (Not Foolproof)
You can signal to bots that they shouldn’t scrape your site:
User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: *
Disallow: /private/
However, rogue bots can and do ignore robots.txt.
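To confirm the rules say what you intend before publishing them, you can parse the file locally. This is a small sketch using Python's standard `urllib.robotparser`; the `load_rules` helper and the `ExampleBot` agent are hypothetical names used only for illustration. It shows how a compliant crawler would interpret the rules, not what a rogue bot will do.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the robots.txt example above.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def load_rules(text):
    """Parse robots.txt content from a string and return the parser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

if __name__ == "__main__":
    checks = [
        ("GPTBot", "/"),
        ("ClaudeBot", "/blog/post"),
        ("ExampleBot", "/blog/post"),      # hypothetical well-behaved bot
        ("ExampleBot", "/private/notes"),
    ]
    for agent, path in checks:
        verdict = "allowed" if rules.can_fetch(agent, path) else "disallowed"
        print(f"{agent:12} {path:16} {verdict}")
```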
Protecting your WordPress content from AI bots is essential in today’s digital environment. While you can’t stop every scraper, Cloudflare gives you powerful tools to reduce the risk dramatically. By combining Bot Fight Mode, custom firewall rules, and good monitoring habits, you can keep your content safe, reduce server strain, and retain control over how your work is used.
Want help securing your WordPress site further?
Drop a comment or get in touch—we’re here to help.