Website security has become increasingly critical, and one of the most effective and accessible options comes from Cloudflare’s Web Application Firewall (WAF), which can dramatically reduce unwanted traffic and potential security threats.

Cloudflare has established itself as one of the internet’s most important infrastructure companies, providing content delivery, DDoS protection, and security services to millions of websites. Their services sit between your web server and your visitors, analyzing and filtering traffic before it ever reaches your site’s hosting infrastructure. What makes Cloudflare particularly attractive is their generous free plan, which includes access to their WAF features. This means even individuals and small businesses can implement enterprise-grade security without spending a dime! The free plan provides more than enough functionality for most websites, including the ability to create custom firewall rules that intelligently handle different types of traffic based on sophisticated criteria.

A Note About Enterprise Cloudflare

Here’s an ironic twist that catches many people off guard: if you’re using a managed WordPress hosting provider with an Enterprise Cloudflare plan, you might actually have less control than someone on the free tier. Popular managed hosts like Kinsta, Rocket.net, and Cloudways bundle Enterprise Cloudflare into their offerings, and typically don’t grant customers direct access to configure custom WAF rules because they manage the Cloudflare integration on your behalf. This means the sophisticated rule sets we’re about to discuss won’t be available to you unless your host specifically enables custom WAF access or you switch to managing Cloudflare directly for your website. If security customization is important to you, you may need to use Cloudflare’s free or paid plans independently rather than through a managed host’s integration.

The SKIP Rule: Letting Good Bots Through

The first rule in any well-designed firewall configuration should always allow legitimate traffic to pass through without interference. This SKIP rule specifically targets verified bot traffic that serves important purposes for your website. Search engine crawlers from Google, Bing, and other search providers need unfettered access to index your content. Monitoring services need to check your site’s uptime and performance. Marketing and analytics tools require consistent access to track your campaigns. The rule also includes a specific provision for Let’s Encrypt, the free SSL certificate provider, ensuring that automated certificate validation can complete successfully. By explicitly allowing these categories of verified bots, you ensure that essential services continue to function while your other security rules handle potentially problematic traffic.

To add this rule, open your domain’s Security Custom Rules, click Create Rule, give it a name such as Good Bots Rule, then click Edit Expression and paste in this rule expression:

(cf.verified_bot_category in {"Search Engine Crawler" "Search Engine Optimization" "Monitoring & Analytics" "Advertising & Marketing" "Page Preview" "Academic Research" "Security" "Accessibility" "Webhooks" "Feed Fetcher"}) or (http.user_agent contains "letsencrypt" and http.request.uri.path contains "acme-challenge")

On many of our websites we use the ExactDN (EWWW) CDN, Wordfence, and ManageWP with its Site24x7 site monitoring. We also allow Facebook’s user agents and other common services like GTMetrix, Lighthouse, and CloudflareObservatory, so we customize this rule by appending the following to the end:

or (http.user_agent contains "Wordfence") or (http.user_agent contains "ExactDN") or (http.user_agent contains "ewww") or (http.user_agent contains "ManageWP") or (http.user_agent contains "Site24x7") or (http.user_agent contains "facebookexternalhit") or (http.user_agent contains "meta-externalagent") or (http.user_agent contains "GTMetrix") or (http.user_agent contains "Lighthouse") or (http.user_agent contains "CloudflareObservatory")

You can also add known trusted user agents, or the IP addresses of your web server and other trusted hosts (separated by spaces), by appending one or more lines like these:

or (http.user_agent contains "new-user-agent")

or (ip.src in {1.1.1.1 8.8.8.8})
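Because these appended clauses follow a fixed pattern, they can be generated instead of typed by hand. The Python sketch below is illustrative only: the function name `build_allow_clauses` and the sample values are hypothetical, and it does nothing more than format strings in the same shape as the two lines above, using straight double quotes so the result pastes cleanly into the expression editor.

```python
def build_allow_clauses(user_agents, ip_addresses):
    """Return text to append to the end of the Good Bots expression.

    Hypothetical helper: it only formats strings in the same shape as the
    hand-written clauses and never contacts Cloudflare.
    """
    clauses = []
    for agent in user_agents:
        if '"' in agent:
            raise ValueError(f"user agent must not contain a double quote: {agent!r}")
        clauses.append(f'(http.user_agent contains "{agent}")')
    if ip_addresses:
        # Addresses inside the braces are separated by spaces, not commas.
        clauses.append("(ip.src in {" + " ".join(ip_addresses) + "})")
    # Every clause is prefixed with " or " so the result can be appended as-is.
    return "".join(f" or {clause}" for clause in clauses)


if __name__ == "__main__":
    print(build_allow_clauses(["new-user-agent"], ["1.1.1.1", "8.8.8.8"]))
```

The returned text starts with " or ", so it can be pasted directly after the last closing parenthesis of the existing expression.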

Then, at the bottom under “Then take action…”, choose the Skip action, check all the boxes so that every other section is skipped, and set Place At to First. It should look like this:

Good Bots Rule Screenshot

The Managed Challenge Rule: Questioning Suspicious Visitors

The second rule implements Cloudflare’s Managed Challenge for traffic that falls into a gray area. Rather than outright blocking these visitors, Cloudflare presents them with an automated challenge that legitimate browsers can usually pass invisibly while stopping most automated scrapers and bots. This rule casts a wide net, targeting several specific scenarios. It challenges known aggressive crawlers like Yandex, Baidu, SEMrush, and Ahrefs, which can consume significant server resources. It also flags any user agent claiming to be a bot but lacking Cloudflare’s verification, catching many impersonators. The rule extends to traffic from specific cloud hosting providers that often harbor scrapers, non-US traffic that isn’t from verified sources, and specific networks known for hosting questionable automation tools. Finally, it challenges anyone attempting to access WordPress login pages, adding an extra layer of protection against brute force attacks without completely blocking legitimate administrators.

If you actively use one of the services listed in the default Managed Challenge expression, remove its entire clause along with the or that follows it. For SEMrush, for example, you would remove this portion from the expression:

(http.user_agent contains "semrush") or
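Removing a clause by hand from a very long expression makes it easy to leave a stray or behind. As a rough sketch, the Python below splits an expression on each top-level or (ignoring any or inside parentheses or quotes) and drops one user-agent clause. The function names are hypothetical, and the sample expression is a shortened stand-in for the full rule.

```python
def split_top_level_or(expression):
    """Split on ' or ' that sits outside parentheses and double quotes."""
    parts = []
    depth = 0
    in_quotes = False
    start = 0
    i = 0
    while i < len(expression):
        ch = expression[i]
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif depth == 0 and expression.startswith(" or ", i):
                parts.append(expression[start:i])
                i += len(" or ")
                start = i
                continue
        i += 1
    parts.append(expression[start:])
    return parts


def remove_user_agent(expression, agent):
    """Drop the clause that matches exactly one user-agent substring."""
    target = f'(http.user_agent contains "{agent}")'
    kept = [part for part in split_top_level_or(expression) if part.strip() != target]
    return " or ".join(kept)


if __name__ == "__main__":
    sample = (
        '(http.user_agent contains "sogou") or '
        '(http.user_agent contains "semrush") or '
        '(http.user_agent contains "crawl" and not cf.client.bot)'
    )
    print(remove_user_agent(sample, "semrush"))
```

Only clauses that consist of nothing but the named user-agent test are removed; compound clauses such as the "crawl" one are left untouched.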

Here is the default rule to paste in, and then edit as needed. Click on Create Rule, give it a name such as Managed Challenge Rule, then click Edit Expression, and paste/edit this rule expression:

(http.user_agent contains "yandex") or (http.user_agent contains "sogou") or (http.user_agent contains "semrush") or (http.user_agent contains "ahrefs") or (http.user_agent contains "baidu") or (http.user_agent contains "python-requests") or (http.user_agent contains "neevabot") or (http.user_agent contains "CF-UC") or (http.user_agent contains "sitelock") or (http.user_agent contains "crawl" and not cf.client.bot) or (http.user_agent contains "bot" and not cf.client.bot) or (http.user_agent contains "Bot" and not cf.client.bot) or (http.user_agent contains "Crawl" and not cf.client.bot) or (http.user_agent contains "spider" and not cf.client.bot) or (http.user_agent contains "mj12bot") or (http.user_agent contains "ZoominfoBot") or (http.user_agent contains "mojeek") or (ip.src.asnum in {135061 23724 4808} and http.user_agent contains "siteaudit") or (ip.src.asnum in {7224 16509 14618 8075 396982} and not cf.client.bot and not cf.verified_bot_category in {"Search Engine Crawler" "Search Engine Optimization" "Monitoring & Analytics" "Advertising & Marketing" "Page Preview" "Academic Research" "Security" "Accessibility" "Webhooks" "Feed Fetcher" "Aggregator"}) or (not ip.src.country in {"US"} and not cf.client.bot and not cf.verified_bot_category in {"Search Engine Crawler" "Search Engine Optimization" "Monitoring & Analytics" "Advertising & Marketing" "Page Preview" "Academic Research" "Security" "Accessibility" "Webhooks" "Feed Fetcher" "Aggregator"} and not http.request.uri.path contains "acme-challenge" and not http.request.uri.query contains "fbclid" and not ip.src.asnum in {32934}) or (ip.src.asnum in {60068 9009 16247 51332 212238 131199 22298 29761 62639 206150 210277 46562 8100 3214 206092 206074 206164 213074}) or (http.request.uri.path contains "wp-login")

Then choose Action: Managed Challenge, set Place At to Custom, and place the rule after the Good Bots rule. It should look like this:

Managed Challenge Rule Screenshot

The Block Rule: Stopping Malicious Traffic Cold

The third rule takes no chances and immediately blocks traffic that has demonstrated malicious intent or originates from networks with poor reputations. This includes an extensive list of autonomous system numbers associated with hosting providers, VPN services, and networks frequently used for attacks, scraping, or other abusive behavior. The rule also blocks access to sensitive WordPress files that should never be accessed directly, including xmlrpc (often exploited for amplification attacks), wp-config (which contains database credentials), and wlwmanifest (a legacy Windows Live Writer file). Additionally, it blocks every bot in Cloudflare’s verified “AI Crawler” category, as well as the catch-all “Other” bot category. Finally, the rule blocks traffic from Tor exit nodes (the T1 country code in the expression), which, while important for privacy, are frequently abused for malicious purposes.

Click on Create Rule, give it a name such as Block Rule, then click Edit Expression, and paste in this rule expression:

(ip.src.asnum in {200373 198571 26496 31815 18450 398101 50673 7393 14061 205544 199610 21501 16125 51540 264649 39020 30083 35540 55293 36943 32244 6724 63949 7203 201924 30633 208046 36352 25264 32475 23033 212047 31898 210920 211252 16276 23470 136907 12876 210558 132203 61317 212238 37963 13238 2639 20473 63018 395954 19437 207990 27411 53667 27176 396507 206575 20454 51167 60781 62240 398493 206092 63023 213230 26347 20738 45102 24940 57523 8100 8560 6939 14178 46606 197540 397630 9009 11878}) or (http.request.uri.path contains "xmlrpc") or (http.request.uri.path contains "wp-config") or (http.request.uri.path contains "wlwmanifest") or (cf.verified_bot_category in {"AI Crawler" "Other"}) or (ip.src.country in {"T1"})

If this rule is too restrictive or breaks something, you can change the action from Block to Managed Challenge and see if that helps. You can also remove the xmlrpc clause if you use a plugin like Jetpack or a service that still relies on xmlrpc. As with the previous rule, choose the action (here, Block) and place the rule after the Managed Challenge rule.

Implementing and Maintaining Your Rules

Once you’ve added these rules in your Cloudflare dashboard, you’ll want to monitor their impact over time. Cloudflare provides analytics and an event log of the requests your rules acted on, showing how many requests each rule affects, so you can fine-tune your approach. You might discover that certain autonomous system numbers in your block list are causing false positives for legitimate visitors, or you may identify new threats that should be added. Security is never a set-it-and-forget-it proposition.
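When tuning ASN lists, it helps to see which numbers a rule actually contains. ASNs such as 9009 and 8100, for instance, appear in both the Managed Challenge and Block expressions above. The following Python sketch, using hypothetical helper names and shortened sample expressions, extracts the ASN lists from an expression, reports the overlap between two rules, and removes a single ASN that turns out to cause false positives.

```python
import re

# Matches every 'ip.src.asnum in {...}' list, including negated ones.
ASN_LIST = re.compile(r"ip\.src\.asnum in \{([^}]*)\}")


def asns_in(expression):
    """Collect every ASN that appears in an 'ip.src.asnum in {...}' list."""
    found = set()
    for match in ASN_LIST.finditer(expression):
        found.update(int(token) for token in match.group(1).split())
    return found


def drop_asn(expression, asn):
    """Return the expression with one ASN removed from every ASN list."""
    def rewrite(match):
        kept = [token for token in match.group(1).split() if int(token) != asn]
        if not kept:
            raise ValueError("removing this ASN would leave an empty list")
        return "ip.src.asnum in {" + " ".join(kept) + "}"
    return ASN_LIST.sub(rewrite, expression)


if __name__ == "__main__":
    # Shortened sample expressions, not the full rules from this article.
    challenge = "(ip.src.asnum in {60068 9009 8100})"
    block = '(ip.src.asnum in {8100 9009 11878}) or (http.request.uri.path contains "xmlrpc")'
    print(sorted(asns_in(challenge) & asns_in(block)))
    print(drop_asn(block, 9009))
```

An ASN that sits in both rules is challenged first and then blocked anyway, so an overlap report like this is a quick way to decide which list a network really belongs in.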

Cloudflare evaluates rules in order, so having your SKIP rule first ensures good traffic bypasses the more complex evaluation logic in subsequent rules. This three-rule structure also makes it much easier to understand what’s happening at a glance and to make targeted adjustments when needed.
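To see why the order matters, here is a deliberately simplified Python model of ordered, first-match evaluation. It is not Cloudflare’s engine: the `Rule` class, the `evaluate` function, and the toy match conditions are all assumptions made for illustration, and each rule is reduced to a single check.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Request = Dict[str, object]


@dataclass
class Rule:
    name: str
    matches: Callable[[Request], bool]
    action: str  # "skip", "managed_challenge", or "block"


def evaluate(rules: List[Rule], request: Request) -> str:
    """Return the action of the first matching rule (toy model only)."""
    for rule in rules:
        if rule.matches(request):
            # In this model a match ends evaluation, so later rules never run.
            return f"{rule.action} ({rule.name})"
    return "allow (no rule matched)"


# Toy stand-ins for the three rules, in the order described in this article.
RULES = [
    Rule("Good Bots Rule", lambda r: bool(r.get("verified_bot")), "skip"),
    Rule("Managed Challenge Rule", lambda r: "wp-login" in str(r.get("path", "")), "managed_challenge"),
    Rule("Block Rule", lambda r: "xmlrpc" in str(r.get("path", "")), "block"),
]

if __name__ == "__main__":
    print(evaluate(RULES, {"verified_bot": True, "path": "/xmlrpc.php"}))
    print(evaluate(RULES, {"path": "/wp-login.php"}))
    print(evaluate(RULES, {"path": "/xmlrpc.php"}))
    print(evaluate(RULES, {"path": "/"}))
```

In this model a verified bot requesting an xmlrpc path is skipped before the Block rule is ever consulted, which is exactly the effect of placing the SKIP rule first.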

The real beauty of this approach is its scalability. Whether you’re protecting a single website or managing dozens of client sites, you can deploy these same three rules across all of them, then customize as needed for specific situations. Some sites might need more lenient geographic restrictions, while others might require additional ASN blocks based on their specific threat landscape. By starting with this solid foundation, you’re giving every site robust protection that adapts to the evolving threat environment while ensuring legitimate visitors and essential services can always get through.
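For anyone scripting this across many zones, the three rules can be described as a single JSON payload. The Python sketch below only builds that payload and the request URL; it sends nothing. The endpoint path, the action names, and the `action_parameters` value reflect our understanding of Cloudflare’s Rulesets API and should be checked against the current API documentation before use. The zone ID and the expression strings are placeholders. Note also that `{"ruleset": "current"}` is assumed to skip only the remaining custom rules, so it does not reproduce checking every box in the dashboard.

```python
import json

API_BASE = "https://api.cloudflare.com/client/v4"
# Assumed phase name for zone-level custom rules.
PHASE = "http_request_firewall_custom"


def entrypoint_url(zone_id):
    """Build the (assumed) entry point ruleset URL for one zone."""
    return f"{API_BASE}/zones/{zone_id}/rulesets/phases/{PHASE}/entrypoint"


def build_payload(good_bots_expr, challenge_expr, block_expr):
    """Describe the three rules in evaluation order: skip, challenge, block."""
    return {
        "rules": [
            {
                "description": "Good Bots Rule",
                "expression": good_bots_expr,
                "action": "skip",
                # Assumption: skips the remaining rules in this ruleset only.
                "action_parameters": {"ruleset": "current"},
            },
            {
                "description": "Managed Challenge Rule",
                "expression": challenge_expr,
                "action": "managed_challenge",
            },
            {
                "description": "Block Rule",
                "expression": block_expr,
                "action": "block",
            },
        ]
    }


if __name__ == "__main__":
    # Placeholder expressions; paste the full expressions in real use.
    payload = build_payload(
        "(cf.client.bot)",
        '(http.request.uri.path contains "wp-login")',
        '(http.request.uri.path contains "xmlrpc")',
    )
    print(entrypoint_url("YOUR_ZONE_ID"))
    print(json.dumps(payload, indent=2))
```

As we understand it, sending this payload with a PUT request replaces whatever custom rules the zone already has, so export the existing rules first if any of them need to be kept.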

If you want a simpler way to create these rules, or need to create them in bulk across many domain names at once, be sure to use our Cloudflare WAF Rule plugin from our 5 Star Plugins brand. We plan on offering a free and a premium version, with the premium version offering enhanced security and agency-focused features.

Please note that these rules were created and polished over many months across more than 100 production websites that we host and manage for our clients, some with more customizations than others. We hope they serve you well, as long as you customize them properly for your own needs. Also, a shout-out to Troy Glancy, whose Cloudflare WAF Rules inspired parts of these rules.

Cloudflare® is a registered trademark of its respective owner. This plugin is an independent product and is not affiliated with, endorsed by, or sponsored by Cloudflare. All trademarks are used under nominative fair use to describe compatibility and functionality only. No endorsement is implied.
