The most effective way to prevent bots from spamming your server is to drop them at the firewall. This is generally achieved using tools like Denyhosts or fail2ban, which monitor your logs, identify suspicious activity, and block the offending IP addresses before they cause harm.
Denyhosts works at the application level by adding entries to /etc/hosts.deny, whereas fail2ban operates at the firewall level using iptables, which makes it far more efficient.
However, on resource-constrained machines, fail2ban can still be taxing. A few years ago, we shared a demo of a lightweight log parser called banbylog, tailored specifically for our needs (SSH and WordPress activity monitoring) at a much lower resource cost. If that sounds like a good fit, feel free to check it out, but keep in mind that it's not production-ready!
While the consensus is to parse logs and block hostile IPs at the firewall, this does not work when your server sits behind Cloudflare!
Why can't we flag offending IPs with Cloudflare?
Because the IPs that are "attacking" your server are not the actual offending IPs, but Cloudflare machines that are proxying the requests to your server. Take a look at the diagram below:
Cloudflare acts as a middleman between your server and the users. The only IP addresses visible to your firewall are from Cloudflare, not from the original user.
In fact, you should explicitly whitelist Cloudflare's IP ranges. If you happen to block an IP that belongs to Cloudflare, legitimate users will see your site as down.
If the IP we see doesn't belong to the user making the request, what can we look for?
Every request Cloudflare sends to your server carries the user's original IP in the CF-Connecting-IP header. And that's what we can leverage to get them! Unfortunately, HTTP headers sit too far up the stack for iptables to read, so it's HAProxy to the rescue!
While iptables operates at Layer 4, HAProxy can operate at OSI Layer 7.
Layer 7 - Application; protocols (HTTP, ...)
Layer 6 - Presentation; character encoding (ASCII, UTF8, ...)
Layer 5 - Session; stick client to server
Layer 4 - Transport; protocols (TCP, UDP, ...)
Layer 3 - Network; routing protocols (IP)
Layer 2 - Data Link; physical to network (ARP, Ethernet)
Layer 1 - Physical; cabling, Wi-Fi
The easiest way to test HAProxy configurations is to boot a Docker instance of HAProxy:
The above assumes you're using PowerShell and are in the directory that contains haproxy.cfg. Change ${pwd} accordingly if not.
Below is a straightforward haproxy.cfg that makes HAProxy listen on port 80, log to stdout so you can get an instant glimpse of what's going on, and also print the CF-Connecting-IP header.
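The configuration file itself is not reproduced here, so the following is only a minimal sketch of what such a file could look like, not the original. It assumes HAProxy 2.2 or later (needed for `http-request return`) and answers every request itself because no backend exists yet; the timeouts and the 45-character capture length are illustrative choices.

```haproxy
global
    # send log lines to stdout so they show up in the container output
    log stdout format raw local0

defaults
    mode http
    log global
    option httplog
    # illustrative timeouts, tune to taste
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend main
    bind :80
    # capture the header so its value is included in each HTTP log line
    http-request capture req.hdr(CF-Connecting-IP) len 45
    # assumption: no backend yet, so HAProxy replies directly
    http-request return status 200 content-type "text/plain" string "ok"
```

With `option httplog`, captured request headers are printed between braces in the log line, which is where the CF-Connecting-IP value should appear.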
Let's make a test request to HAProxy.
I'm partial to Bruno, a portable and offline alternative to Postman. Download it, add the CF-Connecting-IP header and make a POST request to http://localhost:888/wp-login.php to test if everything's working.
If things went as planned, your HAProxy should have printed this:
In our particular case, bots are hitting wp-login.php, xmlrpc.php and xmrlpc.php (the last one is a typo on their side, but we've had more than 100k hits in the last 24h!). We also know that they're flooding the server with POST requests trying to brute-force passwords.
We now have a couple of options:
a) Now that we have the offending IPs in the log, we could change banbylog to write them to a file, and have HAProxy deny those requests. While this would work, HAProxy would need to be constantly reloaded.
b) Or we may simply leverage HAProxy stick tables and rate-limit the offending requests.
While "a" would allow us to ban the offending IP for an indefinite amount of time, "b" has fewer moving parts.
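For comparison, option "a" could be sketched roughly as follows. This is an assumption about how one might wire it up, not something we run: `banned.lst` is a hypothetical file (one IP or CIDR per line) that a log parser would keep up to date, and its path is a placeholder. The file must exist when HAProxy starts.

```haproxy
frontend main
    bind :80
    # hypothetical ban list, loaded when HAProxy starts or reloads
    acl is_banned req.hdr_ip(CF-Connecting-IP) -f /usr/local/etc/haproxy/banned.lst
    # refuse anything whose original (Cloudflare-reported) IP is on the list
    http-request deny deny_status 403 if is_banned
```

Because the file is read at load time, every change to it requires a reload, which is the drawback mentioned above. The rest of this article goes with option "b":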
frontend main
    # requests that will be monitored and blocked if abused
    acl is_wp_login path_end -i /wp-login.php /xmlrpc.php /xmrlpc.php
    acl is_post method POST

    # table that can store 100k IPs, entries expire after 1 minute
    stick-table type ip size 100k expire 1m store http_req_rate(1m)

    # we'll track (save to table) the original IP only if the request hits
    # one of the monitored paths with a POST request
    http-request track-sc0 hdr(CF-Connecting-IP) if is_wp_login is_post

    # we now query the stick-table and if the IP has made more than
    # 5 requests of the offending type in the last minute,
    # the current request is denied
    http-request deny if is_wp_login is_post { sc_http_req_rate(0) gt 5 }
HAProxy offers several ways to deny a request: tarpit, silent drop, reject or shadowban. A tarpit deny would be something like this:
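A minimal sketch of the tarpit variant, reusing the ACLs and stick table from the rate-limiting example. The 10-second `timeout tarpit` and the 429 status are illustrative assumptions, not values taken from our setup.

```haproxy
frontend main
    bind :80
    acl is_wp_login path_end -i /wp-login.php /xmlrpc.php /xmrlpc.php
    acl is_post method POST

    stick-table type ip size 100k expire 1m store http_req_rate(1m)
    http-request track-sc0 hdr(CF-Connecting-IP) if is_wp_login is_post

    # how long the connection is held open before HAProxy finally answers
    timeout tarpit 10s

    # same condition as the plain deny, but the bot is kept waiting first
    http-request tarpit deny_status 429 if is_wp_login is_post { sc_http_req_rate(0) gt 5 }
```

The idea is that the offending client is held for the tarpit duration before it gets the error, which slows a brute-force loop down instead of letting it retry immediately.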
Just for reference, here's the full (overly simplified) HAProxy config file that blocks requests if they hit one of the monitored URLs with more than 5 hits in less than a minute:
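The sketch below simply stitches together the pieces discussed in this article and should be read as an approximation rather than the exact file we use. The backend name `wordpress` and its address `127.0.0.1:8080` are placeholders, and the timeouts are illustrative.

```haproxy
global
    log stdout format raw local0

defaults
    mode http
    log global
    option httplog
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend main
    bind :80
    # log the original visitor IP reported by Cloudflare
    http-request capture req.hdr(CF-Connecting-IP) len 45

    # requests that will be monitored and blocked if abused
    acl is_wp_login path_end -i /wp-login.php /xmlrpc.php /xmrlpc.php
    acl is_post method POST

    # up to 100k IPs, entries expire after 1 minute
    stick-table type ip size 100k expire 1m store http_req_rate(1m)
    http-request track-sc0 hdr(CF-Connecting-IP) if is_wp_login is_post

    # more than 5 monitored POSTs in the last minute: deny
    http-request deny if is_wp_login is_post { sc_http_req_rate(0) gt 5 }

    default_backend wordpress

backend wordpress
    # placeholder address for the web server running WordPress
    server wp1 127.0.0.1:8080
```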
reCAPTCHA, obviously!
While this article focuses on the server side of things, the first thing you should do is protect your forms with reCAPTCHA, which you can easily do with a plugin such as Advanced Google reCAPTCHA by WebFactory.
EDIT: A user queried: «I have some WordPress installations under Cloudflare and some exposing the server directly. Can it work on both?»
Yes, it can. It's as simple as doing something like this:
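One way to sketch it, under the assumption that you keep a list of Cloudflare's published IP ranges in a file: `cloudflare-ips.lst` and its path are hypothetical, and the file must exist when HAProxy starts. The header is trusted only when the connection really comes from Cloudflare, since anyone connecting directly could forge CF-Connecting-IP.

```haproxy
frontend main
    bind :80
    acl is_wp_login path_end -i /wp-login.php /xmlrpc.php /xmrlpc.php
    acl is_post method POST
    # hypothetical file with Cloudflare's ranges, one CIDR per line
    acl from_cf src -f /usr/local/etc/haproxy/cloudflare-ips.lst

    stick-table type ip size 100k expire 1m store http_req_rate(1m)

    # proxied by Cloudflare: track the visitor IP taken from the header
    http-request track-sc0 hdr(CF-Connecting-IP) if is_wp_login is_post from_cf
    # direct traffic: track the TCP source address instead
    http-request track-sc0 src if is_wp_login is_post !from_cf

    http-request deny if is_wp_login is_post { sc_http_req_rate(0) gt 5 }
```

Both kinds of traffic end up in the same stick table, so the single deny rule covers sites behind Cloudflare and sites exposed directly.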