WP managers lose sleep over exploits. Not sleeping is the only way they can be sure of never waking up to discover that their site has been cracked, and is now serving up malware, scraping users' credentials, or acting as part of a vast Bitcoin-mining botnet. Patch everything, often.
There are also a lot of security plugins for WordPress, but I figured we ought to have something at a lower level, and my favourite first-line tool is fail2ban. You set up pattern-matching expressions, and when lines in the log files match those patterns, fail2ban adds rules to the iptables ruleset to block the offending host.
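To make the mechanism concrete, here is a rough Python sketch of the matching step. It is not fail2ban's own code, and the pattern is a hypothetical one (POSTs to wp-login.php or xmlrpc.php in a combined-format access log), not the rule I ended up using. fail2ban's `<HOST>` placeholder is real; the function names and the IPv4-only substitution are assumptions for illustration, and the time window that fail2ban applies (findtime) is ignored.

```python
import re
from collections import Counter

# Hypothetical pattern, NOT the rule from this post: POST requests to
# wp-login.php or xmlrpc.php in a combined-format access log.
FAILREGEX = r'^<HOST> \S+ \S+ \[[^\]]+\] "POST /(?:wp-login|xmlrpc)\.php'


def compile_failregex(failregex):
    """Swap fail2ban's <HOST> placeholder for a named group (IPv4 only here)."""
    host_group = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"
    return re.compile(failregex.replace("<HOST>", host_group))


def hosts_to_ban(lines, failregex=FAILREGEX, maxretry=3):
    """Return the hosts with at least `maxretry` matching log lines.

    Simplification: every line counts, regardless of when it was logged.
    """
    pattern = compile_failregex(failregex)
    hits = Counter()
    for line in lines:
        match = pattern.search(line)
        if match:
            hits[match.group("host")] += 1
    return sorted(host for host, count in hits.items() if count >= maxretry)
```

In the real tool, each host returned here would get an iptables rule for the duration of the ban.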
After watching the log files and seeing the server slow down as WP tried to process hundreds of invalid requests, I figured out a rule that seemed to match most of them. My suspicion was that a lot of the WP exploit attempts used a kit, and that kit had a fairly clear signature. So along with the other handy rules in my fail2ban config, I added my rule too.
One of the outputs of fail2ban's logs is the IP address of each banned host. So I thought it might be nice to geocode them via the GeoIP database and see where they have all been coming from. "China" and "Russia" are the answers that most people seem to give when you ask them to speculate on the source of these attacks. Are they right?
So first, I took the log files that I had and extracted the IP address and timestamp of each ban. Then, using the Python GeoIP module, I translated all the IP addresses to lat-long and country code. That gave me about 1200 locations from one month of retained log files.
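The extraction step might look something like the sketch below. It assumes the ban lines look like `2014-03-10 12:34:56,789 fail2ban.actions: WARNING [wordpress] Ban 192.0.2.10`, which can vary between fail2ban versions. The geocoding is left as a `lookup` function you pass in, so that any GeoIP binding can be plugged in; the dict keys `latitude`, `longitude` and `country_code` are an assumption about what that lookup returns, and the function names are made up for this example.

```python
import re
from datetime import datetime

# Assumed log layout: "<date> <time>,<ms> fail2ban.actions ... [jail] Ban <ip>"
BAN_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ "
    r"fail2ban\.actions.*?\[(?P<jail>[^\]]+)\] Ban (?P<ip>\S+)"
)


def parse_bans(lines):
    """Return (timestamp, ip) pairs for every Ban line; Unban lines are skipped."""
    bans = []
    for line in lines:
        match = BAN_LINE.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            bans.append((ts, match.group("ip")))
    return bans


def geocode_bans(bans, lookup):
    """Attach location data to each ban.

    `lookup(ip)` is supplied by the caller and is assumed to return a dict
    with 'latitude', 'longitude' and 'country_code', or None if unknown.
    """
    rows = []
    for ts, ip in bans:
        record = lookup(ip)
        if record is None:
            continue  # address not found in the GeoIP database
        rows.append({
            "timestamp": ts.isoformat(),
            "ip": ip,
            "lat": record["latitude"],
            "lon": record["longitude"],
            "country": record["country_code"],
        })
    return rows
```

Addresses that the lookup cannot place are dropped, so the number of locations can be a little lower than the number of bans.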
Here's a table of the number of bans for the top few countries.
So the USA is clearly the big trouble here, with China coming in way down. Of course that's not to say all these US PCs aren't being controlled by Chinese or Russian botnets.
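Counting bans per country is a one-liner with `collections.Counter`. A minimal sketch, assuming each geocoded ban is a dict with a `country` key (that layout and the function name are just for illustration):

```python
from collections import Counter


def top_countries(rows, n=10):
    """Return the n most frequent country codes as (code, count) pairs."""
    counts = Counter(row["country"] for row in rows)
    return counts.most_common(n)
```

Bear in mind this counts bans, not distinct hosts, so one persistent address that gets banned repeatedly will push its country up the table.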
Now that we have lat-long, we can save all this as a shapefile, load it into QGIS, and plot it on an OpenStreetMap background.
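If you don't have a shapefile library to hand, GeoJSON is an alternative that needs only the standard library, and QGIS will load it as a vector layer. The sketch below swaps GeoJSON in for the shapefile, so it is not exactly what I did; the row layout (`lat`, `lon`, `ip`, `country`, `timestamp`) and the function names are assumptions for the example.

```python
import json


def to_geojson(rows):
    """Build a GeoJSON FeatureCollection of points from geocoded ban rows."""
    features = []
    for row in rows:
        features.append({
            "type": "Feature",
            # GeoJSON puts longitude first, then latitude.
            "geometry": {"type": "Point", "coordinates": [row["lon"], row["lat"]]},
            "properties": {
                "ip": row["ip"],
                "country": row["country"],
                "timestamp": row["timestamp"],  # expected to be a string
            },
        })
    return {"type": "FeatureCollection", "features": features}


def write_geojson(rows, path):
    """Write the rows to `path` as GeoJSON."""
    with open(path, "w") as handle:
        json.dump(to_geojson(rows), handle)
```

Keeping the timestamp as an attribute means it is still there later if you want to filter or animate by time.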
First I'd like to thank Australia and New Zealand for not bothering to try and hack our server. Much appreciated. Let's look east first:
Quite a good representation here, including Iran and most of south-east Asia. I don't know why Vietnam scored so highly in the table. Let's look at Europe:
I don't know if there's any value in doing any more analysis of this particular data set, but it is at least handy to reverse some of those prejudices of people who say all the cyber attacks come from China or Russia. I've not used the timestamps of the data here, so it could be possible to create an animation of attack points from the data. If you'd like a copy of the data, get in touch.
I found another monthly tranche of fail logs. This looks very different, and we can point a finger at the Russians. Here's the top table:
I had a quick play with some of QGIS' plotting functions, and discovered that if I used an SVG symbol with a few dots on it, and used the data-driven symbology to randomly rotate it, and set the opacity to something fairly small, I could get a much better impression of the density, including where overlapping points create hotspots. There's probably a density estimation plugin somewhere for QGIS, but until then, or until I can load the data into R and do a proper kernel-density estimation, this will have to do.
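For the record, a kernel-density estimate is not much code even without R. Here is a rough pure-Python sketch using a Gaussian kernel evaluated on a regular grid. It treats longitude and latitude as flat x and y, which is only a crude approximation over large areas, and the function name, grid layout and bandwidth parameter are all assumptions for the example rather than anything I ran on this data.

```python
import math


def kde_grid(points, lon_range, lat_range, nx, ny, bandwidth):
    """Gaussian kernel-density estimate on an nx-by-ny grid.

    points    : list of (lon, lat) pairs
    lon_range : (min_lon, max_lon) covered by the grid
    lat_range : (min_lat, max_lat) covered by the grid
    bandwidth : kernel standard deviation, in degrees
    Returns ny rows of nx density values, evaluated at cell centres.
    """
    if not points:
        raise ValueError("need at least one point")
    lon0, lon1 = lon_range
    lat0, lat1 = lat_range
    # Normalising constant for a 2D Gaussian, averaged over all points.
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    grid = []
    for j in range(ny):
        lat = lat0 + (lat1 - lat0) * (j + 0.5) / ny
        row = []
        for i in range(nx):
            lon = lon0 + (lon1 - lon0) * (i + 0.5) / nx
            total = 0.0
            for plon, plat in points:
                d2 = (lon - plon) ** 2 + (lat - plat) ** 2
                total += math.exp(-d2 / (2.0 * bandwidth ** 2))
            row.append(total * norm)
        grid.append(row)
    return grid
```

This is a plain loop over every cell and every point, which is fine for a thousand or so locations but would want something smarter for much bigger data sets.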