Shortcomings in the various systems we interact with at work, and my frustration with the spam problem as one of the moderation team on the Mumble forums, led me to think about HTTP blacklists, and specifically how difficult a problem it is. Things improved quite a bit on the Mumble forums once I signed up for API keys with several of these blacklists so that they would accept our reports. We're now back down to a comfortable level of spam attempts (around 20 to 30 a day), rather than the several hundred per day we were dealing with.
But maybe there's a way I can track this sort of stuff myself, right? After all, I've been logging the IP addresses involved in the infections of WordPress sites for a few weeks now, so why not automate the process somewhat? Another motivator for beginning a new project was that it's been literally years since I built anything in Flask, and I was itching to do so. So let's have a crack!
Using a spare VM I had lying around, I smashed together a quick reporting database, which is at its core a simple CRUD (without the D) application. When activity comes in, I put it in the DB and attach a score to it, along with an expiry date that defaults to 30 days and a note as to why the record was entered in the first place. Then I added multi-user support, so that it records who added each entry.
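To make that record layout concrete, here's a minimal sketch using SQLite from the Python standard library. The table name, column names and the `add_report` helper are all made up for illustration; the real app may well store things differently.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

DEFAULT_EXPIRY_DAYS = 30  # the default expiry described above

# Hypothetical schema: one row per reported event.
SCHEMA = """
CREATE TABLE IF NOT EXISTS reports (
    id         INTEGER PRIMARY KEY,
    ip         TEXT    NOT NULL,
    score      INTEGER NOT NULL,
    note       TEXT    NOT NULL,   -- why the record was entered
    reporter   TEXT    NOT NULL,   -- which user added it
    created_at TEXT    NOT NULL,
    expires_at TEXT    NOT NULL
)
"""


def add_report(conn, ip, score, note, reporter, now=None, days=DEFAULT_EXPIRY_DAYS):
    """Insert one report and return its expiry timestamp (ISO 8601, UTC)."""
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(days=days)
    conn.execute(
        "INSERT INTO reports (ip, score, note, reporter, created_at, expires_at) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (ip, score, note, reporter, now.isoformat(), expires.isoformat()),
    )
    conn.commit()
    return expires.isoformat()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    print(add_report(conn, "192.0.2.10", 20, "SSH auth failures", "alice"))
```

There's no delete path here, which matches the "CRUD without the D" idea: entries simply stop counting once they expire.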
Next, I wrote an API for generating blocklists. The basic idea is that we assign, say, 20 points to an SSH failure (which in my scanning is indicated by "maximum authentication failures" log messages, so 20 points probably comes after three unsuccessful logins). The default threshold for blacklisting is 50 points, so three strikes on that (network-wide) and you're out. These thresholds will probably need fine-tuning at some point.
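The scoring arithmetic can be sketched in a few lines of Python. This is only an illustration: the `build_blocklist` function and the tuple layout are hypothetical, and I'm assuming an address is listed once its unexpired points reach the threshold (50 or more), so two 20-point strikes (40) pass and the third (60) does not.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

BLOCK_THRESHOLD = 50  # default blacklisting threshold from the text


def build_blocklist(reports, now, threshold=BLOCK_THRESHOLD):
    """Return a sorted list of IPs whose unexpired scores sum to >= threshold.

    `reports` is an iterable of (ip, score, expires_at) tuples, where
    expires_at is a timezone-aware datetime.
    """
    totals = defaultdict(int)
    for ip, score, expires_at in reports:
        if expires_at > now:  # expired entries no longer count
            totals[ip] += score
    return sorted(ip for ip, total in totals.items() if total >= threshold)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    later = now + timedelta(days=30)
    sample = [
        ("192.0.2.10", 20, later),
        ("192.0.2.10", 20, later),
        ("192.0.2.10", 20, later),   # third strike: 60 points
        ("198.51.100.7", 20, later),
        ("198.51.100.7", 20, later), # only 40 points
    ]
    print(build_blocklist(sample, now))
```

Because the totals are summed across every report for an address, it doesn't matter which host on the network saw the failure, which is the "network-wide" part.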
Finally, I wrote some scripts to parse a couple of different log files and check for certain events. Initially, these just dumped out SQL, which I copied across and imported manually. I then added a reporting part to the API, so the helper scripts can just use cURL or similar to make a REST request to add data, and the process is now reasonably automated.
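For a helper written in Python rather than shelling out to cURL, the report could be built with the standard library along these lines. Everything specific here is an assumption for the sake of the example: the URL, the JSON field names and the `X-Api-Key` header are placeholders, not the actual API.

```python
import json
import urllib.request

# Hypothetical endpoint; substitute whatever the reporting API really exposes.
API_URL = "https://blocklist.example/api/report"


def build_report_request(ip, score, note, api_key, url=API_URL):
    """Build (but do not send) a POST request carrying one report as JSON."""
    payload = json.dumps({"ip": ip, "score": score, "note": note}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Api-Key": api_key,  # assumed auth scheme, purely illustrative
        },
    )


def send_report(request, timeout=10):
    """Send a prepared request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Splitting the build step from the send step means a log-parsing script can be dry-run against a log file without touching the network.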
I'm specifically looking for a few things, like SSH logins that are blatantly belligerent (so as to rule out as many false positives as I can). I've also written a script to check for attempts to access commonly vulnerable WordPress scripts, which I'll only run on hosts that have never had WordPress installed (for the best results, ideally on domains that are very well indexed and linked to).
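The WordPress check is the easier of the two to illustrate. Here's a rough sketch that counts probes per client address in a web server access log, assuming the Common/Combined Log Format; the path list and the `find_wordpress_probes` name are just examples, not what my script actually uses.

```python
import re
from collections import Counter

# Paths often requested by WordPress scanners; illustrative, not exhaustive.
PROBE_PATHS = ("/wp-login.php", "/xmlrpc.php", "/wp-admin/")

# Common/Combined Log Format: client IP first, request line in double quotes.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>[A-Z]+) (?P<path>\S+)'
)


def find_wordpress_probes(lines):
    """Return a Counter mapping client IP to number of WordPress probe requests."""
    hits = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        path = match.group("path").split("?", 1)[0]  # drop any query string
        if any(probe in path for probe in PROBE_PATHS):
            hits[match.group("ip")] += 1
    return hits


if __name__ == "__main__":
    sample = [
        '203.0.113.5 - - [10/Oct/2020:13:55:36 +0000] "GET /wp-login.php HTTP/1.1" 404 153',
        '198.51.100.7 - - [10/Oct/2020:13:56:01 +0000] "GET /index.html HTTP/1.1" 200 512',
    ]
    for ip, count in find_wordpress_probes(sample).items():
        print(ip, count)
```

On a host that has never run WordPress, any hit on these paths is unsolicited by definition, which is why such hosts give the cleanest signal.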
Plus, I'll use the form to manually log entries for any hacked hosts I come across.
Will it be useful? I'm not sure, but it's been an interesting project all weekend.