I shifted a few things around on my site lately, so I decided to check whether I had any 404s on the local copy. I have a few 301s on the S3 copy, but it's probably better if I'm not linking to something that eventually goes away, and I'd rather not pay for thousands of redirect requests either. So a few days ago I cloned my trusty webspyder project.
It turned out my spider was broken in a couple of hilarious ways, so I spent most of the day fixing bugs, then this afternoon committing the changes on the off chance someone else finds it useful. Who knew that writing a spider was actually not such a trivial task?
It now mostly works, as far as I can tell, though there are still a few issues I'd like to fix.
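For anyone curious what a spider like this boils down to, here's a minimal sketch of the general technique using only the Python standard library: fetch a page, pull out its links, queue the ones on the same host, and record anything that comes back 404. This is an illustration of the approach, not the actual webspyder code; the function and class names are my own.

```python
# Minimal same-host link checker: breadth-first crawl from a root URL,
# collecting any link that returns a 404. Illustrative sketch only.
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def check_links(root):
    """Crawl pages under root's host and return a list of 404ing URLs."""
    host = urlparse(root).netloc
    seen, broken = set(), []
    queue = [root]
    while queue:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            with urlopen(url) as resp:
                body = resp.read().decode("utf-8", errors="replace")
        except HTTPError as e:
            if e.code == 404:
                broken.append(url)
            continue
        # Only parse and follow links on the same host; external
        # links could be HEAD-checked here instead if desired.
        parser = LinkParser()
        parser.feed(body)
        for link in parser.links:
            # Resolve relative links and drop #fragments before queueing.
            absolute, _ = urldefrag(urljoin(url, link))
            if urlparse(absolute).netloc == host and absolute not in seen:
                queue.append(absolute)
    return broken
```

Even this toy version hints at where the real bugs hide: relative URL resolution, fragment handling, and not re-fetching pages you've already seen.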