An IRC network that I try and have as little to do with administering as possible has merged with another IRC server. While we originally were happy to throw away their services database (honestly, I'd throw away services completely for our purposes, but that's not my call to make) I thought it might be an interesting challenge to write a script to merge the two databases.
Having had my fill of PHP recently, I decided to choose Python as the tool for the task. I also selected regular expressions initially, but that was pretty much a case of "when all you have is a hammer": split() would do the job admirably and likely cause far fewer issues down the track.
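To give a rough idea of why split() wins here, this is a minimal sketch comparing the two approaches. The record line is entirely made up for illustration (the real database format isn't reproduced here), and parse_line is a hypothetical helper, not a function from the actual script.

```python
import re

# A made-up, space-separated record: type, nick, timestamp, email.
# This is NOT the real database format, just an illustration.
line = "USER alice 1262304000 alice@example.org\n"

# The regular expression approach: one pattern per record type,
# which has to be kept in sync with the exact number of fields.
pattern = re.compile(r"^(\S+) (\S+) (\d+) (\S+)$")
match = pattern.match(line.rstrip("\n"))
regex_fields = list(match.groups()) if match else None


def parse_line(raw):
    """Split a record into (record_type, remaining_fields)."""
    fields = raw.rstrip("\n").split(" ")
    return fields[0], fields[1:]


record_type, fields = parse_line(line)
print(record_type, fields)
```

The split() version doesn't care how many fields a record has, so a record type I haven't seen yet still parses instead of silently failing to match.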
Because both networks were extremely small, neither had conflicting channels, and basically the only conflicting users would be those who had already re-registered, we simply decided to merge the two databases, throwing away the newer of any conflicting registrations.
I quickly threw away the regular expressions code and started again, particularly after it became rather obvious I was going to have to create objects to parse everything into memory, so that I could trivially compare things.
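The merge rule itself is simple once everything is an object. The following is a sketch of the idea rather than the code in the script: the Registration class, its field names and the merge function are all assumptions made up for this example, and lowercasing the name is only a simplification of how IRC actually compares nicks.

```python
from dataclasses import dataclass


@dataclass
class Registration:
    # Hypothetical fields; a real record carries far more than this.
    name: str
    registered: int  # Unix timestamp of the registration


def merge(ours, theirs):
    """Merge two lists of registrations, keeping the older one on conflict."""
    merged = {}
    for reg in list(ours) + list(theirs):
        # Simplification: real IRC casemapping is more than lower().
        key = reg.name.lower()
        existing = merged.get(key)
        if existing is None or reg.registered < existing.registered:
            merged[key] = reg
    return merged


ours = [Registration("alice", 100), Registration("Bob", 300)]
theirs = [Registration("Alice", 200), Registration("bob", 250), Registration("carol", 50)]
result = merge(ours, theirs)
```

Here alice keeps our registration (100 is older than 200), bob keeps theirs (250 is older than 300), and carol comes across untouched because nothing conflicts with her.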
Comparing the registrations was rather easy once they were all in objects, but there are a few ugly hacks in there because of the way the OpenSEX databases work. There are probably a few extra ugly hacks because my understanding of Python is still rather weak. And finally, I wrote a complete and utter abomination to encode the weird Base 36 user IDs (which bears a rather striking similarity to Wikipedia's example, except I have no error handling, the digits are in a different order, and we pad the value out to 9 characters with As).
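For the curious, the user ID encoding looks something like the sketch below. I'm assuming here that the digit order is A to Z followed by 0 to 9, so that A stands for zero and doubles as the padding character; the alphabet, the function names and the small sanity check are all choices made for this example and not lifted from the script or from any documentation.

```python
# Assumed digit order: letters first, then numbers, so "A" means zero.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
WIDTH = 9  # IDs are padded out to nine characters


def encode_uid(number):
    """Encode a non-negative integer as a nine-character Base 36 ID."""
    if number < 0:
        raise ValueError("user IDs cannot be negative")
    digits = []
    while number:
        number, remainder = divmod(number, len(ALPHABET))
        digits.append(ALPHABET[remainder])
    # Digits come out least significant first, so reverse them,
    # then left-pad with the zero digit ("A").
    return "".join(reversed(digits)).rjust(WIDTH, ALPHABET[0])


def decode_uid(uid):
    """Turn an ID back into an integer."""
    number = 0
    for char in uid:
        number = number * len(ALPHABET) + ALPHABET.index(char)
    return number


print(encode_uid(0))   # AAAAAAAAA
print(encode_uid(1))   # AAAAAAAAB
print(encode_uid(36))  # AAAAAAABA
```

Having a decoder alongside makes it easy to check that an ID survives a round trip, which is handy when renumbering users from one database so they don't collide with the other.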
In the end I've chucked the whole thing up on GitHub, so that I can have some other people go over it and point out any silliness that I may learn from. I've still got a fair amount of work to do before I'm ready to try it out on a database I intend to use, and that work is hampered by the fact that OpenSEX databases don't appear to be documented anywhere. :(