Archive for July 10th, 2010

Duncan’s growin’ up!

Saturday, July 10th, 2010

Our kid is growing up so fast. It seems like only yesterday we were passing out cigars and celebrating in shock the new life we’d brought into the world, and now he’s discovered kicking legs, that one hand can grab the other, and he’s almost figured out how to roll over.

Oh yeah, he’s discovered some pretty crazy looks and the art of giggling as well:

Duncan 2.3 Months from fwaggle on Vimeo.

The nightmare’s almost over…

Saturday, July 10th, 2010

It’s been a really rough weekend (well, starting Thursday) for MumbleDog and Sabrienix. We’ve been toiling away at fixing all the bugs in our Murmur services, and there’s finally a light at the end of the tunnel.

The problems started about a week or so ago, when a nasty UTF-8 exploit was released that let people crash the Murmur process. Because we use real virtual hosting instead of that godawful TCAdmin script that was floating around, the unfortunate side effect is that if one Murmur blew up, every Murmur in the same location would go down with it. Needless to say, we hop on security issues ASAP to make sure that doesn’t happen.

We got that fix all built out, tested it, and then placed the updated binary in place of the old one so if someone did blow up our Murmurs, they’d restart impervious to the attack. We figured that’d be better than just restarting at an arbitrary time and pissing someone off (because there’s never a good time to restart a bunch of voice servers), and it’d give us time to look at the second exploit.
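For the curious, here is a rough sketch of the "drop the new binary in place" idea. It is not our actual deploy script; the function name and paths are made up for illustration, and it assumes a POSIX-style filesystem where a rename within one directory is atomic. A process that is already running keeps using the old file, and the next restart picks up the new one.

```python
import os
import shutil
import tempfile


def swap_binary(new_path, live_path):
    """Replace live_path with a copy of new_path in one atomic rename.

    Hypothetical helper: stage the copy next to the live file so the
    final rename stays on the same filesystem.
    """
    live_dir = os.path.dirname(os.path.abspath(live_path))
    fd, tmp_path = tempfile.mkstemp(dir=live_dir)
    os.close(fd)
    try:
        # copy2 copies the contents plus permission bits (e.g. executable).
        shutil.copy2(new_path, tmp_path)
        # os.replace overwrites the destination; anything that opens
        # live_path sees either the old file or the new one, never a
        # half-written mix.
        os.replace(tmp_path, live_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Staging the copy first matters because copying straight over the live file could leave a truncated binary behind if the copy gets interrupted.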

The second one’s particularly nasty. It looks like a bug in Qt’s QSslSocket, and indeed that’s what most everyone’s billing it as; however, it also looks like the bug is either fixed or mitigated by updating OpenSSL. I’m personally not clever enough to figure out where the bug is or how it’s fixed. All I know is that OpenSSL upgrades stopped the exploit that was in the wild from working. Anyone can download that exploit, point it at any of the public servers and make them eat shit, so that’s not fun, and the upgrade was good enough for now until more information comes around.

After a lot of messing around with Qt’s weirdness with regard to SSL, we finally got it working, so we restarted the servers with the updated versions ready. The reason we restarted on purpose after this update is that the second exploit doesn’t crash the server; it sends it into an infinite loop, so it just sits there. Our monitoring systems would go crazy, but they wouldn’t actually restart the process because it’d still be running. :(
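The gap here is that "is the process running?" and "is the server answering?" are different questions. Below is a minimal sketch of a check that asks the second one. It is not what our monitoring actually runs; the function names are hypothetical, and the probe payload is a placeholder, since what a real server will reply to depends on its protocol.

```python
import socket


def udp_probe(host, port, payload, timeout=2.0):
    """Return True if the server sends any UDP reply within `timeout` seconds.

    `payload` is an assumption: use whatever request the server is known
    to answer.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(payload, (host, port))
            sock.recvfrom(1024)
            return True
        except OSError:
            # Covers timeouts and "port unreachable" errors alike.
            return False


def needs_restart(process_running, responded):
    """Treat a process that is running but silent as hung."""
    return (not process_running) or (not responded)
```

With something like this, a server stuck in an infinite loop fails the probe even though its process still exists, so `needs_restart(True, False)` comes back `True`. In practice you would probably want a few failed probes in a row before pulling the trigger, so one dropped packet doesn’t bounce a healthy server.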

The downside to this was that the same upgraded OpenSSL that broke the exploit also broke public server registration. So none of our servers that are up and running right now can be listed in the public server list. After a lot of messing around and hacking, we think we’ve gotten a completely bug-free Murmur, but I’m running on very little sleep so we’re going to do a little more testing before we restart the servers again.

For what it’s worth, prior to these nasty bugs rearing their ugly heads, our servers had been up constantly for basically the entire year. We had a few spates of network issues here and there, but our datacenter staff have been taking the best vitamins and sticking to strict health regimens to make sure they’re swift to react to them.

I might wait until Monday or so, just to get past the “premium” gaming time, before we restart the servers. I mean yeah, it is summer, but those of us who work don’t particularly want to be in the middle of a raid or something and have our server reboot.