Nagios monitoring Mumble servers

We’ve been using Pingdom for about a year and a half now, but with a baby on the way and the economy falling to pieces around us… I can’t really justify $10 a month to monitor 5 services, when I could be checking all of the services on our four servers for less than that. It’s time to downsize a little and be a bit smarter with our money, putting it into things that benefit the customers instead of websites with nice interfaces that make me feel all warm and fuzzy. :(

So we set up Nagios on a small VPS, and it now monitors all our servers, including the public instance on each of our Murmurs. We were monitoring Murmur using check_tcp, which is basically the same check Pingdom uses… unfortunately it’s really bloody noisy in the logs!

So I went on IRC and bugged pcgod for his Python Mumble-Pinger script, which implements the UDP ping-sweep used by the Mumble client’s connect dialog, and returns your ping to the server, how many users are on it, etc.

It was a hop, skip and a jump to modify it to output something useful to Nagios: I removed the timestamp and added “OK ” in front of the output. I believe the prefix is optional, because Nagios takes the service state from the script’s exit code (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN) and only displays the text. Speaking of which, I modified the exception handler for the socket timeout (which indicates the server’s down) to print something like “CRITICAL – UDP Socket Timeout” and to exit with return code 2.
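For anyone wanting to try the same thing, here is a minimal sketch of what such a check could look like. It is not pcgod’s script or my modified copy. It assumes the Mumble UDP ping layout as I understand it (a 12-byte request carrying a zero type field and a 64-bit identifier, answered by a 24-byte reply with version, identifier, user count, max users and bandwidth), and the function names, the 5 second timeout and the exact wording of the OK line are all made up for illustration.

```python
#!/usr/bin/env python3
"""Sketch of a Nagios check built on Mumble's UDP ping (names are illustrative)."""
import random
import socket
import struct
import sys
import time

OK, CRITICAL, UNKNOWN = 0, 2, 3   # standard Nagios plugin exit codes

PING_FORMAT = ">iQ"       # assumed request: type 0, then a 64-bit identifier
PONG_FORMAT = ">4BQiii"   # assumed reply: version, identifier, users, max, bandwidth


def build_ping(ident):
    """Pack the 12-byte ping request."""
    return struct.pack(PING_FORMAT, 0, ident)


def parse_pong(data):
    """Unpack a reply into a dict; raise ValueError if it is too short."""
    size = struct.calcsize(PONG_FORMAT)
    if len(data) < size:
        raise ValueError("reply too short: %d bytes" % len(data))
    _, major, minor, patch, ident, users, max_users, bandwidth = struct.unpack(
        PONG_FORMAT, data[:size])
    return {"version": (major, minor, patch), "ident": ident,
            "users": users, "max_users": max_users, "bandwidth": bandwidth}


def format_ok(info, ping_ms):
    """Build the one-line status text Nagios will display."""
    return "OK Version %d.%d.%d, %d/%d Users, %.1fms" % (
        info["version"] + (info["users"], info["max_users"], ping_ms))


def check_mumble(host, port=64738, timeout=5.0):
    """Ping the server once and return (exit_code, message)."""
    ident = random.getrandbits(64)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        started = time.monotonic()
        try:
            sock.sendto(build_ping(ident), (host, port))
            data, _ = sock.recvfrom(1024)
        except socket.timeout:
            return CRITICAL, "CRITICAL - UDP Socket Timeout"
        except OSError as exc:
            return CRITICAL, "CRITICAL - %s" % exc
        ping_ms = (time.monotonic() - started) * 1000.0
    try:
        info = parse_pong(data)
    except ValueError as exc:
        return UNKNOWN, "UNKNOWN - %s" % exc
    if info["ident"] != ident:
        return UNKNOWN, "UNKNOWN - reply identifier does not match request"
    return OK, format_ok(info, ping_ms)


# Only runs when a host argument is given, e.g. "check_mumble.py HOST [PORT]".
if __name__ == "__main__" and len(sys.argv) > 1:
    target_port = int(sys.argv[2]) if len(sys.argv) > 2 else 64738
    code, message = check_mumble(sys.argv[1], target_port)
    print(message)
    sys.exit(code)
```

Saved under a hypothetical name such as check_mumble.py, it would be run with a host and an optional port. The exit code carries the state, and the printed line is what shows up in the Nagios status column.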

A quick command definition in Nagios, and it’s working. It’s not great – there’s no support for warnings for elevated pings or anything like that… but it’s working. I’ll probably go through and write a better one and post it eventually, but right now I’m busy going through moderating all the junk from my comments… Viagra? Slimquick review? GTFO. :(
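The missing warning support mostly comes down to mapping the measured ping onto Nagios exit codes. A rough sketch of that mapping, assuming the ping and user counts have already been measured elsewhere; the function names and the 100 ms and 250 ms defaults are placeholders I picked for illustration, not values from any existing script:

```python
OK, WARNING, CRITICAL = 0, 1, 2   # standard Nagios plugin exit codes


def classify_ping(ping_ms, warn_ms=100.0, crit_ms=250.0):
    """Map a round-trip time in milliseconds to (exit_code, label)."""
    if ping_ms >= crit_ms:
        return CRITICAL, "CRITICAL"
    if ping_ms >= warn_ms:
        return WARNING, "WARNING"
    return OK, "OK"


def status_line(ping_ms, users, max_users, warn_ms=100.0, crit_ms=250.0):
    """Return (exit_code, text) ready to print and pass to sys.exit()."""
    code, label = classify_ping(ping_ms, warn_ms, crit_ms)
    text = "%s - %d/%d Users, %.1fms" % (label, users, max_users, ping_ms)
    return code, text
```

The thresholds would normally arrive as command-line arguments, so each service definition can choose its own limits.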

2 Responses to “Nagios monitoring Mumble servers”

  1. Gunni says:

    http://pastebin.com/qkfLLNmM here, fixed it for ya.

  2. Gunni says:

    updated with a change’able max ping: http://pastebin.com/rDW5WN4d
