Server status via SMS
Stormy weather makes me worried about my electronic equipment, especially servers. A summer storm can take out power, corrupt an HDD in the process, or in the worst case, fry absolutely everything with a strategically placed lightning strike.
But even on a quiet winter day, a hard disk can quietly fill with logs, one byte at a time, finally causing a bunch of services to die (I’m looking at you, MySQL).
The point is, anything can happen, and you have to check often to prevent unplanned downtime, which can drag on for a long time simply because no one checked [1]. And checking gets more tedious the more stuff you have. So this post describes a quick-and-dirty method, tailored to my needs, on the cheap.
A quick disclaimer: I do know about Pingdom and similar services, as well as Nagios. However, I wanted a simple lightweight script, which runs on my home server behind a firewall, and queries my public servers over HTTP. Then it sends me a text, once daily.
This is how it works:
- my home server acts as an agent and tries to contact all the public servers
- each server performs a quick and dirty on-demand self-test, responding with a message that reports the available disk space, MySQL status, etc.
- a Python script on the home server aggregates all the responses into a concise form, suitable for texting
- and sends it, once a day.
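The agent side of these steps can be sketched roughly as below. This is a minimal illustration, not the actual script: the names `fetch_status` and `build_summary`, the 10-second timeout, the 32-character cap and the `DOWN` marker are all assumptions made for the example.

```python
import urllib.request


def fetch_status(url, timeout=10):
    """Return the self-test body of one server, or "DOWN" if it is unreachable.

    The timeout, the 32-character cap and the "DOWN" marker are assumptions.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace").strip()
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return "DOWN"
    # Cap the length so an unexpected error page cannot flood the text message.
    return body[:32]


def build_summary(servers, fetch=fetch_status):
    """Join "name:status" pairs for every (name, url) pair into one line."""
    return " ".join(f"{name}:{fetch(url)}" for name, url in servers)
```

Passing `fetch` as a parameter makes it possible to try the aggregation with canned responses before pointing it at real servers. Running it once a day is left to whatever scheduler the home server already has.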
To test the public servers, I created a simple PHP script which performs some basic tasks a healthy server should be able to do, and returns a response. This is version 1:
A<?php
/* A stands for Apache. If only "A" gets served, the web server is alive, but PHP doesn't work. */

/* print P (short for PHP OK), followed by GBs of free disk space */
echo "P[" . floor(disk_free_space("/") / 1024 / 1024 / 1024) . "G]";

/* try connecting to MySQL */
mysql_connect("server", "user", "pass") or die("db_down");
mysql_select_db("mysql") or die("db_err");

/* Print a short string ("My") through the database to prove it works. */
$res = mysql_query("select 'My' as my from user");
$row = mysql_fetch_assoc($res);
echo $row["my"];

/* if everything works ok, this should return something like
   AP[10G]My
   which simply means apache works ok, php interpreter is fine, there's 10G of free disk and mysql is fine. */
?>
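Since the response is so terse, it may help to see how the agent could decode it. The sketch below assumes the exact `AP[<n>G]My` layout produced by the script above; the function name `decode_status` and the dictionary keys are made up for illustration.

```python
import re

# "A" alone means only the web server answered; "P[<n>G]" is added by PHP,
# and the trailing "My" only appears when the MySQL query succeeded.
STATUS_RE = re.compile(r"^A(?:P\[(\d+)G\](My)?)?")


def decode_status(body):
    """Translate a self-test response into per-component health flags."""
    match = STATUS_RE.match(body)
    if match is None:
        return {"apache": False, "php": False, "free_gb": None, "mysql": False}
    free_gb, db_marker = match.group(1), match.group(2)
    return {
        "apache": True,
        "php": free_gb is not None,
        "free_gb": int(free_gb) if free_gb is not None else None,
        "mysql": db_marker == "My",
    }
```

A response such as `AP[10G]db_down` comes out with `php` set and `mysql` cleared, and a raw `A<?php ...` (PHP source served as plain text) registers as Apache only.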
I wanted to get this summary delivered as an SMS notification. Twilio looked promising, but in the end wouldn’t take my credit card, so I used this awesome hack to send a free text in the form of a Google Calendar reminder.
I extended the provided Python script to also get the free disk space of my home server and wrap everything up as a single SMS. The result looks something like this:
Reminder: srv1_ok! c:1018M d:709054M srv2:AP[10G]My srv3:AP[2G]My
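For illustration, a message in this shape could be assembled along the following lines. This is a sketch under assumptions: `free_mb` and `format_message` are hypothetical names, the drive labels are passed in by hand, and the leading "Reminder:" is assumed to be added by the calendar service rather than by the script.

```python
import shutil


def free_mb(path):
    """Free space on the filesystem containing `path`, in whole megabytes."""
    return shutil.disk_usage(path).free // (1024 * 1024)


def format_message(local_name, local_free_mb, remote_statuses):
    """Build one compact line: local marker, local disks, then remote servers.

    local_free_mb maps a short label (e.g. "c") to free megabytes;
    remote_statuses maps a server name to its self-test response.
    """
    parts = [f"{local_name}_ok!"]
    parts += [f"{label}:{mb}M" for label, mb in local_free_mb.items()]
    parts += [f"{name}:{status}" for name, status in remote_statuses.items()]
    return " ".join(parts)
```

On the home server the disk figures would come from something like `{"c": free_mb("C:\\"), "d": free_mb("D:\\")}`, adjusted to whatever drives or mount points actually exist there.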
This was at first meant to be just a form of “infrastructure unit test”, returning true if everything checks out OK, and false otherwise. But since there was room for additional info, I added more to fill the message.
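The original true/false idea is still easy to layer on top of the longer message. The sketch below assumes a server counts as healthy only when it returns the full `AP[<n>G]My` string with at least `min_free_gb` gigabytes free; the threshold and the function names are assumptions, not something the script described here is known to do.

```python
import re

# A fully healthy response: Apache, PHP with a free-space figure, MySQL marker.
HEALTHY = re.compile(r"AP\[(\d+)G\]My")


def failing_servers(statuses, min_free_gb=1):
    """Return the names of servers whose response is incomplete or low on disk."""
    failed = []
    for name, body in statuses.items():
        match = HEALTHY.fullmatch(body)
        if match is None or int(match.group(1)) < min_free_gb:
            failed.append(name)
    return failed


def all_ok(statuses, min_free_gb=1):
    """The "infrastructure unit test": True only if every server checks out."""
    return not failing_servers(statuses, min_free_gb)
```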
I’ve been somewhat reluctant to set up notifications in the past, because that basically means more spam. But ever since I set one up at work (via e-mail), it’s been quite satisfying. An e-mail with the subject “Everything ok!” makes my day, and that’s what I get most of the time anyway.
[1] It happened to me just recently.