Roll out the server...

2004-08-26
It's been almost a week now since we rolled out some changes to the site. Now that the dust has settled a bit, I can announce what we did. Most of the changes were pretty small cosmetic-type bugs and the rest were mainly user-invisible backend changes so you'll probably never notice them unless you read about them in a blog entry.

But not this one. Right now I'm going to talk a bit about the rollout itself:

The Rollout

It actually went pretty well as far as big rollouts go. Most of the big components of the system had changed in one way or another, so there were a fair number of upgrades to coordinate. For upgrades this large, Jenn (testing/QA), Ross (databases/sysadmin), and I usually draw up a detailed, step-by-step plan in advance to minimize downtime. Which we did: the web server was down only about 2 minutes. Well, it would have been 2 minutes except for...

Runaway squids
We use the Squid Proxy to cache requests in front of our server, and it uses separate redirector processes to rewrite the incoming URLs. Unfortunately, when we took Squid down, the redirector processes didn't die. And more unfortunately, when we restarted Squid, the orphaned processes decided to start consuming system resources with reckless abandon. But once we realized what was going on, we killed the offending processes and all was right with the world. Except for...
Temporary amnesia
Somehow in all of our planning we managed to omit two of the component upgrades that were scheduled to be performed: some Debian package upgrades and some Zope debug scripts. Fortunately these were relatively minor. We remembered the Zope scripts right after we put the server back up, and ran them then. The package upgrades, which we remembered a bit later, were finished today (I think. Right, Ross?)
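
Going back to the runaway redirectors for a moment: one way to spot leftovers like that before restarting the proxy is to look for helper processes whose parent has gone away. The snippet below is only a sketch of that idea, not what was done during the rollout. It assumes a Linux-style ps that accepts "-eo pid,ppid,comm", assumes orphans get re-parented to PID 1 (which is not true on every system), and uses "redirector" as a made-up process name.

```python
import subprocess


def list_processes():
    """Return the text of `ps -eo pid,ppid,comm` (assumes a Linux-style ps)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def find_orphans(ps_output, name):
    """Return PIDs of processes called `name` whose parent is PID 1.

    Assumption: an orphaned helper has been re-parented to init (PID 1).
    """
    orphans = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) != 3:
            continue
        pid, ppid, comm = fields
        if comm.strip() == name and ppid == "1":
            orphans.append(int(pid))
    return orphans
```

Calling find_orphans(list_processes(), "redirector") after stopping the proxy would list the candidates; whether to kill them is still a judgment call for whoever is at the keyboard.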

Who serves when you're not serving?

One of the things that did work really well about the rollout was the temporary webserver we put in place to handle requests while we were doing the upgrade. Much nicer than just refusing connections or hanging. It accepts requests for any URL on the server and returns a temporary message that lets people know the site is down and will be back up shortly. And it's written in 15 lines of Python, using the SimpleHTTPServer module from the standard library, and it probably could have been even shorter. I love Python.
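
For anyone who wants to try the same trick, here is roughly what such a stand-in server could look like. This is a minimal sketch and not the script used in the rollout: it is written against Python 3's http.server module (where the old SimpleHTTPServer functionality now lives), and the 503 status code, the Retry-After header, the message text, port 8080, and the names MaintenanceHandler and make_server are all assumptions made for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumed wording; the original message text is not given in the post.
MESSAGE = (
    b"<html><body><h1>Down for maintenance</h1>"
    b"<p>The site is being upgraded and will be back shortly.</p>"
    b"</body></html>"
)


class MaintenanceHandler(BaseHTTPRequestHandler):
    """Answer every request, whatever the path, with the same notice."""

    def do_GET(self):
        # 503 tells clients and crawlers the outage is temporary (assumed choice).
        self.send_response(503)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(MESSAGE)))
        self.send_header("Retry-After", "300")
        self.end_headers()
        if self.command != "HEAD":
            self.wfile.write(MESSAGE)

    # Treat other common methods the same way.
    do_HEAD = do_POST = do_GET


def make_server(host="", port=8080):
    """Build the server; call serve_forever() on the result to run it."""
    return HTTPServer((host, port), MaintenanceHandler)
```

Running make_server().serve_forever() on the port the real site normally uses (or pointing the front-end proxy at it) would keep visitors informed until the upgraded server is ready to take over.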