Uptime vs. Kernel updates (was Re: [TriLUG] Prosperous New Year)

Rick DeNatale rick.denatale at gmail.com
Wed Jan 4 13:17:45 EST 2006


On 1/4/06, Mike Johnson <mike at enoch.org> wrote:

> As an aside, if you don't reboot your system until you absolutely have
> to, how do you know that when you -do- reboot it, there will be no
> problems?  Put another way, rebooting during a maintenance window
> ensures that when you have an emergency reboot, the system will return
> to a running state.

Good to "hear" your voice on the list, Mike!

And this last is a good point. Limoncelli and Hogan talk about the
need for reboot testing in "The Practice of System and Network
Administration."  It's far better to solve a boot problem right after
you've made the config change that caused it than some time later,
after several other changes and a long power outage that exhausts the
UPS.

Limoncelli's anecdote about this goes back to the days when many
places had "one big computer."  They used to reboot their VAX/VMS
machine at least three times during their monthly "standalone
backup": first to demonstrate that the machine could still be
rebooted, second after making any changes to the startup configuration
which had been saved for the occasion, then however many times were
needed to debug the startup changes, and finally in order to
accomplish the backup.

Nowadays, when systems are distributed, rebooting one server has much
less impact overall than it used to, and machines typically reboot
much faster than the old mainframes/midi-computers did.

Testing reboot regularly during maintenance helps keep your beeper
from going off in the middle of the night because a machine fails to
automatically reboot properly.
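
As a rough illustration of how one might notice machines that are
overdue for such a test, here is a minimal sketch in Python. It assumes
a Linux host where /proc/uptime is readable; the 90-day threshold and
the function names are my own inventions for the example, not anything
from the book or this thread.

```python
"""Flag a host whose uptime suggests its boot path is untested."""

# Hypothetical policy: test a reboot at least this often.
MAX_DAYS_WITHOUT_REBOOT = 90
SECONDS_PER_DAY = 86400


def parse_uptime_seconds(proc_uptime_text):
    """Return seconds since boot from the contents of /proc/uptime."""
    # The file holds two floats: uptime, then aggregate idle time.
    return float(proc_uptime_text.split()[0])


def reboot_test_overdue(uptime_seconds, max_days=MAX_DAYS_WITHOUT_REBOOT):
    """True when the machine has run longer than max_days without a reboot."""
    return uptime_seconds > max_days * SECONDS_PER_DAY


def main():
    # Read the live value; this only works where /proc/uptime exists.
    with open("/proc/uptime") as f:
        seconds = parse_uptime_seconds(f.read())
    days = seconds / SECONDS_PER_DAY
    if reboot_test_overdue(seconds):
        print(f"up {days:.0f} days: schedule a test reboot")
    else:
        print(f"up {days:.0f} days: reboot tested recently enough")
```

Calling main() from a daily cron job would be one way to use it. It
only tells you a reboot test is due; the reboot itself still belongs in
the maintenance window.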

So maybe I won't fall into the camp of prizing uptime so much.

--
Rick DeNatale

Visit the Project Mercury Wiki Site
http://www.mercuryspacecraft.com/


