Thursday, July 12, 2012

Mental shelter

When one cannot withstand the elements, one seeks shelter. This natural instinct has helped us survive ever since we were weaklings. But to arrive at an oasis, one must first grow capable of leaving the shelter and withstanding the elements. The same instinct extends to the mental side of our capability. There are small things that cater to one's mental comfort: be it a game, be it a show, be it a sport, be it a habit, be it an article, or be it an addiction. They are all small compared to the goals set by our will. However, mental weakness leaves us incapable of leaving such shelters.

Wednesday, June 6, 2012

Remote system upgrade (with grub and bmc-watchdog)

IPMI is a very powerful tool for system administrators, especially those who telecommute. Its serial over LAN (SOL) support eliminates the need to personally sit in front of a server to do any pre-network operations, including reconfiguring the BIOS settings. However, it does require either (A) an additional IP address to reach the IPMI network interface from the Internet or, when no additional IP can be allocated, (B) access to a second server on the same LAN (not necessarily with administrator privileges). When either (A) or (B) is available, you can theoretically do anything remotely, including a fresh installation of an operating system (starting, for example, with a network boot and/or a remote drive).

Unfortunately, one of my recent situations allowed neither (A) nor (B). So the first installation had to be done by on-site personnel. But once a networked system was up and running with a working grub boot manager, I could remotely install a new system on an unused partition (or a large enough swap partition) and test it out with the "boot once" support of grub. On a Debian-based system with grub-2, this involves
  • changing the value of "GRUB_DEFAULT" in /etc/default/grub to "saved",
  • running "update-grub",
  • editing /boot/grub/grub.cfg to make an entry for the new system (if it was not discovered correctly by grub-probe),
  • running "grub-reboot" for the entry, and
  • rebooting the machine.
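
The steps above can be sketched as a small shell script. This is only an illustration under stated assumptions: the entry title "New system (test)", the run helper, and the DRY_RUN switch are hypothetical names, and the sed line assumes /etc/default/grub already contains a GRUB_DEFAULT= line. By default the script only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch of the "boot once" procedure. Prints commands unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
ENTRY="${ENTRY:-New system (test)}"   # hypothetical menu entry title

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

boot_once() {
    # 1. Let grub use the saved entry as its default.
    run sed -i 's/^GRUB_DEFAULT=.*/GRUB_DEFAULT=saved/' /etc/default/grub
    # 2. Regenerate /boot/grub/grub.cfg.
    run update-grub
    # 3. (Manual) add a menu entry to grub.cfg if the new system was not detected.
    # 4. Select the new entry for the next boot only.
    run grub-reboot "$1"
    # 5. Reboot into the new system.
    run reboot
}

boot_once "$ENTRY"
```

Running it with DRY_RUN=0 as root would execute the commands for real, so it is worth reading the dry-run output first.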
However, in most cases, you are bound to make some mistakes in the new system and fail to recover network contact to the server until an on-site person can hit the reset button of the machine for you.

Luckily for me, the BMC of the IPMI on the server did have a working watchdog timer. Therefore, I could set up the timer with enough time and start it before rebooting the machine. That way, if the new system worked, I could log in to the server through the Internet and stop the timer. But if the new system got stuck, the watchdog would do a hard reset on the machine after the time ran out and return it to the original working system... no more waiting for on-site personnel. The actual command I used to set up the timer is bmc-watchdog from freeipmi:
  • bmc-watchdog -s -u 4 -p 0 -a 1 -F -P -L -S -O -i 900
One can consult the man page for the meaning of these options. In short, this puts 15 minutes (900 seconds) on the timer, with a hard reset when it expires. The timer can be checked with
  • bmc-watchdog -g
started with
  • bmc-watchdog -r
and stopped with
  • bmc-watchdog -y
(While, theoretically, one can achieve the same result with ipmitool, it did not work for me on the specific system.)
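
The watchdog workflow above can be wrapped in two small shell functions. This is a minimal sketch that reuses only the bmc-watchdog invocations shown in this post; the function names, the run helper, and the DRY_RUN switch are hypothetical, and by default nothing is executed, only printed.

```shell
#!/bin/sh
# Sketch of arming and disarming the BMC watchdog. Prints commands unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Call on the old system just before rebooting into the new one.
arm_watchdog() {
    secs="${1:-900}"   # countdown in seconds; 900 is the 15 minutes used above
    run bmc-watchdog -s -u 4 -p 0 -a 1 -F -P -L -S -O -i "$secs"
    run bmc-watchdog -g   # check the settings
    run bmc-watchdog -r   # start the countdown
}

# Call after logging in to the new system successfully.
disarm_watchdog() {
    run bmc-watchdog -y   # stop the timer
}

arm_watchdog 900
disarm_watchdog
```

If the new system never comes up, disarm_watchdog is never called and the BMC resets the machine when the countdown expires.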