Automatically Restart Login Services on a Remote Host
Hiu F. Ho
If you manage a server that is offsite, one of the worst things
that can happen is that you can no longer log onto it, and you have
to spend hours driving to the server in order to fix the problem.
A number of problems, ranging from hardware failures to software
problems, can keep you out of touch with your remote servers. For
software-related problems, however, it may be possible to let the
server automatically resuscitate your login service.
I live in Maryland, but I need to manage a FreeBSD Web server
in New Jersey. (It's a personal Web server that is co-located
at a Web-hosting company's data center.) For security reasons,
the only way I can log onto the server is to use Secure Shell (ssh).
The problem is, if sshd (Secure Shell daemon) has died for
some reason, I will either have to drive three hours to the data
center to restart sshd, or call someone at the data center
to press the reset button to reboot my FreeBSD box (and who knows
what damage that will do to the file system).
UNIX's login services are usually very stable and seldom
crash. However, I wrote a small Perl script to prepare for the worst,
and with the help of the cron utility, I now have a server that
can restart sshd if it is killed unintentionally. This works
in most cases, but it cannot help when the system itself or the
hardware has failed.
The idea is simple -- a cron job is set up to run the Perl
script every few minutes, and the script checks whether sshd
is currently running. If it's not, it starts the ssh
daemon. This procedure will not solve all login problems, but it
is the least you can do to keep the login service alive without
purchasing any new hardware and services.
Setting Up
To begin, you must log in as root. Listing 1 shows the Perl script
(chk-sshd.pl) I created to start sshd whenever necessary.
There are two things to specify in the script. First, you need to
specify the name of the login process. To find out the process name
of your login daemon, start the login service (if it's not
already started), then use ps to list all the processes that
are currently running and look for the name:
$ ps ax
PID TT STAT TIME COMMAND
...
162 ?? Is 0:00.99 moused -p /dev/psm0 -t auto
202 ?? Is 0:46.00 /usr/local/sbin/sshd
238 ?? S 0:02.46 /usr/interbase/bin/gds_lock_mgr
...
Because I'm running sshd as my only login service, I copy
the command section of the sshd line (/usr/local/sbin/sshd)
and put it into my Perl script's line #7.
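If the ps listing is long, a small filter can pull out just the command column for you. The following is a minimal shell sketch, not part of Listing 1; the function name find_daemon_command is hypothetical, and it assumes the five-column layout shown above (PID, TT, STAT, TIME, COMMAND):

```shell
# Hypothetical helper: read `ps ax`-style lines on stdin and print the
# COMMAND column (field 5 onward) of every line whose command contains
# the substring given as the first argument.
find_daemon_command() {
    PATTERN=$1 awk '
        $1 == "PID" { next }                  # skip the header line
        {
            cmd = $5
            for (i = 6; i <= NF; i++) cmd = cmd " " $i
            if (index(cmd, ENVIRON["PATTERN"]) > 0) print cmd
        }'
}
```

You would run it as "ps ax | find_daemon_command sshd". On a live system the output may also list per-session sshd processes, so pick the line that matches the daemon itself.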
Next, specify the path to the login daemon that you want the script
to start in line #13 of the script. In my case, I can use the which
command to get the full path to sshd:
$ which sshd
/usr/local/sbin/sshd
Leave the rest of the script as is (unless you know what you're
doing).
The first thing the script does is get a list of the currently
running processes (line #19). It then examines the list to see whether
sshd is currently running (lines #24 - #28). If the process
does not appear in the list, the script starts sshd
(lines #34 - #36).
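For readers who prefer to see the three steps side by side, here is a rough shell equivalent. It is only a sketch of the same idea and does not reproduce Listing 1; the function names are hypothetical, and it assumes that the daemon's command appears in the fifth column of ps ax output, as in the listing shown earlier:

```shell
# Values taken from the ps and which output in this article; adjust
# them for your own system.
PROCESS_NAME=/usr/local/sbin/sshd
DAEMON_PATH=/usr/local/sbin/sshd

# Hypothetical helper: succeed if any line of the process list read
# from stdin has the given name as the first word of its COMMAND column.
is_running() {
    NAME=$1 awk '
        $5 == ENVIRON["NAME"] { found = 1 }
        END { exit !found }'
}

# Get the process list, look for the daemon, and start it if missing.
check_and_restart() {
    if ps ax | is_running "$PROCESS_NAME"; then
        return 0
    fi
    "$DAEMON_PATH"
}
```

A script built this way would end with a call to check_and_restart. Keeping the test in is_running separate from the ps call makes it possible to try the matching logic on saved ps output before trusting it on a remote host.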
After you save the script, remember to chmod and chown
the script file so it can only be read, written, and run
by root.
$ chmod 700 chk-sshd.pl
$ chown root chk-sshd.pl
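It is worth confirming that the permissions took effect. The sketch below reads them back from ls; the helper names perm_string and owner_name are hypothetical, and the sketch assumes the usual ls -l layout with the mode string first and the owner in the third column:

```shell
# Hypothetical helpers that read a file's mode and owner back from ls.

# Print the ten-character mode string, e.g. -rwx------
perm_string() {
    ls -ld "$1" | cut -c1-10
}

# Print the owner name (third column of ls -l output).
owner_name() {
    ls -ld "$1" | awk '{ print $3 }'
}
```

For the script above, "perm_string chk-sshd.pl" should print -rwx------ and "owner_name chk-sshd.pl" should print root.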
After creating the script, you need to add a new cron job for root.
If root already has a crontab file, simply add the following
line to it. If not, create a file with any name you like
(e.g., mycrontab) and add the following line to it:
0,5,10,15,20,25,30,35,40,45,50,55 * * * * /root/chk-sshd.pl
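You may need to change the path to your Perl script at the end of
the above line. After adding the line to your crontab file, rerun
crontab with the file you just modified. (Note that this replaces
root's current crontab with the contents of the file, so make sure
the file contains every job you want to keep.)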
$ crontab -u root mycrontab
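Because installing a crontab file replaces whatever root had scheduled before, it can be safer to build the new file from the current entries. The following is a sketch under that assumption; add_cron_entry is a hypothetical helper that appends the job only when an identical line is not already there:

```shell
# Hypothetical helper: read an existing crontab on stdin and print it
# unchanged, followed by the new entry unless an identical line is
# already present.
add_cron_entry() {
    ENTRY=$1 awk '
        { print; if ($0 == ENVIRON["ENTRY"]) seen = 1 }
        END { if (!seen) print ENVIRON["ENTRY"] }'
}

# Possible usage as root (commented out so nothing is installed by
# merely loading this file):
#   crontab -l | add_cron_entry '0,5,10,15,20,25,30,35,40,45,50,55 * * * * /root/chk-sshd.pl' > mycrontab
#   crontab mycrontab
```

Running the helper twice leaves the file unchanged, so it can be repeated safely. Many cron implementations also accept */5 in the minute field as shorthand for the list above; check the crontab(5) manual page on your system before relying on it.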
Now you should have a diehard login service running on your server.
Testing
The simplest way to test whether it works is to kill your login
service. In my case:
$ killall -9 sshd
Then wait several minutes and check whether your login service has
been restarted. (Don't do this on a remote server unless you're
certain everything is set up correctly.) In my case, five minutes
after I killed sshd:
$ ps ax | grep sshd
54441 ?? Ss 0:00.70 /usr/local/sbin/sshd
It's alive again!
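Rather than rerunning ps by hand, you can let the shell poll for you from the console. This is a minimal sketch; wait_until and sshd_alive are hypothetical names, and the retry count and delay in the usage comment are only examples:

```shell
# Hypothetical helper: run a check command repeatedly until it succeeds
# or the attempts run out.
#   wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        attempts=$((attempts - 1))
        # Sleep only if another attempt is still to come.
        if [ "$attempts" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Possible usage: check every 30 seconds for up to six minutes.
#   sshd_alive() { ps ax | grep '[s]shd' > /dev/null; }
#   wait_until 12 30 sshd_alive && echo "sshd is back"
```

The [s]shd pattern keeps grep from matching its own entry in the process list.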
Summary
After completing the above steps, you should have a server that
is able to resuscitate its login service after the login daemon
dies. Keep in mind that this won't solve all the login problems
you might encounter, but it's the least you can do to recover
the login service without any additional hardware or human support.
Hiu Ho is a Senior Software Engineer at Netword, and the creator
of the Netword Agent for Linux. Hiu Ho can be contacted at: hiu@netword.com.