Monitoring Sun Volume Manager
Andrew Kyle
Over the years, I've seen a lot of servers that have been
built and then eventually neglected. I often find that the servers
have been mirrored using "Sun Volume Manager" (or, as it
was previously known, "Solstice DiskSuite") and their
configuration has been set and then forgotten. These servers may
have been configured for redundancy at one stage, but over time,
disks die, configurations change, and people make mistakes. For
these reasons, it is imperative for metadevices to be actively monitored
to make sure they are doing what they are intended to do.
Here I provide a script for monitoring a volume manager setup
and briefly describe how it works. Metadevice monitoring should
go hand in hand with general monitoring of the server, which includes
monitoring logs, filesystems, and performance.
The script provided in this article (Listing 1) is a simple but
effective way to monitor the volume manager setup. It looks for
metadevices that are not in the "Okay" state and hot spares that
are not in the "Available" state, as reported by metastat. It
also performs one other handy check: it looks for devices that have
been configured as mirrors but have only one submirror. This
is an excellent way to tell whether you have forgotten to attach
the second submirror.
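The three checks described above can be sketched in a few lines. The following is not the author's Listing 1; it is a minimal illustration in Python, and it assumes metastat output of the general shape shown in SAMPLE (unindented "name: description" headers, indented "State:" and "Submirror N:" lines, and hot spare rows that contain the word "Available" when healthy). The function name find_metadevice_problems and the sample text are hypothetical, and real output may differ between releases.

```python
import re

# Illustrative text only; assumed to resemble metastat output.
SAMPLE = """\
d0: Mirror
    Submirror 0: d10
      State: Okay
    Pass: 1

d10: Submirror of d0
    State: Needs maintenance

hsp001: 1 hot spare
        c1t2d0s2        Available    16800 blocks
"""

def find_metadevice_problems(metastat_text):
    """Return a list of problem descriptions found in metastat-style text."""
    problems = []
    submirrors = {}            # mirror name -> number of submirrors seen
    current = None             # name of the device block being read
    in_hot_spare_pool = False

    for line in metastat_text.splitlines():
        if not line.strip():
            continue
        # An unindented "name: description" line starts a new device block.
        header = re.match(r"^(\S+):\s+(.*)$", line)
        if header:
            current, description = header.group(1), header.group(2)
            in_hot_spare_pool = "hot spare" in description
            if description.startswith("Mirror"):
                submirrors[current] = 0
            continue
        text = line.strip()
        if in_hot_spare_pool:
            # Skip a column header row if this release prints one.
            if text.startswith("Device"):
                continue
            if "Available" not in text.split():
                problems.append(f"{current}: hot spare not available: {text}")
            continue
        if re.match(r"Submirror \d+:", text) and current in submirrors:
            submirrors[current] += 1
            continue
        state = re.match(r"State:\s+(.*)$", text)
        if state and state.group(1).strip() != "Okay":
            problems.append(f"{current}: state is {state.group(1).strip()}")

    # A mirror with fewer than two submirrors provides no redundancy.
    for mirror, count in submirrors.items():
        if count < 2:
            problems.append(f"{mirror}: mirror has only {count} submirror(s)")
    return problems

for problem in find_metadevice_problems(SAMPLE):
    print(problem)
```

In practice the text would come from running metastat rather than from a fixed string, and an empty result list would mean there is nothing to report.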
This script does not require any special privileges, as long as
it can run metastat (which has default permissions of 755). This
means the script can be run from any user's crontab for automation.
I suggest running the script as often as you feel is necessary for
effective monitoring, leaving sufficient time to react to incidents
as they happen. I run it every 10 minutes from a monitoring collection
script that filters out any messages identical to ones already
sent that day. In doing so, I avoid cluttering my mailbox
with duplicate messages.
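The once-per-day duplicate filtering mentioned above could look something like the sketch below. The author's collection script is not shown, so this is only one possible approach: it assumes a small JSON state file (a hypothetical name and format) that records the current date and a hash of every message already sent that day. The function name filter_unsent is likewise an illustration.

```python
import datetime
import hashlib
import json
import os
import tempfile

def filter_unsent(messages, state_path, today=None):
    """Return only the messages not already recorded as sent today."""
    today = today or datetime.date.today().isoformat()
    state = {"date": today, "seen": []}
    if os.path.exists(state_path):
        with open(state_path) as fh:
            saved = json.load(fh)
        # State from an earlier day is discarded, so alerts repeat daily.
        if saved.get("date") == today:
            state = saved

    seen = set(state["seen"])
    fresh = []
    for message in messages:
        digest = hashlib.sha256(message.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            fresh.append(message)

    # Record everything seen so far so later runs today stay quiet.
    state["seen"] = sorted(seen)
    with open(state_path, "w") as fh:
        json.dump(state, fh)
    return fresh

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sent.json")
        alerts = ["d5: state is Needs maintenance"]
        print(filter_unsent(alerts, path))  # first run of the day
        print(filter_unsent(alerts, path))  # later run, same alert
```

Resetting the state when the date changes means a problem that is still present is reported again the next day rather than silenced forever.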
As a further note, you must ensure that sendmail (or an equivalent
MTA) is running on your server. Sendmail should always be running
in queue mode (-q), and running in listening mode (-bd) only
if absolutely necessary. If mail cannot be delivered immediately
because of network connection problems, server load, and so on, it
is simply queued on the server. However, there is no point in queuing
the mail if you aren't going to receive it within an appropriate time.
Running the sendmail daemon in queue mode with an interval (for
example, -q15m) causes the queue to be reprocessed at that interval,
so queued mail is sent out as soon as possible.
Conclusion
Monitoring is a major part of a systems administrator's job. It
is important that all aspects of a server that affect its
performance, reliability, and availability are monitored. This simple
script will effectively catch any problems with the server's metadevices
so prompt action can be taken to maintain redundancy and reliability.
Andrew received a Bachelor of Technology in Computer System
Engineering from Massey University, NZ in 1994. Since then, he's
done UNIX systems administration in Brisbane for Queensland Police
and CITEC. He has been concentrating on Solaris during the past
5 years, contracting in London, mainly for a securities bank, and
now working for CSC in Sydney, Australia.