Enterprise
Database Backup Check
Ed Schaefer
If you administer a phalanx of enterprise, worldwide, distributed
databases, how can you verify remote server sites are performing
scheduled database backups or archives? How about executing a shell
script on a Unix client that accesses each remote database server,
checks how long it's been since the last backup completed,
and emails a warning to the proper administrators? This is child's
play if the backup start and stop times are stored in the database.
Also, it's simple enough if the backup statistics are generated
by a Unix executable -- just execute a remote shell, remsh,
and return the necessary data.
Unfortunately, for security reasons, many network firewalls no
longer allow executing the "r" commands -- remsh
(remote shell) or rcp (remote file copy) -- severely
limiting remote access. This article presents one method for avoiding
the "r" problem; however, the database vendor's access
utility must have remote server access, and the database stored
procedure language, SPL, must allow executing a subprogram.
My database is Informix, but to enhance portability, I hide database-specific
implementation details (such as the access utility) in shell variables.
(See my article "db_update: Controlling UNIX Enterprise Database
Releases" in Sys Admin, December 2002, http://samag.com/documents/s=7732/sam0212c/.)
General Design
On a single client server, I periodically execute the shell script,
buc_client.ss (Listing 1). For each remote server defined in a configuration
file, the client executes stored procedure, SP, sp_spawn_rem on
the remote database. The SP executes shell script, buc_server.ss
(Listing 2), on the remote server (see Figure 1).
The buc_server.ss script executes the subprogram oncheck, which
determines the last backup date and time. The buc_client.ss script
stores the last backup date-time and the current date-time in the
remote database, and returns control to the client that completes
the report processing. If the time since the last backup exceeds
the allowable, it sends a report to the administrators stored in
another configuration file.
Next, I will discuss selected issues of buc_client.ss and buc_client.ss:
Buc_client.ss Design Logic
For each server:
- Determine the time to wait before reporting a backup required
error.
- Execute stored procedure sp_spawn_rem, which executes the buc_server.ss
script on the remote server. Pass a PRIMARY KEY to the remote
server to store the backup data in the database.
- After the remote server completes execution, using the PRIMARY
KEY, retrieve the data from the remote server via the database's
remote_values table containing the current and backup date-times.
- If a network or other connection error occurs, email the DBA
list, and continue to the next server.
- Using the PRIMARY KEY, delete the entry in the remote remote_values
table.
- Determine the elapsed time between the backup time and the
current system time.
- If the elapsed time is greater than the time to wait, report
to the administrator's mail list.
Buc_server.ss Design Logic
On the remote server:
- Execute the Informix oncheck utility, which determines the
backup date and time.
- Determine the current date and time.
- Save, by PRIMARY key, the backup date and time and the current
date and time in the remote_values table.
- Terminate the server script returning to the client.
Configuration Objects
config_data.file
The config_data.file, located in the Informix user home directory,
contains the number of hours to wait before reporting that a backup
is required. The colon-delimited file structure is the variable
name, the number of hours to wait, and a comment, respectively:
hours_from_archive:18:number of hours to wait before reporting backup required
For administrator convenience, the wait parameter is in hours. The
script converts the hours to seconds since date arithmetic in the
shell is most easily performed from the Unix Epoch.
buc_server_list.txt file
The names of the servers to be accessed reside, one per line,
in the buc_server_list.txt file:
betty
doomsday
#babylon
Each database and server are accessed as such:
dbaccess testdb@betty
Any server commented out is ignored.
buc_mail_list.txt file
If a report is generated, it is mailx'ed to the administrators
listed, one per line, in the buc_mail_list.txt:
ed@schaefer.com
jwayne
#rrogers
Any administrator commented out is ignored.
dba_mail_list.txt file
Any network or database connection problems are mailx'ed
only to the administrators listed, one per line, in the dba_mail_list.txt:
ed@schaefer.com
#rmitchum
Again, any administrator commented out is ignored.
Database Objects
Executing the stored procedure sp_spawn_rem (Listing 3) allows
execution of the buc_server.ss script on the remote server. For
security purposes, I use a full path to the script to be executed:
EXECUTE PROCEDURE sp_spawn_rem("/home/eds/buc/buc_server.ss", "$rem_key");
remote_values table
Each remote server's database must contain the following table:
create table remote_values
(
remote_key char(20) not null,
remote_values varchar(64),
primary key(remote_key)
);
This table is the coupling between the client and the server. After
the client initiates a start message, the server gathers the backup
statistics and stores it in the remote_values column.
The primary key maintains the coupling between the primary and
server process and is removed when the server process terminates.
The primary key not only is good database design, but also allows
other client/server scripts performing other functions to use this
table.
Report Output
Example data example from remote_values looks like:
2003-04-06 06:30:01 2003-04-06 16:57:21
If the time to wait is exceeded, this is the generated report structure:
==== Backup Report for betty ==== Sun Apr 6 16:57:21 PDT 2003
Backup Date: 2003-04-06 06:30:01 Current Date: 2003-04-06 16:57:21
number of hours since archive: 10.4 Sun Apr 6 16:57:21 PDT 2003
Setting the Time Zone
In both the client and server programs, I set the time zone, TZ:
export TZ=$(grep TZ= /etc/TIMEZONE|cut -d= -f2)
This may be Solaris 7 or database specific, but if the TZ isn't
set within the buc_server.ss script, the oncheck utility uses the
client's TZ -- an obvious error. An improperly set TZ in
a worldwide network configuration may be catastrophic.
Cron Execution
By design, the client script executes from cron. The following
example runs buc_client.ss hourly starting at midnight:
0 * * * * cd /home/eds/buc; [ -x ./buc_client.ss ] && \
./buc_client.ss > /dev/null 2>&1
Script Security
A network design such as this is always a security risk. To minimize
problems such as a Trojan horse, I implement the following:
- Only effective user Informix may execute either the client
or server script.
- The sp_spawn_rem stored procedure may execute only the buc_client.ss.
- External programs buc_client.ss and oncheck execute with a
full path.
Conclusion
Originally, this client/server design was entirely for my benefit;
I am able to identify which sites aren't doing timely backups.
However, since the buc_client.ss script is distributed to all sites,
it can run on the same database server as the buc_server script.ss
script. The site DBA has only to set up his or her own local configuration
files and cron.
Finally, is this client/server design the only solution for this
backup problem? Certainly not. This is especially true if the server
process returns large amounts of data. The Expect environment might
be a better option, but requires climbing the Expect learning curve
and querying the keepers and denizens of the network.
Ed Schaefer is a frequent contributor to Sys Admin.
He is a software developer and DBA for Intel's Factory Integrated
Information Systems, FIIS, in Aloha, Oregon. Ed also hosts the UnixReview.com
monthly Shell Corner column. He can be reached at: olded@ix.netcom.com.
|