Checking Your System Logs with awk
Jose Nazario
UNIX systems are especially talkative and log considerable amounts
of data. Many administrators at first find digging through all those
logs annoying, and some abandon the practice of checking logs for
that reason. However, when system problems arise, those admins are
left wondering what occurred and why. Because there is so much data
to sift through on a typical UNIX system, you need efficient methods
to make sense of it all and keep a watchful eye on your system.
My tool of choice for this task is the awk language.
Originally, I used grep in a rather unwieldy shell script,
and I didn't want to port it to Perl. I found that awk
provided a bit more flexibility than my sometimes convoluted shell
script, worked very well with the ordered structure of log files,
and had better regular expression handling than grep. I will
show several notification items that can be readily picked out,
and then put them together in an awk script that parses log files
quickly.
While most administrators are familiar with grep, fewer
have become as familiar with awk, instead favoring Perl.
awk has a number of advantages over grep, and even
a few over Perl.
- One awk script can run faster than a shell script that
calls grep multiple times. When processing large files,
this time difference can become noticeable. awk also compares
favorably with Perl for speed.
- awk can handle hex values embedded in regular ASCII
text, such as remnants of exploits against a machine. grep's
regular expressions cannot match non-printable characters. Perl
and awk are similar in this respect.
- Log files usually have regular fields, such as hostnames and
keywords. awk can match on the basis of these fields, rather
than matching anywhere in a line, which allows for finer grained
matching (see the short example after this list). The field
separator can be altered as needed.
- awk maintains internal counters, such as the
number of records processed, and lets you keep your own variables,
such as the number of hits, which is useful for logfile summaries.
While Perl can do this, grep cannot by itself.
- Convoluted shell scripts built around grep can be hard to maintain
or modify, and Perl has a steep learning curve. awk, by
contrast, can be complex without being complicated, and has a
gentler learning curve for most administrators.
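For instance, to report only messages from a given process, you can
match on the process-name field of a syslog line rather than grepping
the whole line. A minimal sketch (field positions assume the standard
timestamp-hostname-process syslog layout):
# print only sshd messages by matching field 5, the process name,
# instead of matching anywhere in the line
awk '$5 ~ /^sshd/ { print }' /var/log/messages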
These scripts were developed on a Red Hat Linux system on an Intel
processor, but notes will be made about IRIX 6.5, Solaris 7, and HP-UX 10.20
systems. I will use regular POSIX awk, so the scripts should work
on all modern UNIX systems, which usually provide either nawk
(Brian Kernighan's awk) or gawk, from the GNU project.
Also, I tend to make my logs readable by a privileged group, usually
the "wheel" group, and make sure that I am a member of
that group. This minimizes my time as the root user, and thereby the
potential damage I can do. Note, however, that only root can write to
these logs under usual circumstances.
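On a Linux system, that arrangement looks something like the following
(adjust the file names for your system):
# make the log readable by the wheel group, writable only by root
chgrp wheel /var/log/messages
chmod 640 /var/log/messages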
Where are the Log Files?
The first step in analyzing your system logs is making sure you
know where they are. The /var file system is where logging
and other runtime data usually go, but some systems use /var/log,
some use /var/adm, and still others use /var/log/syslog.
The syslog daemon reads the file /etc/syslog.conf to direct
the syslog mechanism: it uses facilities and logging levels
to determine where entries arriving on the /dev/log socket are
placed. I won't go into syslog mechanics in this article. I will
be using most of the log files specified in the syslog
configuration file.
Basic awk Programming
If you're not familiar with the awk programming model,
this will serve as a brief introduction; it isn't
a tutorial on awk programming. The Resources section provides
awk references to consult for further information.
The basic awk model is very simple: a program has a BEGIN
block, a main loop that is executed for every line, or record,
of input, and an END block. We will be constructing very basic but
quite powerful log analysis tools using awk, along with
some basic regular expression usage. With that, let's
begin designing and constructing our scripts. You may find that
you have to modify them a bit; however, they worked on the
systems I tested them on and suited my needs.
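As a quick illustration of this model (a minimal sketch, not one of
the checking scripts; the script name and log file are just examples):
#!/usr/bin/awk -f
# count the sshd entries in a log; illustrates the BEGIN block,
# the per-record main loop, and the END block
BEGIN { count = 0; print "scanning for sshd entries..." }
/sshd/ { count++ }
END { print count " sshd entries in " NR " records" }
Saved as count.awk and made executable, it would run as, for example,
count.awk /var/log/messages.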
Failed Login Attempts
Sometimes would-be attackers aren't too bright and hope to
come right in the front door by attempting a login. They may either
have stolen passwords or be attempting to brute force their way
in by guessing a common password and user name combination. These
should stand out in any syslog and be investigated if the
threat requires it. The system usually complains pretty loudly when
this occurs, so you should definitely make note of it.
If you are using the TCP Wrappers package from Wietse Venema
and you are doing service blocking, the included script will note
the blocked connections. TCP Wrappers is an especially useful tool
for blocking TCP services in the absence of a firewall. Note that all
blocked attempts on services protected by TCP Wrappers will be logged,
including ftp if it is so wrapped.
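A minimal sketch of this kind of check follows; the exact message text
varies by system, and these patterns assume Linux-style login and
tcpd messages:
#!/usr/bin/awk -f
# report failed logins and connections refused by TCP Wrappers
BEGIN { print "checking for failed logins and refused connections" }
/FAILED LOGIN/ { failed++; print $0 }
/refused connect from/ { refused++; print $0 }
END { print failed + 0 " failed logins, " refused + 0 " refused connections" }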
Bear in mind that these login attempts may be from legitimate
system users who are away and want to connect to the system or users
that simply mistyped their passwords. Knowing your users and their
patterns will help you understand the situation.
Failed su Attempts
Three of the system types used in the development of this article,
IRIX 6.5, Solaris (2.6 and 8), and HP-UX 10.20, all keep a record
of su attempts. This can be a handy single place in which
to keep the data. The file has a simple format, logging the
date and time, the success or failure of the su action (noted with
a + or -, respectively), and the intended source and target accounts.
We have a small script, sulog.check, that can quickly scan
this file and report failed su attempts.
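A minimal sketch of the idea (field positions assume the
Solaris-style sulog format described above):
#!/usr/bin/awk -f
# report failed su attempts; in the sulog format
# SU date time +|- tty source-target, field 4 marks success or failure
BEGIN { print "checking sulog for failed su attempts" }
$4 == "-" { failed++; print $0 }
END { print failed + 0 " failed su attempts" }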
Linux, it appears, has no such file, but su data is kept
in the system log files. Our syslog.check script will look
there, as well.
The Mail Logs
The regular structure of Sendmail logs can also be used to filter
information out of an otherwise healthy and normal mail server. Normally,
we don't want to be a point of abuse for spam sources, and we
want to keep our privacy and security on the server by disabling
some Sendmail commands. Furthermore, mail connectivity problems
can be easily found by using log-sifting routines like those outlined
below. These can all be checked for in the mail log examination
scripts.
Since version 8.9, the Sendmail daemon has come preconfigured
with anti-relaying features to help reduce the amount of spam sent
through mail servers running Sendmail. The default response to a
denied relaying attempt is SMTP error 550, a permanent fatal error
for that message.
Network connectivity issues can also be monitored by watching
the Sendmail logs. If a message fails delivery due to a transient
problem, say the remote SMTP server is down, Sendmail will queue
it and note that it was deferred. This can help administrators track
why mail sometimes doesn't get through, helping to diagnose
if it's a problem on the other end.
Three Sendmail commands that were once innocuous but have now
become popular with attackers are the DEBUG, VRFY, and
EXPN commands (or debug, verify, and expansion).
VRFY and EXPN allow the verification of a destination mail address,
or its expansion if it is an alias. However, due to privacy concerns,
these are usually turned off in Sendmail configurations. Debug
mode was popular against older Sendmail versions, in which it dropped
several checks and could be used for remote administrative control of
a system. Null connections (ones that have not completely
negotiated a connection with the mail server) are also worth noting
because they're indicative of a problem with one of the two
ends of a mail connection, or of a system scan that includes the
SMTP service.
The attached script, sendmail.check, looks for all of these
things in your Sendmail logs. It makes a note of the problem, prints
out the log entry, and tallies a statistic; the summary can
also be monitored for trends. The script is also fast: tested
on a Pentium II/266 machine against 104,000 lines of input,
it completed in less than two seconds.
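The full script is in the listings; a minimal sketch of the kinds of
patterns involved might look like the following (the exact message
text varies between Sendmail versions, so treat these as starting
points):
#!/usr/bin/awk -f
# flag suspicious or problematic Sendmail log entries
BEGIN { print "checking Sendmail logs" }
/[Rr]elaying denied/ { relay++; print $0 }   # blocked relay attempts
/stat=[Dd]eferred/ { defer++; print $0 }     # transient delivery failures
/VRFY|EXPN|DEBUG/ { probe++; print $0 }      # address probes, debug attempts
/[Nn]ull connection/ { null++; print $0 }    # incomplete SMTP sessions
END {
    print relay + 0 " denied relays, " defer + 0 " deferrals"
    print probe + 0 " probes, " null + 0 " null connections"
}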
Non-Local syslog Entries
The syslog facility has the capacity to listen on a network
port (514/UDP) and receive entries from other hosts. This can be
helpful if you have many machines to manage and want to centralize
logging for evaluation. However, you may receive syslog entries
from hosts that you do not want entries from. Some devices, usually
network devices capable of using the syslog facility over
the network, come preconfigured to send syslog messages to
the broadcast address, 255.255.255.255. Depending on your logging
file system capacity and how often you rotate logs, this could lead
to a flooded file system.
One such system that, by default, receives syslog entries
from other hosts is IRIX 6.5. This can lead to misleading log files
or a denial of service by filling the logging disk device. Detecting
these entries is usually pretty easy. The format of syslog entries
(simply a timestamp, the source hostname, the facility, and the
message) simplifies finding entries that do not belong to you. A
simple awk script will find and report these spurious entries:
#!/usr/bin/awk -f
# $4 is the source hostname in a standard syslog line; replace
# $HOSTNAME with this machine's short hostname before running
BEGIN { print "Checking for non-local syslog entries" }
{ if ($4 != "$HOSTNAME") print $0 }
Be sure to replace the macro $HOSTNAME with your system's
short hostname (e.g., yoda), but keep the double quotes because they're
vital to the script. If you do detect any entries that should not
appear, investigate the source or set up the syslog daemon
to not listen on a network port. If you need to log for other hosts
that you have defined, investigate firewalling the offending sources
of the stray syslog messages.
Password File Anomalies
There are several reasons to keep an eye on your password file.
First, you should look for and lock any accounts that have no passwords.
Every account, even ones that seem harmless (e.g., lp), should
have a password. Second, only the root user should have a user ID
(uid) of zero. Some operating systems, like IRIX, have three or
four users with uid values of zero, but most UNIX systems have only
one account, root, with a uid value of zero.
Attackers, once in, may leave themselves a non-root user account
with a user ID of zero, allowing unfettered access to system privileges
without using the root account. Because of the regular structure
of the password file (with fields separated by colons), using
the awk language to parse for anomalies is straightforward.
The following small awk script will report these potential
anomalies:
#!/usr/bin/awk -f
# parse the colon-separated password file for empty password fields
# and non-root accounts with a uid of zero; on shadowed systems the
# password field holds a placeholder, so check the shadow file as well
BEGIN { FS = ":" }
{
    if ($2 == "") {
        print "------ empty password for account " $1 }
    if (($3 ~ /^0+$/) && ($1 != "root")) {
        print "------ user has a uid of zero: " $1 }
}
If an account has an empty password, the account should be locked
immediately and a password set. If a non-root account appears with a user
ID of zero (unless it was installed by the system itself, as in the
case of IRIX systems), it should be investigated.
Exploit Usage
Exploit usage is perhaps one of the better reasons to use awk
and not simply grep. Most buffer overflow exploits leave
behind telltale signs in the logs (provided the attacker didn't
erase the logs). However, these signs are usually non-printable
ASCII characters. Normal grep usage is unable to filter these
out, but awk enjoys more diverse pattern-matching capabilities.
The script also looks for calls to /bin/sh in the syslog,
which should indicate something worth investigating.
Buffer overflow exploits, which are the primary way most attackers
gain unauthorized remote entry, use what are called "NOP"
instructions as padding. Because the attacker can't know precisely
where in memory the overflow will land, they pad the
space between the range in which they expect to land and the start
of their own code with empty processor operations,
or "NOP" calls. On the Intel x86 processor, these have
the hexadecimal value 0x90.
In the very short script below, we match against the Intel x86
NOP value in hex, looking for \x90. If it is found, a note
of the line number is made, and the timestamp and process name
(and number) are reported. The script also looks for the regular
expression bin.sh, which usually appears as /bin/sh in exploit
code, used to spawn a command shell once the exploit has succeeded
in running arbitrary code.
#!/usr/bin/awk -f
# scan syslog entries for exploit remnants; in a standard syslog
# line, $1-$3 are the timestamp and $5 is the process name and pid
{
    # \x90 is the Intel x86 NOP; gawk understands this hex escape
    if ($0 ~ /\x90/) {
        print "----------------- Possible buffer overflow at line " NR
        print "time: " $1 " " $2 " " $3 " process was " $5
    }
    # bin.sh matches /bin/sh, a shell call embedded in exploit code
    if ($0 ~ /bin.sh/) {
        print "------------- Possible call to /bin/sh at line " NR
        print "time: " $1 " " $2 " " $3 " process was " $5
    }
}
If the script does report any possible buffer overflows, you are probably
the victim of a successful break-in, and you should follow compromise
recovery procedures.
Putting It All Together
Now we will assemble all of these awk fragments into a
larger piece, useful for examining your system on a routine basis.
Putting the system log examination components together
yields a useful tool for finding problematic entries. The small
scripts that check the password file, sulog,
and Sendmail logs will remain on their own. We wrap this all
together in a small shell script, which also puts the output into
context. These are available in Listings 1-5. (All listings for
this article can be downloaded from the Sys Admin Web site:
http://www.sysadminmag.com.)
We print out detailed information about the system and then perform
our log analysis using the awk scripts. Output goes to stdout,
and you may wish to redirect it to a file. You could also mark
up the output in basic HTML (with <pre>
tags) to make it readable in a Web browser, which is useful for
scrollback and general readability.
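A minimal sketch of such a wrapper follows; the log locations, and
the name passwd.check for the password file script, are placeholders
to adjust for your system:
#!/bin/sh
# wrap the individual checks into one report on stdout
echo "log report for `hostname` on `date`"
echo "=== system logs ==="
awk -f syslog.check /var/log/messages
echo "=== su attempts ==="
awk -f sulog.check /var/adm/sulog
echo "=== Sendmail logs ==="
awk -f sendmail.check /var/log/maillog
echo "=== password file ==="
awk -f passwd.check /etc/passwd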
You can also include other small programs for checking the system,
including monitoring the network interfaces for promiscuous mode
using the CERT tool cpm, checking the local NOC facility
for reported errors, and so forth. It can quickly become a powerful
monitor of complex systems and has allowed me to detect a few problems
right away.
You will have to edit the wrapper script for your system. Included
are some commands for Linux, Solaris, IRIX, and HP-UX 10.20 for
the system output. You may also have to edit the location of your
log files or, in the case of sulog and Linux, their presence.
You will have to edit the syslog checking script to have
the value of your system's hostname, as described above when
checking for non-local syslog entries. Otherwise, every line
looks non-local.
Automating This Process
There are two ways to use this package -- either by hand at
regular intervals, as I do, or with a facility like cron
that schedules it to have a report ready for you at a specific time.
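For example, a crontab entry along these lines (the script path and
recipient are placeholders) would mail you a report every morning:
# run the log report at 6:00 each morning and mail it to the admin
0 6 * * * /usr/local/sbin/log.check 2>&1 | mail -s "log report" admin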
Two other pieces of software can also be used to check your logs
on a regular schedule and filter out interesting bits. The first
is logcheck from the Abacus Project. The second is "swatch",
short for "simple watcher". Both of them work on the same
principle -- they watch your logs and generate alerts based
on a defined ruleset. The swatch package is based on Perl; logcheck
is a C program. Both can perform actions based on matches
against the logfile output.
Logcheck is not a real-time notification engine; it is run from
cron and reads the new portions of the logs, processing them
accordingly. Swatch, however, tails the logs actively and performs
its analysis and actions in real time. Both packages are covered
under the GNU General Public License and are listed in the Resources
section.
Conclusion
While checking the logs of a UNIX system may seem daunting at first,
it is a habit administrators must develop. Text-processing tools can be
used to facilitate this process, helping you stay on top of system
events. With some fine tuning, you can produce a detailed report
of your systems using very simple recipes. Feel free to tailor these
scripts for your needs and software installations.
Resources
Dougherty, Dale, and Robbins, Arnold. sed & awk, 2nd ed.
O'Reilly & Associates. ISBN 1-56592-225-2.
awk Homepage: http://cm.bell-labs.com/cm/cs/awkbook/
TCP Wrappers: ftp://ftp.porcupine.org/pub/security/. Note
that Solaris 7 and above users should get the IPv6 version.
SWATCH: http://www.stanford.edu/~atkins/swatch/
Logcheck: http://www.psionic.com/abacus/logcheck/
Jose Nazario, a Ph.D. candidate in Biochemistry in Cleveland,
OH, has been using UNIX for about ten years, approximately six of
those as an administrator of various forms of UNIX. He's also
taken to various forms of scripting to automate tasks and system
monitoring.