Checking Your System Logs with awk
Jose Nazario
UNIX systems are especially talkative and log considerable amounts
of data. Many administrators at first find digging through all those
logs annoying, and some abandon the practice of checking logs for
that reason. However, when system problems arise, those admins are
left wondering what occurred and why. Because there is so much data
to sift through on a typical UNIX system, you need efficient methods
to make sense of it all and keep a watchful eye on your system.
My tool of choice for this task is the awk language.
Originally, I used grep in a rather unwieldy shell script,
and I didn't want to port it to Perl. I found that awk
provided a bit more flexibility than my sometimes convoluted shell
script, worked very well with the ordered structure of log files,
and had better regular expression handling than grep. I will
show several notification items that can be readily picked out,
and then put them together in an awk script that parses log files
quickly.
While most administrators are familiar with grep, fewer
have become as familiar with awk, instead favoring Perl.
awk has a number of advantages over grep, and even
a few over Perl.
- One awk script can run faster than a shell script that
calls grep multiple times. When processing large files,
this time difference can become noticeable. awk also compares
favorably with Perl for speed.
- awk can handle hex values embedded in regular ASCII
text, such as remnants of exploits against a machine. grep's
regular expressions cannot match non-printable characters. Perl
and awk are similar in this respect.
- Log files usually have regular fields, such as hostnames and
keywords. awk can match on the basis of these fields, rather
than matching anywhere in a line, which allows for finer grained
matching (see the short example after this list). The field
separator can be altered as needed.
- awk maintains internal counters, such as the
number of records processed, and lets you keep your own variables,
such as the number of hits, which is useful for logfile summaries.
While Perl can do this, grep cannot by itself.
- Convoluted shell scripts built around grep can be hard to maintain
or modify, and Perl has a steep learning curve. awk, by
contrast, can be complex without being complicated, and has a
gentler learning curve for most administrators.
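For instance, to report only messages from a given process, you can
match on the process-name field of a syslog line rather than grepping
the whole line. A minimal sketch (field positions assume the standard
timestamp-hostname-process syslog layout):
# print only sshd messages by matching field 5, the process name,
# instead of matching anywhere in the line
awk '$5 ~ /^sshd/ { print }' /var/log/messages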
These scripts were developed on a Red Hat Linux system on an Intel
processor, but notes will be made about IRIX 6.5, Solaris 7, and HP-UX 10.20
systems. I will use regular POSIX awk, so the scripts should work
on all modern UNIX systems, which usually provide either nawk
(Brian Kernighan's awk) or gawk, from the GNU project.
Also, I tend to make my logs readable by a privileged group, usually
the "wheel" group, and make sure that I am a member of
that group. This minimizes my time as the root user, and thereby the
potential damage I can do. Note, however, that only root can write to
these logs under usual circumstances.
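On a Linux system, that arrangement looks something like the following
(adjust the file names for your system):
# make the log readable by the wheel group, writable only by root
chgrp wheel /var/log/messages
chmod 640 /var/log/messages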
Where are the Log Files?
The first step in analyzing your system logs is making sure you
know where they are. The /var file system is where logging
and other runtime data usually go, but some systems use /var/log,
some use /var/adm, and still others use /var/log/syslog.
The syslog daemon reads the file /etc/syslog.conf to direct
the syslog mechanism: it uses facilities and logging levels
to determine where entries arriving on the /dev/log socket are
placed. I won't go into syslog mechanics in this article. I will
be using most of the log files specified in the syslog
configuration file.
Basic awk Programming
If you're not familiar with the awk programming model,
this will serve as a brief introduction; it isn't
a tutorial on awk programming. The Resources section provides
awk references to consult for further information.
The basic awk model is very simple: a program has a BEGIN
block, a main loop that is executed for every line, or record,
of input, and an END block. We will be constructing very basic but
quite powerful log analysis tools using awk, along with
some basic regular expression usage. With that, let's
begin designing and constructing our scripts. You may find that
you have to modify them a bit; however, they worked on the
systems I tested them on and suited my needs.
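As a quick illustration of this model (a minimal sketch, not one of
the checking scripts; the script name and log file are just examples):
#!/usr/bin/awk -f
# count the sshd entries in a log; illustrates the BEGIN block,
# the per-record main loop, and the END block
BEGIN { count = 0; print "scanning for sshd entries..." }
/sshd/ { count++ }
END { print count " sshd entries in " NR " records" }
Saved as count.awk and made executable, it would run as, for example,
count.awk /var/log/messages.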
Failed Login Attempts
Sometimes would-be attackers aren't too bright and hope to
come right in the front door by attempting a login. They may either
have stolen passwords or be attempting to brute force their way
in by guessing a common password and user name combination. These
should stand out in any syslog and be investigated if the
threat requires it. The system usually complains pretty loudly when
this occurs, so you should definitely make note of it.
If you are using the TCP Wrappers package from Wietse Venema
and you are doing service blocking, the included script will note
the blocked connections. TCP Wrappers is an especially useful tool
for blocking TCP services in the absence of a firewall. Note that all
blocked attempts on services protected by TCP Wrappers will be logged,
including ftp if it is so wrapped.
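A minimal sketch of this kind of check follows; the exact message text
varies by system, and these patterns assume Linux-style login and
tcpd messages:
#!/usr/bin/awk -f
# report failed logins and connections refused by TCP Wrappers
BEGIN { print "checking for failed logins and refused connections" }
/FAILED LOGIN/ { failed++; print $0 }
/refused connect from/ { refused++; print $0 }
END { print failed + 0 " failed logins, " refused + 0 " refused connections" }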
Bear in mind that these login attempts may be from legitimate
system users who are away and want to connect to the system or users
that simply mistyped their passwords. Knowing your users and their
patterns will help you understand the situation.
Failed su Attempts
Three of the system types used in the development of this article,
IRIX 6.5, Solaris (2.6 and 8), and HP-UX 10.20, all keep a record
of su attempts. This can be a handy single place in which
to keep the data. The file has a simple format, logging the
date and time, the success or failure of the su action (noted with
a + or -, respectively), and the intended source and target accounts.
We have a small script, sulog.check, that can quickly scan
this file and report failed su attempts.
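A minimal sketch of the idea (field positions assume the
Solaris-style sulog format described above):
#!/usr/bin/awk -f
# report failed su attempts; in the sulog format
# SU date time +|- tty source-target, field 4 marks success or failure
BEGIN { print "checking sulog for failed su attempts" }
$4 == "-" { failed++; print $0 }
END { print failed + 0 " failed su attempts" }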
Linux, it appears, has no such file, but su data is kept
in the system log files. Our syslog.check script will look
there, as well.
The Mail Logs
The regular structure of Sendmail logs can also be used to filter
information out of an otherwise healthy and normal mail server. Normally,
we don't want to be a point of abuse for spam sources, and we
want to keep our privacy and security on the server by disabling
some Sendmail commands. Furthermore, mail connectivity problems
can be easily found by using log-sifting routines like those outlined
below. These can all be checked for in the mail log examination
scripts.
Since version 8.9, the Sendmail daemon has come preconfigured
with anti-relaying features to help reduce the amount of spam sent
through mail servers running Sendmail. The default response to a
denied relaying attempt is SMTP error 550, a permanent fatal error
for that message.
Network connectivity issues can also be monitored by watching
the Sendmail logs. If a message fails delivery due to a transient
problem, say the remote SMTP server is down, Sendmail will queue
it and note that it was deferred. This can help administrators track
why mail sometimes doesn't get through, helping to diagnose
if it's a problem on the other end.
Three Sendmail commands that were once innocuous but have now
become popular with attackers are the DEBUG, VRFY, and
EXPN commands (or debug, verify, and expansion).
VRFY and EXPN allow the verification of a destination mail address,
or its expansion if it is an alias. However, due to privacy concerns,
these are usually turned off in Sendmail configurations. Debug
mode was popular against older Sendmail versions, in which it dropped
several checks and could be used for remote administrative control of
a system. Null connections (ones that have not completely
negotiated a connection with the mail server) are also worth noting
because they're indicative of a problem with one of the two
ends of a mail connection, or of a system scan that includes the
SMTP service.
The attached script, sendmail.check, looks for all of these
things in your Sendmail logs. It makes a note of the problem, prints
out the log entry, and tallies a statistic; the summary can
also be monitored for trends. The script is also fast: tested
on a Pentium II/266 machine against 104,000 lines of input,
it completed in less than two seconds.
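The full script is in the listings; a minimal sketch of the kinds of
patterns involved might look like the following (the exact message
text varies between Sendmail versions, so treat these as starting
points):
#!/usr/bin/awk -f
# flag suspicious or problematic Sendmail log entries
BEGIN { print "checking Sendmail logs" }
/[Rr]elaying denied/ { relay++; print $0 }   # blocked relay attempts
/stat=[Dd]eferred/ { defer++; print $0 }     # transient delivery failures
/VRFY|EXPN|DEBUG/ { probe++; print $0 }      # address probes, debug attempts
/[Nn]ull connection/ { null++; print $0 }    # incomplete SMTP sessions
END {
    print relay + 0 " denied relays, " defer + 0 " deferrals"
    print probe + 0 " probes, " null + 0 " null connections"
}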
Non-Local syslog Entries
The syslog facility has the capacity to listen on a network
port (514/UDP) and receive entries from other hosts. This can be
helpful if you have many machines to manage and want to centralize
logging for evaluation. However, you may receive syslog entries
from hosts that you do not want entries from. Some devices, usually
network devices capable of using the syslog facility over
the network, come preconfigured to send syslog messages to
the broadcast address, 255.255.255.255. Depending on your logging
file system capacity and how often you rotate logs, this could lead
to a flooded file system.
One such system that, by default, receives syslog entries
from other hosts is IRIX 6.5. This can lead to misleading log files
or a denial of service by filling the logging disk device. Detecting
these entries is usually pretty easy. The format of syslog entries
(simply a timestamp, the source hostname, the facility, and the
message) simplifies finding entries that do not belong to you. A
simple awk script will find and report these spurious entries:
#!/usr/bin/awk -f
# $4 is the source hostname in a standard syslog line; replace
# $HOSTNAME with this machine's short hostname before running
BEGIN { print "Checking for non-local syslog entries" }
{ if ($4 != "$HOSTNAME") print $0 }
Be sure to replace the macro $HOSTNAME with your system's
short hostname (e.g., yoda), but keep the double quotes because they're
vital to the script. If you do detect any entries that should not
appear, investigate the source or set up the syslog daemon
to not listen on a network port. If you need to log for other hosts
that you have defined, investigate firewalling the offending sources
of the stray syslog messages.
Password File Anomalies
There are several reasons to keep an eye on your password file.
First, you should look for and lock any accounts that have no passwords.
Every account, even ones that seem harmless (e.g., lp), should
have a password. Second, only the root user should have a user ID
(uid) of zero. Some operating systems, like IRIX, have three or
four users with uid values of zero, but most UNIX systems have only
one account, root, with a uid value of zero.
Attackers, once in, may leave themselves a non-root user account
with a user ID of zero, allowing unfettered access to system privileges
without using the root account. Because of the regular structure
of the password file (with fields separated by colons), using
the awk language to parse for anomalies is straightforward.
The following small awk script will report these potential
anomalies:
#!/usr/bin/awk -f
# parse the colon-separated password file for empty password fields
# and non-root accounts with a uid of zero; on shadowed systems the
# password field holds a placeholder, so check the shadow file as well
BEGIN { FS = ":" }
{
    if ($2 == "") {
        print "------ empty password for account " $1 }
    if (($3 ~ /^0+$/) && ($1 != "root")) {
        print "------ user has a uid of zero: " $1 }
}
If an account has an empty password, the account should be locked
immediately and a password set. If a non-root account appears with a user
ID of zero (unless it was installed by the system itself, as in the
case of IRIX systems), it should be investigated.
Exploit Usage
Exploit usage is perhaps one of the better reasons to use awk
and not simply grep. Most buffer overflow exploits leave
behind telltale signs in the logs (provided the attacker didn't
erase the logs). However, these signs are usually non-printable
ASCII characters. Normal grep usage is unable to filter these
out, but awk enjoys more diverse pattern-matching capabilities.
The script also looks for calls to /bin/sh in the syslog,
which should indicate something worth investigating.
Buffer overflow exploits, which are the primary way most attackers
gain unauthorized remote entry, use what are called "NOP"
instructions as padding. Because the attacker can't know precisely
where in memory the overflow will land, they pad the
space between the range in which they expect to land and the start
of their own code with empty processor operations,
or "NOP" calls. On the Intel x86 processor, these have
the hexadecimal value 0x90.
In the very short script below, we match against the Intel x86
NOP value in hex, looking for \x90. If it is found, a note
of the line number is made, and the timestamp and process name
(and number) are reported. The script also looks for the regular
expression bin.sh, which usually appears as /bin/sh in exploit
code, used to spawn a command shell once the exploit has succeeded
in running arbitrary code.
#!/usr/bin/awk -f
# scan syslog entries for exploit remnants; in a standard syslog
# line, $1-$3 are the timestamp and $5 is the process name and pid
{
    # \x90 is the Intel x86 NOP; gawk understands this hex escape
    if ($0 ~ /\x90/) {
        print "----------------- Possible buffer overflow at line " NR
        print "time: " $1 " " $2 " " $3 " process was " $5
    }
    # bin.sh matches /bin/sh, a shell call embedded in exploit code
    if ($0 ~ /bin.sh/) {
        print "------------- Possible call to /bin/sh at line " NR
        print "time: " $1 " " $2 " " $3 " process was " $5
    }
}
If the script does report any possible buffer overflows, you are probably
the victim of a successful break-in, and you should follow compromise
recovery procedures.
Putting It All Together
Now we will assemble all of these awk fragments into a
larger piece, useful for examining your system on a routine basis.
Putting the system log examination components together
yields a useful tool for finding problematic entries. The small
scripts that check the password file, sulog,
and Sendmail logs will remain on their own. We wrap this all
together in a small shell script, which also puts the output into
context. These are available in Listings 1-5. (All listings for
this article can be downloaded from the Sys Admin Web site:
http://www.sysadminmag.com.)
We print out detailed information about the system and then perform
our log analysis using the awk scripts. Output goes to stdout,
and you may wish to redirect it to a file. You could also mark
up the output in basic HTML (with <pre>
tags) to make it readable in a Web browser, which is useful for
scrollback and general readability.
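A minimal sketch of such a wrapper follows; the log locations, and
the name passwd.check for the password file script, are placeholders
to adjust for your system:
#!/bin/sh
# wrap the individual checks into one report on stdout
echo "log report for `hostname` on `date`"
echo "=== system logs ==="
awk -f syslog.check /var/log/messages
echo "=== su attempts ==="
awk -f sulog.check /var/adm/sulog
echo "=== Sendmail logs ==="
awk -f sendmail.check /var/log/maillog
echo "=== password file ==="
awk -f passwd.check /etc/passwd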
You can also include other small programs for checking the system,
including monitoring the network interfaces for promiscuous mode
using the CERT tool cpm, checking the local NOC facility
for reported errors, and so forth. It can quickly become a powerful
monitor of complex systems and has allowed me to detect a few problems
right away.
You will have to edit the wrapper script for your system. Included
are some commands for Linux, Solaris, IRIX, and HP-UX 10.20 for
the system output. You may also have to edit the location of your
log files or, in the case of sulog and Linux, their presence.
You will have to edit the syslog checking script to have
the value of your system's hostname, as described above when
checking for non-local syslog entries. Otherwise, every line
looks non-local.
Automating This Process
There are two ways to use this package -- either by hand at
regular intervals, as I do, or with a facility like cron
that schedules it to have a report ready for you at a specific time.
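For example, a crontab entry along these lines (the script path and
recipient are placeholders) would mail you a report every morning:
# run the log report at 6:00 each morning and mail it to the admin
0 6 * * * /usr/local/sbin/log.check 2>&1 | mail -s "log report" admin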
Two other pieces of software can also be used to check your logs
on a regular schedule and filter out interesting bits. The first
is logcheck from the Abacus Project. The second is "swatch",
short for "simple watcher". Both of them work on the same
principle -- they watch your logs and generate alerts based
on a defined ruleset. The swatch package is based on Perl; logcheck
is a C program. Both can perform actions based on matches
against the logfile output.
Logcheck is not a real-time notification engine; it is run from
cron and reads the new portions of the logs, processing them
accordingly. Swatch, however, tails the logs actively and performs
its analysis and actions in real time. Both packages are covered
under the GNU General Public License and are listed in the Resources
section.
Conclusion
While checking the logs of a UNIX system may seem daunting at first,
it is a habit administrators must develop. Text-processing tools can be
used to facilitate this process, helping you stay on top of system
events. With some fine tuning, you can produce a detailed report
of your systems using very simple recipes. Feel free to tailor these
scripts for your needs and software installations.
Resources
Dougherty, Dale, and Robbins, Arnold. sed & awk, 2nd ed.
O'Reilly & Associates. ISBN 1-56592-225-2.
awk Homepage: http://cm.bell-labs.com/cm/cs/awkbook/
TCP Wrappers: ftp://ftp.porcupine.org/pub/security/. Note
that Solaris 7 and above users should get the IPv6 version.
SWATCH: http://www.stanford.edu/~atkins/swatch/
Logcheck: http://www.psionic.com/abacus/logcheck/
Jose Nazario, a Ph.D. candidate in Biochemistry in Cleveland,
OH, has been using UNIX for about ten years, approximately six of
those as an administrator of various forms of UNIX. He's also
taken to various forms of scripting to automate tasks and system
monitoring.