Autosniff -- A Sniffer-Starting Daemon
Ed Ravin
A network sniffer, a program that captures data on a network connection,
is a wonderful tool for solving network problems. The only thing
wrong with a sniffer is that someone has to start it before the
problem happens, and stop it after the problem has been reproduced.
This isn't an issue if the person with the network problem
(i.e., the "customer") is in the same office as you
or can call you at your desk when they're ready to test.
In a support environment, however, your customers might be
on the other side of the world, working while you're asleep
or out of the office. Or, they might be using dial-up connections
that have a different IP address every time they call, making it
hard to focus on their traffic until they're actually dialed
in.
One way to capture the network activity in these situations
is to start the sniffer and have it capture an entire class of traffic
(e.g., all mail traffic). Then, when the customer tells you
that they've reproduced the problem, you stop the sniffer and
sift through the trace file looking for the customer's traffic.
This procedure is fraught with danger -- for example, you
might capture too much data and run out of disk space, interfering
with other programs on the sniffer machine. When you do find the
customer's data, you might have ten packets showing the problem
and ten thousand packets that are the customer's normal traffic,
with no easy way to tell which is which. And then, there's
the issue of privacy -- unless you can zoom in on one customer's
traffic, and only the traffic that constitutes their network problem,
you will end up sifting through data that you don't want to
see, like your customer's passwords, mail, and other private
information.
This article describes a set of shell scripts that will let your
customer start the sniffer, record a trace of the problem, and automatically
mail you the results when the data capture is complete.
Autosniff listens on your network for a "trigger event",
which is a packet that is unlikely to appear in normal use. After
the trigger event occurs, Autosniff starts the sniffer for a specific
time interval, listening for packets from the host that sent the
trigger (presumably the customer). When the time interval is over,
Autosniff stops the trace and sends email to remind you to pick
up the trace file.
Generating the trigger event presents a bit of a challenge: it
needs to be something simple enough that it can be generated by
a non-technical user, on any operating system, and with simple instructions
from the sys admin. However, the trigger event also must be reasonably
unique, something that doesn't usually appear on the network.
I chose to have the customer open a connection to an unlikely port
number on the destination host as the trigger event.
For example, suppose I am a network administrator at the "example.com"
company and my customer is having trouble using the server "mail.example.com".
Let's further suppose the customer is using a Windows or Mac
program that supplies cryptic error messages that are of little
or no help in diagnosing the problem. I would rather have a network
trace to work with, so after starting Autosniff, I give the customer
these instructions:
1. At the command line, type in "telnet mail.example.com
64000", or give your Web browser the URL "http://mail.example.com:64000".
2. You'll see an error message of some kind. Just ignore
it.
3. Reproduce the problem with your mail program.
4. For the next five minutes, refrain from doing anything that
might contact mail.example.com, so we don't accidentally capture
normal mail or other irrelevant information.
Any user who has either a Web browser or a telnet client should
be able to generate the trigger event. And unless mail.example.com
is getting a full portscan every day, chances are that no one else
will try to open up a connection to that host on port 64000. Nevertheless,
the port number must be chosen very carefully -- make sure that
you have no services running on that port and that it is not commonly
scanned by those random probes that every Internet host receives
on a daily basis.
Autosniff was designed to be sys admin friendly. Here's how
the "example.com" sys admin might invoke it for a mail
server problem:
# autosniff
Enter a short alphanumeric name for this job: JohnDoe
Enter an (optional) description: user can't authenticate to sendmail
Hostname that customer will use to trigger autosniff: mail.example.com
Port number to use for trigger: 64000
Capture filter to use once triggered [ip]: port 25
Number of minutes to run sniffer after trigger [5]:
Mail address for notifications [root]: helpdesk@example.com
Some of Autosniff's questions have default answers (shown inside
the square brackets); press Return on a blank line to accept the default.
The job name and description identify this particular job: you
might have multiple Autosniff jobs running at the same time, and you'll
need to be able to tell them apart. This information is included
in every email Autosniff sends.
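The prompt-with-default behavior is easy to sketch in shell. The following is a minimal illustration rather than the code from the Autosniff listings; the function name ask and the variable MINUTES are hypothetical.

```shell
#!/bin/sh
# ask PROMPT [DEFAULT]  (hypothetical helper, not taken from Autosniff)
# Print PROMPT, with DEFAULT in square brackets if one is given, read one
# line from standard input, and echo the reply. An empty reply yields DEFAULT.
# The prompt goes to stderr so that $(ask ...) captures only the answer.
ask() {
    prompt=$1
    default=${2-}
    if [ -n "$default" ]; then
        printf '%s [%s]: ' "$prompt" "$default" >&2
    else
        printf '%s: ' "$prompt" >&2
    fi
    reply=
    IFS= read -r reply || :
    if [ -z "$reply" ]; then
        reply=$default
    fi
    printf '%s\n' "$reply"
}

# Example:
#   MINUTES=$(ask "Number of minutes to run sniffer after trigger" 5)
```

Sending the prompt to standard error is a design choice that lets the caller capture the answer with command substitution while the question still appears on the terminal.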
As described previously, you'll need to pick a trigger hostname
and port number. Usually, you'll use the same host that the
customer will be connecting to and an oddball port number that is
not in use. If you have another Autosniff job waiting to be triggered,
don't use the same port number again, because the first customer
who calls in will trigger both jobs simultaneously.
You also need to pick a capture filter for focusing on the customer's
traffic after the trigger event occurs. The default filter is "ip",
that is, all IP traffic to and from the IP address that sends the
trigger event. In this example (because we're trying to solve
a Sendmail problem), we limit the traffic to port 25, the port that
Sendmail listens on.
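One way the customer's address and your capture filter might be combined is sketched below. This is an assumption about the general shape of the expression, not the exact string Autosniff builds, and the function name build_capture_filter is hypothetical. The parentheses keep an "or" inside your filter from escaping the host restriction.

```shell
#!/bin/sh
# build_capture_filter CUSTOMER_IP USER_FILTER  (hypothetical helper)
# Echo a tcpdump filter expression that limits USER_FILTER to traffic
# to or from CUSTOMER_IP.
build_capture_filter() {
    printf 'host %s and (%s)\n' "$1" "$2"
}

# Example (192.0.2.10 is a documentation address used as a placeholder):
#   build_capture_filter 192.0.2.10 'port 25'
# The result must be passed to tcpdump as a single quoted argument, because
# the shell would otherwise try to interpret the parentheses.
```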
After you've typed in the trigger information and the capture
filter, Autosniff will perform a "sanity check" to catch
any syntax errors or bad hostnames. If an error is found, you will
be asked to re-enter all the network-related information. Finally,
you will be asked how long to run the sniffer once it is triggered
and to which email address the notifications should be sent.
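One piece of such a check, validating the trigger port number, might look like the sketch below. It uses a hypothetical function name and is not Autosniff's own code; checking the hostname and the filter syntax would need further steps not shown here.

```shell
#!/bin/sh
# valid_port PORT  (hypothetical helper)
# Succeed if PORT is a decimal integer from 1 to 65535, fail otherwise.
valid_port() {
    case $1 in
        # Reject the empty string and anything containing a non-digit.
        ''|*[!0-9]*) return 1 ;;
    esac
    # Reject overlong digit strings first so the numeric tests cannot overflow.
    [ "${#1}" -le 5 ] && [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Example:
#   if valid_port "$PORT"; then echo "port ok"; else echo "bad port"; fi
```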
Once Autosniff is running, it will send you an email to let you
know it is waiting for a trigger event. The email will also contain
a reminder of what information to send the customer. Note that there
is no timeout when waiting for a trigger -- Autosniff will keep
running until you manually kill it or reboot the machine.
After Autosniff is finished, it will send another email describing
what happened (i.e., whether any data was captured) and the name
of the file that has the results (the packet capture file).
Installing Autosniff
Autosniff should run "out of the box" on any Solaris
or Linux host, and with only minor tweaking on other UNIX platforms
that have tcpdump installed. You can download Autosniff from:
http://www.samag.com/code/
After you've unpacked the Autosniff tarball, you'll find
three files (see Listings 1-3):
autosniff
autosniffd
autosniff.conf
To install, put "autosniff" and "autosniffd" somewhere
in your path, typically /usr/local/sbin. Then, copy "autosniff.conf"
to the /etc directory and customize it as needed for your local site.
There are only a few items at the top of the file to worry about,
all marked with the comment "###CUSTOMIZE###" in the file:
- The DEFAULTMAIL variable -- The default choice for where
to send Autosniff's notifications.
- The AUTOSNIFFD variable -- This contains the path to autosniffd.
If you installed both autosniff and autosniffd in the /usr/local/sbin
directory, then you can leave this alone.
- The IPARG variable -- If autosniff complains of a "trigger
IP address failure", it is probably because the output of
tcpdump varies between versions, and autosniff was unable to find
the field in tcpdump's output that has the source IP address.
To fix this, run this quick test, which will sniff one IP packet:
# tcpdump -n -c 1 ip
and note which space-delimited field in the tcpdump output contains
the source IP address, then adjust IPARG to match. In the versions of
tcpdump I've tested, it has been either the fourth space-delimited
field (on Linux) or the second (on NetBSD).
- The TCPDUMP_OVERRIDE variable -- Solaris users who want
to use tcpdump (rather than snoop, the sniffer program that is
supplied with Solaris) should set this to the path of the tcpdump
command. You might want to do this if you prefer your packet capture
files to be in the pcap (tcpdump) format instead of the Solaris
snoop format.
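To illustrate what the IPARG setting controls, here is a minimal sketch of pulling the source address out of one line of tcpdump text output, given the field number. The function name extract_src_ip is hypothetical, the sketch handles IPv4 only, and the sample line in the comment is made up in the style of a recent tcpdump, where the address happens to be the third field.

```shell
#!/bin/sh
# extract_src_ip FIELD  (hypothetical helper)
# Read tcpdump -n text output on standard input and echo the IPv4 address
# found in space-delimited field number FIELD of the first line. tcpdump
# appends the port as a fifth dotted component, which is dropped here.
extract_src_ip() {
    awk -v field="$1" '
        NR == 1 {
            # Split e.g. "192.0.2.10.51234" on dots; keep the first four parts.
            n = split($field, part, ".")
            if (n >= 4)
                print part[1] "." part[2] "." part[3] "." part[4]
            exit
        }'
}

# Example with an illustrative line (address is in field 3 here):
#   echo '12:00:00.000001 IP 192.0.2.10.51234 > 198.51.100.5.64000: Flags [S]' |
#       extract_src_ip 3
```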
Which Host Should Run Autosniff?
Always run Autosniff on a host that is in a position to capture
the traffic between the customer and
the host(s) with which the customer is communicating. If the problem
is limited to one UNIX machine, then the easiest method is to run
Autosniff on that machine.
If the problem affects multiple machines, or if you're not
sure which machine the customer will be communicating with, you
can still use Autosniff. However, you must ensure that the machine
on which Autosniff runs can "see" all the machines on
the network that you are looking for. Before switched networks became
common, this was never an issue -- all traffic on any particular
LAN segment was visible to every computer on the segment. With modern
switches, however, a host only sees traffic that is specifically
destined for that host. Most mid-range and high-end network switches
support a "monitor port", where you program the switch
to "mirror" packets from other ports (or perhaps all ports)
onto a particular port. You will need to be intimately familiar
with your network gear to take full advantage of this feature.
How Autosniff Works
Autosniff has two parts, both shell scripts: a user interface
program and a daemon. The user interface prompts you
for the information needed to start the capture, double-checks your
input, and then starts the daemon with the "nohup" command
so it will continue to run after you've logged out.
The daemon first runs tcpdump (or snoop if you're on a Solaris
host) with a filter expression for the trigger event (like "host
mail.example.com and port 64000") and the "-c 1"
option, which tells tcpdump to only capture one packet and then
exit. When tcpdump exits, Autosniff tries to read the capture file
left behind. If all is well, the file will contain one packet carrying
the source IP address currently in use by your customer.
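The trigger phase described above can be sketched as follows. This is a minimal illustration and not the autosniffd listing; the function names trigger_filter and wait_for_trigger are hypothetical, and only the tcpdump variant is shown (snoop takes different options).

```shell
#!/bin/sh
# trigger_filter HOST PORT  (hypothetical helper)
# Echo the filter expression that matches the trigger event.
trigger_filter() {
    printf 'host %s and port %s\n' "$1" "$2"
}

# wait_for_trigger HOST PORT FILE  (hypothetical helper)
# Block until tcpdump has seen one matching packet and saved it to FILE.
# Needs enough privilege to capture packets (typically root).
#   -n  do not resolve addresses to names
#   -c  exit after this many packets
#   -w  write raw packets to FILE
wait_for_trigger() {
    tcpdump -n -c 1 -w "$3" "$(trigger_filter "$1" "$2")"
}

# Example (file name is a placeholder):
#   wait_for_trigger mail.example.com 64000 /tmp/trigger.pcap
#   tcpdump -n -r /tmp/trigger.pcap     # print the saved packet as text
```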
Then, armed with the customer's IP address, Autosniff starts
tcpdump again, with a new filter expression formed from the customer's
IP address and the filter that you specified for focusing on the
customer's problem. Autosniff puts the tcpdump job in the background
and then sleeps for five minutes (or whatever timeout you've
given it). When the sleep is over, Autosniff kills the tcpdump process,
makes sure there's some data in the sniff file, and mails you
the results.
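The background-then-sleep-then-kill pattern can be sketched in a few lines of shell. This is a generic illustration with a hypothetical function name, run_for, rather than the code from autosniffd; it works with any command, which also makes it easy to try out without capture privileges.

```shell
#!/bin/sh
# run_for SECONDS COMMAND [ARG...]  (hypothetical helper)
# Start COMMAND in the background, let it run for SECONDS, then send it
# the default TERM signal and wait for it to exit.
run_for() {
    seconds=$1
    shift
    "$@" &
    pid=$!
    sleep "$seconds"
    # The command may already have exited on its own; ignore that case.
    kill "$pid" 2>/dev/null || :
    wait "$pid" 2>/dev/null || :
    return 0
}

# Example (needs capture privileges; address and file name are placeholders):
#   run_for 300 tcpdump -n -w /tmp/JohnDoe.pcap 'host 192.0.2.10 and (port 25)'
```

Waiting for the process after killing it gives tcpdump a chance to finish writing the capture file before anything tries to read it.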
Conclusion
At my shop, Autosniff has taken much of the pain out of using
a sniffer to diagnose customers' networking problems. The simple
interface makes it possible for most of our support staff to start
the sniffer and later call in a more experienced technician to
analyze the packet capture file.
References
Ethereal is a high-quality sniffer and packet display program
-- you'll find it useful for decoding files captured by
Autosniff: http://www.ethereal.com.
tcpdump is supplied with many operating systems (such as Linux
and *BSD systems), but you should make sure you have the latest
version because of the occasional security bug: http://www.tcpdump.org.
Ed Ravin has been helping computers talk to each other for
the past 15 years or so. Currently, he is a systems administrator
at Panix, a small but friendly Internet service provider in New
York City that caters to shell users and other technically savvy
customers. Ed is also the co-author of Using and Managing UUCP,
published by O'Reilly & Associates. He can be reached at:
eravin@panix.com.