Stress Testing Jabber with the Jabber Test Suite
Dustin Puryear
Jabber is an instant messaging service whose time has come. The
Jabber software relies on an open standard, which in turn depends
on XML. This allows developers and service providers to use Jabber
for any number of reasons. A common use of Jabber (http://www.jabber.org)
is as an Instant Messaging (IM) application that connects fellow
Jabber users with each other as well as with users of other IM services.
Yet, Jabber can do so much more when serving as a gateway between
custom services. (For an introduction to Jabber, see "Instant
Messaging with Jabber," by Chris Josephes, in January 2002
Sys Admin.)
As the use of Jabber increases, the need for a testing harness
has increased as well. When deploying Jabber, administrators and
developers alike need to know how well the Jabber service scales
as traffic and message size increase. I developed the Jabber Test
Suite (JabberTest) for just this reason and have since expanded
the software to evaluate a number of criteria, including delivery
rates, out-of-order delivery, and the total number of connections
a server can handle concurrently.
In this article, I will show how to install and run JabberTest.
I will also work through several test scenarios where JabberTest
is used to determine how well your Jabber service is working under
a given load.
Installing the Suite
JabberTest requires a few things. To begin, you need a working
C compiler installed on your UNIX system along with expat (http://expat.sf.net),
an XML parser library written in C. You may want to download and
install the expat library using your vendor's customized packages
(e.g., an RPM). Doing so is fine, but for completeness I will install
expat from source. (I will be using a Red Hat 7.0 workstation for
the examples in this article.)
To download expat, go to the project's SourceForge page located
at http://www.sf.net/projects/expat and download the latest
expat package. I will use expat 1.95.6, but later versions should
work fine with JabberTest. Once you have downloaded the source code,
untar the tarball, move into the source directory, run configure,
and then install the library:
# tar xvfz expat-1.95.6.tar.gz
# cd expat-1.95.6
# ./configure
# make install
We will be using the shared expat libraries. By default, expat will
install to /usr/local, so check that /usr/local/lib is listed in /etc/ld.so.conf
and then run ldconfig:
# ldconfig
Next, download and compile JabberTest. Like expat, JabberTest is hosted
on SourceForge. Download the source code at http://jabbertest.sf.net.
I will be using version 1.0.2, but later versions will work with the
examples provided here:
# tar xvfz jabber-testsuite-1.0.2.tgz
# cd testsuite
# cp Makefile.linux Makefile
# make
Notice that you need to choose the makefile for your given UNIX flavor.
I tested JabberTest under Linux, FreeBSD, SunOS, Solaris, and even
Cygwin. You can probably compile JabberTest under other systems as
well, but you may need to tweak the makefile to do it.
Once these steps are completed, JabberTest should be compiled
and ready to go. Note that there is no make install step.
The binaries will reside in the testsuite/ directory, but you can
move them to another location.
Registering Your Test Accounts
The next step is to learn how each of the tools within the suite
works. The first tool I will examine is userreg, which registers
a group of test accounts with the Jabber service. When using userreg,
and indeed all of the JabberTest tools, you must specify the hostname
used by the Jabber service. In my example, I will be using jabber.example.com,
but this choice will vary depending on where you have the server
installed:
[testsuite]$ ./userreg -h jabber.example.com -u 4
0 1047251364.889784 1047251365.252948 0.363164
1 1047251365.258430 1047251365.595498 0.337068
2 1047251365.598393 1047251365.900254 0.301861
3 1047251365.908584 1047251369.180991 3.272407
In this example, userreg will connect to jabber.example.com and then
register four test accounts named test_0, test_1, test_2, and test_3.
Notice the output of userreg, which consists of four columns: the iteration,
the registration start and stop times (in the form of seconds.microseconds),
and the difference between the two (the diff column).
When using userreg, focus on the diff column. When a Jabber server
is heavily loaded, this number will increase, and registrations that
take more than a few seconds are generally too slow. If
this is the case, then you will need to research the cause of the
slow registration process.
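As a minimal sketch of how the diff column might be checked automatically, the following Python script scans captured userreg output and lists the iterations that took too long. The 3-second threshold and the function names are assumptions made for illustration; they are not part of JabberTest.

```python
# Sketch: flag slow registrations in captured userreg output.
# SLOW_SECONDS is a hypothetical threshold, not a JabberTest setting.
SLOW_SECONDS = 3.0


def parse_diffs(text):
    """Return (iteration, diff_seconds) pairs from four-column timing lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or unexpected lines
        try:
            rows.append((int(fields[0]), float(fields[3])))
        except ValueError:
            continue  # skip lines that are not numeric
    return rows


def slow_iterations(rows, threshold=SLOW_SECONDS):
    """Return the iterations whose elapsed time exceeds the threshold."""
    return [iteration for iteration, diff in rows if diff > threshold]


# Sample taken from the userreg run shown above.
SAMPLE = """\
0 1047251364.889784 1047251365.252948 0.363164
1 1047251365.258430 1047251365.595498 0.337068
2 1047251365.598393 1047251365.900254 0.301861
3 1047251365.908584 1047251369.180991 3.272407
"""

if __name__ == "__main__":
    print(slow_iterations(parse_diffs(SAMPLE)))
```

With the sample run above, only iteration 3 (3.272407 seconds) exceeds the assumed threshold. The same parser should work on pasvlogin output, which uses the same four columns.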
Common causes of slow registrations include an overloaded server
or a server using file-based user accounts instead of a DBMS back-end.
The problem with using a file-based user account scheme with Jabber
is that directory searches tend to slow down as the number of files
in a directory increases. So if you are serving more than 10,000
or so accounts, you could experience delays.
Testing for Connection Limits
Another useful tool in the suite is pasvlogin, which logs in a
given number of users with Jabber. Connections to Jabber are silent
in that they do not produce any messages destined for a recipient.
Rather, pasvlogin will connect using a given user account and then
disconnect. You can specify the number of connections to make, and
whether you wish pasvlogin to maintain the connections until the
program is killed.
pasvlogin serves two main purposes. First, pasvlogin
can help determine the maximum theoretical connections that may
be maintained by your Jabber server. For example, if you want to
see whether your Linux-based Jabber server can maintain more than
1K connections (an old problem for administrators running Jabber
under Linux), then run pasvlogin with the following parameters:
[testsuite]$ ./pasvlogin -h jabber.example.com -u 1500
0 1047251491.814272 1047251492.699332 0.885060
1 1047251492.708603 1047251492.988734 0.280131
2 1047251492.998453 1047251493.301929 0.303476
3 1047251493.308586 1047251493.633857 0.325271
...
Of course, if you are running pasvlogin on a Linux machine, then your
testing client may hit the 1K-connection limit, so consider running
pasvlogin concurrently on multiple machines. To do this properly,
you must ensure that each copy of pasvlogin uses a unique set of accounts.
This is done using the -n and -u options, which specify
the first user account to log in and the total number of accounts
to log in, respectively. For example, to specify that accounts test_5,
test_6, and test_7 are used, use the following parameters:
[testsuite]$ ./pasvlogin -h jabber.example.com -n 5 -u 3
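When spreading a test across several client machines, the -n and -u values for each machine can be worked out with a small helper. The sketch below is only an illustration: the function names are hypothetical, and it assumes that account numbering starts at test_0 and that passing -n 0 is equivalent to omitting -n.

```python
# Sketch: divide a block of test accounts among several client machines.
# Assumes accounts are numbered from 0 (test_0, test_1, ...).


def split_accounts(total, machines, first=0):
    """Return one (first_account, count) pair per machine, with no overlap."""
    base, extra = divmod(total, machines)
    ranges = []
    start = first
    for i in range(machines):
        # The first `extra` machines take one additional account each.
        count = base + (1 if i < extra else 0)
        if count:
            ranges.append((start, count))
        start += count
    return ranges


def pasvlogin_commands(host, total, machines):
    """Build one pasvlogin command line per machine using -h, -n, and -u."""
    return [
        f"./pasvlogin -h {host} -n {first} -u {count}"
        for first, count in split_accounts(total, machines)
    ]


if __name__ == "__main__":
    for command in pasvlogin_commands("jabber.example.com", 1500, 2):
        print(command)
```

Splitting 1500 accounts over two machines this way gives the first machine accounts test_0 through test_749 and the second test_750 through test_1499.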
Second, pasvlogin lets you vary the rate at which it attempts to
establish sessions with the Jabber service. This
is especially important if you have karma enabled (as is the default
with the Jabber.org server). To specify the rate at which connections
are made, use the -x and -y parameters, which specify
a throttle rate in seconds and microseconds, respectively. For example,
to make one connection per second and to maintain
those connections until cancelled by the user, try the following:
[testsuite]$ ./pasvlogin -h jabber.example.com -u 2 -w -x 1
0 1047251610.753983 1047251611.044539 0.290556
1 1047251612.058311 1047251612.345279 0.286968
Enter quit to close sessions and quit.
Using the -x and -y parameters allows you to determine
the best values for your karma based on the typical usage patterns
of your clients. In addition, by varying these values you can determine
the connection rate at which your Jabber server becomes swamped.
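When sweeping through many throttle settings, it can help to compute the -x and -y values from a single delay expressed in seconds. The following sketch does that. The function name is hypothetical, and the assumption that -x and -y can be combined on one command line (for a delay such as 1.5 seconds) is not confirmed by the examples in this article, so verify it against your copy of the tools.

```python
# Sketch: convert a delay in seconds into -x (seconds) and -y (microseconds).
# Combining both flags for one delay is an assumption, not documented behavior.


def throttle_flags(interval_seconds):
    """Return the command-line arguments for the given delay."""
    total_us = round(interval_seconds * 1_000_000)
    if total_us < 1:
        raise ValueError("interval must be at least one microsecond")
    seconds, micros = divmod(total_us, 1_000_000)
    flags = []
    if seconds:
        flags += ["-x", str(seconds)]
    if micros:
        flags += ["-y", str(micros)]
    return flags


if __name__ == "__main__":
    for delay in (1, 0.5, 0.1):
        print(delay, " ".join(throttle_flags(delay)))
```

A delay of 1 second maps to "-x 1", as in the example above, and a delay of 0.1 second maps to "-y 100000".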
Finding Jabber Server Trouble Spots
The real power tools of the Jabber Test Suite are msgloadsnd and
msgloadrec. These tools can help send a specified number of messages
per second through your Jabber service.
The msgloadsnd tool sends the messages found in the file messages.txt
(this can be overridden using the -z parameter). The messages
file allows you to customize what messages to send so that you can
test customized Jabber components that may act on the contents of
the message. Currently, msgloadsnd will randomly select which messages
to send from the file, but in future versions you will be able to
go through the list of messages in the file linearly.
msgloadrec receives messages sent by msgloadsnd, and will use
key values implanted by msgloadsnd in the message body to print
the number of messages received, the order in which each message
is received, and the time it took to receive the message. This information
can help determine whether you are losing messages, whether messages
are delivered out-of-order, and how long Jabber is taking to actually
deliver the message to the recipient.
Begin by running a very simple load test against our Jabber server.
The msgloadrec and msgloadsnd pair uses a set of users, one to send
messages and one to receive. Begin by loading msgloadrec so that
you have a Jabber user logged in and ready to receive messages.
Do this by specifying the user account that will receive messages
(known as the to-user) using the -t parameter:
[testsuite]$ ./msgloadrec -h jabber.example.com -t test_1 -d
Also notice the use of the -d parameter, which specifies that
msgloadrec not only receive messages but also parse special timing
data in the message created by msgloadsnd so that delivery order and
times can be printed.
Next, launch msgloadsnd from another shell or another machine
entirely. Specify both the sending (user-from) and receiving (user-to)
user to msgloadsnd, using the -f and -t parameters,
respectively. In addition, specify the total number of messages
to send using -m, as shown below:
[testsuite]$ ./msgloadsnd -h jabber.example.com -f test_0 \
-t test_1 -m 256
In this example, msgloadsnd will send a total of 256 messages through
Jabber from the account "test_0@jabber.example.com" to "test_1@jabber.example.com".
Unless you have karma disabled, you may get kicked off the service
by Jabber because you are sending messages through the service as
fast as you can. (Using this kind of test with multiple msgloadrec
and msgloadsnd clients will definitely help you tweak the server operating
system parameters and Jabber so that high-load situations are well
handled.)
While the test is running, you will see output from msgloadrec
similar to the following:
0 0 1046405663.848253 1046405664.094517 0.246264
1 1 1046405663.857878 1046405664.094517 0.236639
2 2 1046405663.867901 1046405664.098372 0.230471
3 3 1046405663.877830 1046405664.098372 0.220542
The output here is very similar to the output of pasvlogin. The five
columns are as follows:
- Delivery count, which starts at zero when msgloadrec is first
loaded
- Message ID
- When the message was sent (in the form of seconds.microseconds)
- When the message was received (in the same form)
- The difference between the send time and receive time
Consider the uses of this data in real-world situations. First,
consider the delivery count column, which is the first column in
the output. msgloadrec maintains an internal counter of the number
of messages received since the process was launched. Second, imagine
that you have sent four messages through a heavily loaded Jabber
service, but the output of msgloadrec shows only:
0 0 1046405663.848253 1046405664.094517 0.246264
1 1 1046405663.857878 1046405664.094517 0.236639
2 2 1046405663.867901 1046405664.098372 0.230471
In this case, you know that Jabber has lost a message (message 3) somewhere.
If you need to guarantee message delivery at all times, then this
is an excellent test to perform while the server is under a high load.
This is a good way to catch bugs in custom transports or other customizations
of the Jabber server that behave incorrectly during high loads or
low memory conditions.
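For longer runs, spotting a gap by eye is impractical. The sketch below lists the message IDs that never arrived, given captured msgloadrec output and the number of messages sent. It assumes, as the sample output suggests, that message IDs start at 0 and increase by one per message; the function names are hypothetical.

```python
# Sketch: find message IDs missing from captured msgloadrec output.
# Assumes IDs run from 0 to sent-1, as the sample output suggests.


def received_ids(text):
    """Return the message ID (second column) of every five-column line."""
    ids = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or unexpected lines
        try:
            ids.append(int(fields[1]))
        except ValueError:
            continue
    return ids


def missing_ids(ids, sent):
    """Return the IDs in 0..sent-1 that were never received."""
    seen = set(ids)
    return [i for i in range(sent) if i not in seen]


# Sample taken from the three-line msgloadrec output shown above.
SAMPLE = """\
0 0 1046405663.848253 1046405664.094517 0.246264
1 1 1046405663.857878 1046405664.094517 0.236639
2 2 1046405663.867901 1046405664.098372 0.230471
"""

if __name__ == "__main__":
    print(missing_ids(received_ids(SAMPLE), sent=4))
```

For the output above and four messages sent, the only missing ID is 3.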
Another possibility is that the output is actually as follows:
0 0 1046405663.848250 1046405664.094510 0.246260
1 2 1046405663.867900 1046405664.094510 0.226610
2 3 1046405663.877830 1046405664.098370 0.220540
3 1 1046405663.857870 1046405664.098370 0.240500
Notice that the message ID column is no longer in ascending numerical
order. The message ID reflects the order in which messages were sent.
Because the numbers are out of order, we know that msgloadrec received
the actual messages out of order. In this case, messages 0, 2, 3,
and then 1 arrived in that order even though they were sent out in
the order of 0, 1, 2, and 3. For some applications out-of-order delivery
may be acceptable, but for others that may not be the case.
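The same check can be scripted. This sketch reports every message that arrived after a message with a higher ID had already been received. It assumes that IDs are assigned in send order; the function name is hypothetical.

```python
# Sketch: detect out-of-order delivery from the message ID column.
# Assumes message IDs are assigned in the order messages were sent.


def late_arrivals(ids):
    """Return IDs that arrived after a higher ID had already been seen."""
    late = []
    highest = -1
    for message_id in ids:
        if message_id < highest:
            late.append(message_id)
        else:
            highest = message_id
    return late


if __name__ == "__main__":
    # Arrival order from the out-of-order example above.
    print(late_arrivals([0, 2, 3, 1]))
```

For the arrival order 0, 2, 3, 1, the function reports message 1 as late; for an in-order run it returns an empty list.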
Finally, the last three columns are used when you need to know
message delivery times. The last field is a simple difference between
the message-sent and message-received fields. These values can be
used to determine the minimum, maximum, and average delivery times
for your Jabber server. You may want to load many msgloadrec and
msgloadsnd pairs on multiple machines to hammer against a Jabber
service to see how well the service scales against the load. This
is an opportune time to monitor performance of the server using
tools such as sar, vmstat, and iostat. Delivery time is an excellent
indicator of the success of tuning hardware to run large-scale Jabber
services.
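A minimal sketch of computing those minimum, maximum, and average delivery times from captured msgloadrec output follows. The function names are hypothetical, and the script relies only on the last column described above.

```python
# Sketch: summarize the delivery-time (last) column of msgloadrec output.
from statistics import mean


def delivery_times(text):
    """Return the delivery time in seconds from every five-column line."""
    times = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or unexpected lines
        try:
            times.append(float(fields[4]))
        except ValueError:
            continue
    return times


def summarize(times):
    """Return (minimum, maximum, average) delivery time."""
    return min(times), max(times), mean(times)


# Sample taken from the four-line msgloadrec output shown earlier.
SAMPLE = """\
0 0 1046405663.848253 1046405664.094517 0.246264
1 1 1046405663.857878 1046405664.094517 0.236639
2 2 1046405663.867901 1046405664.098372 0.230471
3 3 1046405663.877830 1046405664.098372 0.220542
"""

if __name__ == "__main__":
    low, high, average = summarize(delivery_times(SAMPLE))
    print(f"min={low:.6f} max={high:.6f} avg={average:.6f}")
```

Running the same summary before and after a tuning change gives a simple way to compare the two configurations.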
When actually tuning your Jabber server, another useful parameter
is -l, which specifies that msgloadsnd send a steady stream
of messages through Jabber until msgloadsnd is killed (typically
with Control-C). When used in conjunction with -x or -y,
which specify the delay between messages in seconds or microseconds, respectively,
this lets you fine-tune how many messages to send through Jabber
at any given time:
[testsuite]$ ./msgloadsnd -h jabber.example.com -f test_0 \
-t test_1 -l -y 100000
Here we are sending a constant flow of messages to user "test_1@jabber.example.com"
at the rate of 1 message every 1/10 of a second. Of course, the rate
at which you specify msgloadsnd to send a message may not actually
be the rate at which messages are sent. Your process may be preempted
often enough that it can't even send messages that frequently.
Thus, the only way to actually check how often messages are being
sent is to review the message send time column in the msgloadrec output
(indeed, that is one of the main reasons why those columns exist).
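One way to make that check concrete is to compute the gaps between consecutive send times. The sketch below uses Python's Decimal type rather than floats, because a float cannot hold a ten-digit epoch value with full microsecond precision. The function names are hypothetical.

```python
# Sketch: measure the actual interval between sends from msgloadrec output.
from decimal import Decimal, InvalidOperation


def send_times(text):
    """Return the send timestamps (third column), sorted ascending."""
    times = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or unexpected lines
        try:
            times.append(Decimal(fields[2]))
        except InvalidOperation:
            continue
    # Sort because messages may have been received out of order.
    return sorted(times)


def send_gaps(times):
    """Return the gap in seconds between each pair of consecutive sends."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


# Sample taken from the four-line msgloadrec output shown earlier.
SAMPLE = """\
0 0 1046405663.848253 1046405664.094517 0.246264
1 1 1046405663.857878 1046405664.094517 0.236639
2 2 1046405663.867901 1046405664.098372 0.230471
3 3 1046405663.877830 1046405664.098372 0.220542
"""

if __name__ == "__main__":
    gaps = send_gaps(send_times(SAMPLE))
    print("average gap:", sum(gaps) / len(gaps))
```

Comparing the measured gaps with the delay requested through -x or -y shows whether the sending machine is keeping up.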
Learning More About Your Jabber Service
In this article, I have described the Jabber Test Suite and shown
how the suite can be used to learn more about your own Jabber service.
The goal of the suite is to allow administrators access to the kind
of information required to tune their servers, and to ensure any
client requirements are maintained (e.g., delivery order). There
are other parameters available for use with the tools that are worth
exploring. Additionally, you can use the tools in various combinations
to create new ways to test your service, and you can further fine-tune
the messages to be sent by modifying the messages file used by msgloadsnd.
Feel free to email me with any questions, comments, bug reports,
or feature suggestions.
Dustin Puryear is author of Integrate Linux Solutions into
Your Windows Network (Premier Press), has written numerous technical
articles, is a conference speaker, and continues to push the envelope
in UNIX and Windows integration as principal consultant of Puryear
Information Technology. He can be contacted at: dustin@puryear-it.com.