Unifying Web Clusters with Spread
John David Duncan
The most cost-effective way to scale up a Web application is to
move it from a single Web server to a cluster of inexpensive servers.
However, moving an application from one server to many can pose
some challenges for systems administrators. It might be easy to
analyze log files or monitor traffic on a single server, for example,
but it's not nearly as simple to gather complete and meaningful
information from multiple log files scattered across many servers.
The standard syslog utility, useful for distributing many other
sorts of logging operations, turns out to be less than ideal for
managing Web logs due to a variety of design decisions, including
its use of an unreliable network protocol (UDP) for transporting
data.
The Spread Toolkit
While searching for an open source solution to this problem, I
found Spread, a network group messaging toolkit developed at Johns
Hopkins University that is available from http://www.spread.org.
Spread is a mature project, now five years old. It was originally
licensed under other terms but has been available under an open
source license since June 2001. With Spread, I can develop distributed
logging and monitoring applications running on many Web servers
just as easily as single-server applications.
Spread itself is a multi-purpose networking toolkit, usable for
many applications besides the one I describe here. It provides a
simple API allowing an application to join a message group, to send
a message to a message group, or to receive the messages sent to
a group by other members. Whenever an application sends a message,
the API passes it to a spread daemon running on the local machine,
and the daemon distributes it to the other machines in the network.
The underlying transport can vary depending on the network topology
of a segment; Spread commonly uses a combination of UDP multicast
within a LAN and unicast between LANs. The Spread protocol can provide
"reliable ordered multicast" -- semantics that guarantee an application
that every group member has received its messages, and received
them in the correct order. If that guarantee is not needed, however,
the application can ask Spread to send a message without it and
obtain better performance.
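The join, send, and receive model described above can be illustrated
with a toy, single-process sketch. This is not Spread's API and does
no networking: the ToyDaemon class and its method names are
hypothetical, and a single counter stands in for the ordering work
that Spread's protocol does across machines. It only shows what
"every member sees the same messages in the same order" means.

```python
from collections import defaultdict


class ToyDaemon:
    """In-memory stand-in for a messaging daemon (hypothetical, not Spread's API)."""

    def __init__(self):
        self.members = defaultdict(list)  # group name -> list of member inboxes
        self.sequence = 0                 # one shared counter gives a total order

    def join(self, group):
        """Join a group and return the inbox that will receive its messages."""
        inbox = []
        self.members[group].append(inbox)
        return inbox

    def send(self, group, sender, text):
        """Stamp the message with the next sequence number and deliver it
        to every current member of the group."""
        self.sequence += 1
        for inbox in self.members[group]:
            inbox.append((self.sequence, sender, text))


daemon = ToyDaemon()
larry_inbox = daemon.join("testgroup")
curly_inbox = daemon.join("testgroup")
daemon.send("testgroup", "larry", "first")
daemon.send("testgroup", "curly", "second")

# Both members hold identical message sequences.
print(larry_inbox == curly_inbox)
```

In this toy model a sender that has joined the group also receives
its own messages, which keeps every member's view identical.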
Spread APIs are available for C, C++, Java, Perl, PHP, Ruby, Python,
and a few other languages. Spread has been used to provide database
replication in Postgres and replicated storage for Zope. The Backhand
project (http://www.backhand.org) provides two useful applications
built on top of Spread. One is a utility called Wackamole, used
to manage a virtual IP address for high availability, moving it
from one server to another if the first server fails. The other
is an Apache module named mod_log_spread.
Mod_log_spread was the first Spread application to really catch
my eye. It allows a Web server to write its access log to a Spread
group rather than a local file. Any number of Web servers can send
log messages using mod_log_spread; somewhere in the network, another
machine runs a logging daemon, spreadlogd, which simply joins the
groups and writes all of the messages into a single log file. For
example, if your site needed to keep duplicate logs for redundancy,
you could simply add a second log server, turn on spreadlogd, and
start writing another copy. The module is actually distributed as
a patch to Apache's mod_log_config. An alternative implementation
by the same developer, George Schlossnagle, for mod_perl users is
the Apache::Log::Spread module, available from CPAN.
At my site, various Web applications write their own specific
log files. The site is developed in Perl, so I built an object-oriented
layer on top of Spread's Perl API to provide a simple logging facility.
The application programmers modified each Perl script to use this
Log::SpreadLog module, and now the spreadlogd daemon retrieves each
log message and writes it to its appropriate file on a central logging
server.
Since each application sends messages to a particular Spread group,
a sys admin can monitor any aspect of the site by joining the appropriate
group and watching the messages flow by. For more systematic monitoring,
the messages can be automatically counted and graphed. A graph of
Spread messages can be used to measure the activity of a whole Web
cluster in real time.
Configuring Spread
Spread is available from http://www.spread.org, and at
the time of writing, version 3.17.1 is soon to be released. Version
3.17 is the first major release to use GNU autoconf and can be built
on most UNIX machines using just "configure" and "make". Binaries
are also available for other platforms, including Windows NT and
2000. A default "make install" will install the Spread binary in
/usr/local/sbin, the Spread header files in /usr/local/include,
and both the thread-safe Spread library libtspread and the unsafe
library libspread in /usr/local/lib. (Through version 3.16,
these two libraries were known as "libtsp" and "libsp". The new
naming scheme is an improvement, but the change can cause a linker
error when building applications against Spread 3.17 that were written
using older versions. This problem is likely to affect any source
code dated before September 2002.)
Spread is configured using a file named spread.conf, usually
found in /etc or /usr/local/etc. Here is a sample
spread.conf file for three machines on a private network.
The machines communicate using the local multicast address 239.0.0.1
and Spread's default port, 4803 (they could alternatively have used
a broadcast address like 192.168.1.255). RFC 2365 designates the
multicast addresses from 239.0.0.0 to 239.255.255.255 for private
use inside a single organization, so it is usually good to assign
a Spread segment an address within that range.
Spread_Segment 239.0.0.1:4803 {
larry 192.168.1.1
moe 192.168.1.2
curly 192.168.1.3
}
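As a quick sanity check on a file like this one, the following
sketch parses the sample segment and confirms that its address falls
in the RFC 2365 range mentioned above. It is a minimal sketch that
assumes a single Spread_Segment block laid out exactly like the
sample; the parse_segment helper is hypothetical and is not part of
Spread, and a real spread.conf may contain directives it ignores.

```python
import ipaddress
import re

SAMPLE_CONF = """\
Spread_Segment 239.0.0.1:4803 {
    larry 192.168.1.1
    moe 192.168.1.2
    curly 192.168.1.3
}
"""

# RFC 2365 administratively scoped multicast block (239.0.0.0 - 239.255.255.255)
PRIVATE_MULTICAST = ipaddress.ip_network("239.0.0.0/8")


def parse_segment(text):
    """Return (address, port, {host: ip}) for a one-segment file like the sample."""
    header = re.search(r"Spread_Segment\s+([\d.]+):(\d+)\s*\{", text)
    if header is None:
        raise ValueError("no Spread_Segment block found")
    # The host list runs from the opening brace to the next closing brace.
    body = text[header.end():text.index("}", header.end())]
    hosts = {}
    for line in body.splitlines():
        fields = line.split()
        if len(fields) == 2:
            hosts[fields[0]] = ipaddress.ip_address(fields[1])
    return ipaddress.ip_address(header.group(1)), int(header.group(2)), hosts


address, port, hosts = parse_segment(SAMPLE_CONF)
in_private_range = address in PRIVATE_MULTICAST
print(address, port, sorted(hosts), in_private_range)
```

A segment that used the broadcast address 192.168.1.255 instead
would parse the same way, but in_private_range would be False.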
Once you have created the spread.conf file, you can start the
spread daemon on each machine. To deploy Spread for production use,
start the daemon automatically at boot time, and consider running
it with enhanced priority (using the nice command) or even
with real-time priority. Real-time priority is available on FreeBSD
machines by running spread from the rtprio command. Under Linux,
use the Scheduler module, available at http://www.omniti.com/~george/mod_log_spread/.
This allows the Spread networking code to run closer to kernel-level
priority -- where networking code usually runs -- rather than user-level,
and helps keep a Spread segment stable even if a particular machine
gets bogged down by CPU-intensive processes.
Once the daemon is running, you can test the network using spuser,
Spread's command-line client. Start spuser running on two
different machines, and use the "j" command to join a group, followed
by the "s" command to send a message. If this all works, it looks
something like the following example:
larry% spuser
User> j testgroup
User> s testgroup
enter message: Come here, Watson. I need you.
curly% spuser
User> j testgroup
User>
============================
received SAFE message from #user#larry, of type 1, (endian 0) to 1 groups
(32 bytes): Come here, Watson. I need you.
Configuring Spreadlogd
Spreadlogd can be found in the mod_log_spread package at:
http://www.backhand.org
It requires its own configuration file -- /etc/spreadlogd.conf
-- specifying how to connect to Spread, which groups to listen to,
and where to write each group's messages. Here is a sample /etc/spreadlogd.conf
file:
Spread {
Port = 4803
Log {
Group = "group1"
File = /var/log/spread/test1.log
}
Log {
Group = "group2"
File = /var/log/spread/apache.log
}
}
Once you have installed spreadlogd, created the /var/log/spread
directory, and edited /etc/spreadlogd.conf, it should be possible
to run and test spreadlogd. To test the configuration shown here,
you could use spuser on any machine in the segment to send
a message to the group "group1". The message should show up in the
file /var/log/spread/test1.log on the spreadlogd server.
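The group-to-file mapping in that configuration can be modeled in a
few lines. The sketch below is not spreadlogd's code and does not
talk to Spread. It assumes messages arrive as (group, text) pairs,
uses a hypothetical route_message helper, and writes to a temporary
directory rather than /var/log/spread so that it can run anywhere.

```python
import os
import tempfile


def route_message(routes, group, message):
    """Append one message to the file configured for its group.

    Returns True if the message was written, or False if no entry in
    routes names the group (this model drops such messages, which is
    an assumption made for the sketch).
    """
    path = routes.get(group)
    if path is None:
        return False
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(message.rstrip("\n") + "\n")
    return True


# Stand-in for the two Log blocks in the sample configuration.
log_dir = tempfile.mkdtemp()
routes = {
    "group1": os.path.join(log_dir, "test1.log"),
    "group2": os.path.join(log_dir, "apache.log"),
}

route_message(routes, "group1", "hello from spuser")
route_message(routes, "group2", "GET /index.html")
ignored = route_message(routes, "group3", "nobody logs this")

with open(routes["group1"], encoding="utf-8") as log_file:
    print(log_file.read(), end="")
```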
If you also run Apache and decide to try mod_log_spread, the package
includes a file called INSTALL with instructions for building the
module and inserting it into the Apache server.
Using the Perl API
Spread's Perl API is implemented by the Spread.pm module. The
module is included with the Spread source in the directory perl/Spread.
I built it using the typical procedure for building a Perl module
-- perl Makefile.PL ; make ; make install.
Spread's Perl API is quite similar to its C API, and is well documented
in Jonathan Stanton's "Users' Guide to Spread," available from http://www.spread.org/docs/guide/
in PDF and PostScript formats. The Log::SpreadLog package (Listing
1) is intended to hide the details of Spread's operation and allow
an application to send a message using only three lines of code.
The programmer initializes a Log::SpreadLog object using the name
of a Spread group, and then sends a message to the group by calling
the object's write() method. Internally, Log::SpreadLog calls Spread::Connect
to establish an mbox (a Spread object similar to a socket) and Spread::multicast
to send the message. The object does not provide any facilities
for receiving messages, but it allows a distributed Web application
to send the messages that will get logged by spreadlogd. At my site,
we log a variety of information this way, including ad impressions,
cookies, and errant search queries:
use Log::SpreadLog;
my $log = Log::SpreadLog->new("mygroup");   # logger for the Spread group "mygroup"
$log->write("this is a message");           # send one message to that group
Counting and Graphing
The spcount utility (at http://www.efn.org/~jdd/projects/spread)
is designed to join a Spread group, count the messages, and output
a periodic total. Counting the messages has several uses. For some
groups, a total of zero messages in a given time period is a sure
sign of failure, and can be used to generate an alert. At my site,
we send a Spread message every time the server delivers an ad impression,
and we know how many ads are on a page, so we count the number of
ads served and use that number to determine the current rate of
page views per minute in real time. It is more accurate to count
ads, which cannot be cached, than to count the pages themselves,
which are cacheable and therefore not necessarily served by one
of our back-end servers. We create real-time graphs of site activity
by running spcount as a daemon and directing its output to a file.
Counting messages in five-minute intervals creates a file suitable
for input into the graphing program MRTG. We also count in one-minute
intervals, and display the output with a custom CGI script similar
to the one included in the spcount distribution. A sample plot is
shown in Figure 1.
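The arithmetic behind these counts is simple enough to sketch. The
code below is not spcount: it works on a made-up list of arrival
times instead of live Spread messages, and the function names, the
four-ads-per-page figure, and the sample data are all assumptions
chosen for illustration. It shows the three steps described above:
counting per interval, turning an ad count into a page-view rate,
and flagging silent intervals.

```python
def count_by_interval(timestamps, start, interval_seconds, intervals):
    """Return a list of message counts, one per consecutive interval."""
    counts = [0] * intervals
    for stamp in timestamps:
        index = int((stamp - start) // interval_seconds)
        if 0 <= index < intervals:
            counts[index] += 1
    return counts


def page_views_per_minute(ad_count, ads_per_page, interval_seconds):
    """Convert ads served in one interval to page views per minute."""
    return ad_count / ads_per_page * 60 / interval_seconds


def silent_intervals(counts):
    """Return the indexes of intervals in which no message arrived."""
    return [index for index, count in enumerate(counts) if count == 0]


# Hypothetical arrival times, in seconds, of ad-impression messages.
arrivals = [1, 5, 20, 59, 61, 62, 118, 185, 190, 200, 230, 239]

counts = count_by_interval(arrivals, start=0, interval_seconds=60, intervals=4)
rates = [page_views_per_minute(c, ads_per_page=4, interval_seconds=60)
         for c in counts]
alerts = silent_intervals(counts)

print(counts)  # [4, 3, 0, 5]
print(rates)   # [1.0, 0.75, 0.0, 1.25]
print(alerts)  # [2]
```

The third interval is empty, so for a group that should never fall
silent, index 2 is the one that would trigger an alert.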
Conclusion
Having a simple way to send a message to a named group, record
a group's messages in log files with spreadlogd, eavesdrop
on them with spuser, and count them using spcount
has made my cluster of Web servers much more manageable. Spread
allows me to measure and assess the group of servers as a single
entity. It is supported by a diverse group of people on the spread-users
mailing list at lists.spread.org, each using Spread for something
slightly different from the next.
John David Duncan is a freelance Web programmer, DBA, and UNIX
sys admin in San Francisco. He's currently the staff systems administrator
at GreatSchools.net, and on the side he manages the election night
database for a major news Web site. He can be contacted at: jdd@efn.org.