Tuning Apache Servers
Mohammed Kabir
Customers often ask my company to help them with their Web server
performance. The solutions vary depending on the budget, complexity,
size, and type (dynamic or static) of the Web contents. For customers
with moderate to large budgets and high traffic requirements (hundreds
of thousands to multi-million hits per day), we recommend deploying
load-balanced Web networks utilizing hardware load balancers or
even round-robin DNS. These customers often throw "fault tolerance"
into the same bucket as "performance" and therefore require hardware
and software redundancy. It always helps to have more RAM, faster
disks, and faster CPUs. However, not everyone is ready to fork out
money for new hardware. Customers often want to get more out of
what they already have instead of buying slim, stylish, single-unit
network appliances or servers.
In most cases, the Web platform is Apache on Linux, and Apache
has been installed from an RPM package or built from the source.
Almost always, the Apache server is a vanilla install and can be
customized for both performance and security. In this article, I
will share some techniques that we often use to get more juice out
of Apache.
Getting the Fat off Your Apache Server
Downloading and installing a binary version of Apache from the
Internet not only means you are stuck with someone else's idea of
what goes inside the Apache server but you also risk security issues
if the download site is not reputable. Therefore, the very first
step is to download the Apache source from http://www.apache.org
or one of its mirror sites and compile and install from the source.
Unfortunately, many people follow this advice without thinking
about what they need or do not need. A vanilla install process for
a downloaded source distribution includes:
# cd /usr/src/httpd-2.0.43
# ./configure --prefix=/usr/local/apache
# make
# make install
This gets Apache installed all right, but it also includes module
fat that you might not want. To find out what modules are built into
your Apache, run the httpd -l command from the bin directory
of your Apache installation. For example, if you have installed Apache
in /usr/local/apache, you can run:
# /usr/local/apache/bin/httpd -l
This will show all the modules compiled into your Apache binary. The
vanilla compilation and installation of the 2.0.43 source distribution
includes the following compiled-in modules:
core.c
mod_access.c
mod_auth.c
mod_include.c
mod_log_config.c
mod_env.c
mod_setenvif.c
prefork.c
http_core.c
mod_mime.c
mod_status.c
mod_autoindex.c
mod_asis.c
mod_cgi.c
mod_negotiation.c
mod_dir.c
mod_imap.c
mod_actions.c
mod_userdir.c
mod_alias.c
mod_so.c
Lots of modules installed by default are likely not needed for
your site. For example, if you do not use CGI scripts but use PHP
instead, you do not need the mod_cgi module. Similarly, if you do
not use Server Side Include (SSI) you can do without the mod_include
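module. To find out what a particular module does, consult the modules
documentation at:
http://httpd.apache.org/docs-2.0/mod/
Once you know which modules you can do without, leave them out by
passing the matching --disable options to the configure script: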
# cd /usr/src/httpd-2.0.43
# ./configure --prefix=/usr/local/apache --disable-cgi \
--disable-include --disable-status \
--disable-negotiation --disable-imap --disable-userdir
# make
# make install
Here I have disabled CGI script support, SSI support, status support,
content negotiations support, internal image map support, and personal
Web directory support.
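After rebuilding, it is worth confirming that the modules really are gone. The following is a minimal sketch, not part of the Apache distribution: the function name modules_still_present is made up for illustration, and it only inspects whatever httpd -l output you pipe into it.

```shell
# Hypothetical helper: report which of the named modules are still
# compiled in. Reads "httpd -l" output on stdin; module names are
# given as arguments without the .c suffix (for example, mod_cgi).
modules_still_present() {
    listing=$(cat)
    for mod in "$@"; do
        # httpd -l prints one source file name per line, indented
        if printf '%s\n' "$listing" | grep -q "^ *$mod\.c\$"; then
            echo "$mod is still compiled in"
        fi
    done
}

# Usage (path assumes the --prefix used above):
#   /usr/local/apache/bin/httpd -l | \
#       modules_still_present mod_cgi mod_include mod_status
```

No output means none of the named modules were found in the listing.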
Once compiled and installed, the Apache binary (httpd) will be
smaller. I have reduced the binary size by one megabyte or so in
many cases. You can shrink the binary on disk further by
stripping off its symbols using the following command:
# strip /usr/local/apache/bin/httpd
If you use dynamically shared object (DSO) modules for loading mod_php
or mod_perl or other modules, be sure mod_so is installed as a static
module in your Apache binary. You can enable mod_so support using
the "--enable-so" option with the configure script. Using DSO modules
has a great advantage over statically linked modules. You can disable
the module by just commenting out the LoadModule line in the httpd.conf
file in your conf directory. Whenever you use a DSO module, be sure
you wrap its directives in an IfModule container that names the module's
source file (for example, mod_perl.c), as shown below:
<IfModule your_module.c>
# your module specific configuration
# goes here
</IfModule>
This container isolates your module-specific configurations within
itself, which ensures that you will not get a configuration error
when you comment out the appropriate LoadModule directive to disable
that module temporarily or permanently.
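The disable step itself can be scripted. This is a rough sketch under a few assumptions: disable_module is a made-up helper name, perl_module is only an example identifier, and the LoadModule line is assumed to sit on a single line in httpd.conf. It keeps a backup copy so that the change is easy to undo.

```shell
# Hypothetical helper: comment out the LoadModule line for one module.
# $1: path to httpd.conf, $2: module identifier as written on the
# LoadModule line (for example, perl_module).
disable_module() {
    conf=$1
    module=$2
    # Keep a backup, then write the edited copy over the original
    cp "$conf" "$conf.bak" &&
    sed "s/^\([[:space:]]*\)\(LoadModule[[:space:]][[:space:]]*$module[[:space:]]\)/\1#\2/" \
        "$conf.bak" > "$conf"
}

# Usage, followed by a syntax check before restarting:
#   disable_module /usr/local/apache/conf/httpd.conf perl_module
#   /usr/local/apache/bin/apachectl configtest
```

If the module's directives are wrapped in an IfModule container, the configtest step should pass with the module disabled.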
Tuning Apache Configuration
Once you have a lean and mean Apache binary installed, you need
to investigate your httpd.conf file for tuning issues that relate
to Apache configuration. I will discuss a few common issues.
Set Simultaneous Hit Limits
To begin, look at your MaxClients directive value. Depending on
your Multi-Processing Module (MPM) choice (prefork, worker, etc.),
the meaning of MaxClients will change. In the prefork MPM model,
the MaxClients value represents the maximum number of Apache child
processes that will be run to service requests.
In the worker MPM model, this directive value means the maximum
number of simultaneous client connections, that is, the total number
of threads across all child processes. The default httpd.conf sets it
to 150 for both prefork and worker. If you want to service more simultaneous
requests in either model, you need to increase this value. However,
under prefork, raising it beyond 256 also requires raising the ServerLimit
directive. Under worker, MaxClients cannot exceed ServerLimit multiplied
by ThreadsPerChild, and ServerLimit defaults to 16, so set ServerLimit
no higher than the number of processes you actually need. For example,
the following worker configuration allows 300 simultaneous connections
using 15 processes of 20 threads each:
<IfModule worker.c>
ServerLimit 15
StartServers 10
ThreadsPerChild 20
MaxClients 300
MinSpareThreads 50
MaxSpareThreads 75
MaxRequestsPerChild 0
</IfModule>
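The relationship between these numbers under the worker MPM is simple arithmetic: the server needs enough child processes for MaxClients threads at ThreadsPerChild threads apiece. The sketch below uses a made-up function name, worker_servers_needed, and only does the division, rounding up.

```shell
# Hypothetical helper: how many worker child processes are needed to
# reach a given MaxClients. $1: MaxClients, $2: ThreadsPerChild.
worker_servers_needed() {
    # Integer division, rounded up
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Usage with the values from the example above:
#   worker_servers_needed 300 20
```

If the result is larger than the ServerLimit in effect, either ServerLimit or ThreadsPerChild has to go up before the MaxClients value can be reached.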
If you have buggy, bulky Web applications that have a tendency to
leak memory, consider using a nonzero value for the MaxRequestsPerChild
directive so that each child process exits after serving that many
requests and is replaced by a fresh one. This releases the leaked
memory before it can exhaust the server.
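To judge whether a leak is actually present, it helps to watch how much memory the httpd children hold over time. A minimal sketch follows; httpd_memory_kb is a made-up name, and the ps invocation in the usage comment assumes the procps version of ps found on Linux.

```shell
# Hypothetical helper: total the resident memory of a set of processes.
# stdin: "PID RSS" pairs, one per line, with RSS in kilobytes.
httpd_memory_kb() {
    awk '{ total += $2; n++ }
         END { printf "%d processes, %d KB total\n", n, total }'
}

# Usage on Linux; run it a few times under load and compare:
#   ps -C httpd -o pid=,rss= | httpd_memory_kb
```

A total that keeps climbing while the process count stays flat suggests that a nonzero MaxRequestsPerChild is worth trying.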
Reduce Disk I/O
Say you request http://server/foo/bar/index.html page from
an Apache server. You would expect the server to read that index.html
file from the document root directory and return the contents via
the network to your Web browser and be done with it, right? Well,
Apache has to do more than that. If the document root is /usr/local/httpd/htdocs,
by default, it checks the existence of the following files:
/.htaccess
/usr/.htaccess
/usr/local/.htaccess
/usr/local/httpd/.htaccess
/usr/local/httpd/htdocs/.htaccess
/usr/local/httpd/htdocs/foo/.htaccess
/usr/local/httpd/htdocs/foo/bar/.htaccess
If it finds any of these files, Apache reads them to see whether the
request you made can be serviced. A URL that requests a single file
requires many disk accesses, which are a performance drain for high-volume
sites with many static Web pages. In such cases, the best choice is
to disable .htaccess file checking altogether. This can be done by
the following configuration:
<Directory />
AllowOverride None
</Directory>
When the above configuration is used, Apache skips the .htaccess lookups
and goes straight to the requested static file, which saves disk I/O on
every request in high-volume access scenarios.
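To get a feel for how many lookups a single request can trigger when overrides are allowed, the list of candidate .htaccess files can be generated from the path of the requested file. This is an illustrative sketch only: htaccess_candidates is a made-up name, and it simply walks the directory components of the path without touching the disk.

```shell
# Hypothetical helper: print every .htaccess path that lies between
# the filesystem root and the directory holding the requested file.
# $1: absolute path of the requested file.
htaccess_candidates() {
    dir=$(dirname "$1")
    list=""
    while :; do
        if [ "$dir" = "/" ]; then
            list="/.htaccess
$list"
            break
        fi
        # Prepend so the final list runs from the root downward
        list="$dir/.htaccess
$list"
        dir=$(dirname "$dir")
    done
    printf '%s' "$list"
}

# Usage with the example request discussed above:
#   htaccess_candidates /usr/local/httpd/htdocs/foo/bar/index.html
```

The deeper the document tree, the longer this list gets, which is why disabling overrides pays off most on sites with deeply nested static content.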
Similarly, if you have instructed Apache in httpd.conf not to
follow symbolic links (for good security reasons), using a configuration
such as this:
<Directory />
Options -FollowSymLinks
</Directory>
you would be paying a performance penalty for your good security practice.
Every time a file is requested, Apache will perform an extra system call
(lstat) on each component of the file's path to ensure that it is not
violating your symbolic link restriction as stated above.
If this performance price is too high, but you still want good
security, you can decide not to use symbolic links at all
in your document tree but still enable symbolic links:
<Directory />
Options FollowSymLinks
</Directory>
You should only enable this once you have ensured there are no symbolic
links that could result in exposure to Web visitors. The best choice
is to completely remove all symbolic links and replace them with a
new directory structure free of links. To find what links you have
on your current document tree, run the following command from your
document root directory:
# find . -type l -print
This will show all the symbolic links in your Web space.
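The links that matter most are the ones that lead out of the document tree. The sketch below narrows the find output to those. It assumes GNU readlink (for the -f option), and links_outside is a made-up function name.

```shell
# Hypothetical helper: list symbolic links under a document root whose
# targets resolve to a location outside that root.
# $1: document root directory.
links_outside() {
    root=$(cd "$1" && pwd -P) || return 1
    find "$root" -type l -print | while IFS= read -r link; do
        # readlink -f resolves the link fully; a broken link may
        # yield an empty target and is reported as well
        target=$(readlink -f "$link")
        case "$target" in
            "$root"|"$root"/*) ;;   # stays inside the tree
            *) printf '%s -> %s\n' "$link" "$target" ;;
        esac
    done
}

# Usage (document root path is an example):
#   links_outside /usr/local/httpd/htdocs
```

File names containing newlines are not handled; for a typical document tree that is unlikely to matter.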
Stop Asking for Names!
Do not set the HostnameLookups directive to On to find the host names
of your Web visitors. This will slow down your Web server significantly,
because each request then triggers a reverse DNS lookup on the
client's IP address.
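If you still want host names in your reports, one option is to resolve them after the fact with the logresolve program that ships with Apache, which reads a log on standard input and writes the resolved version to standard output. The wrapper below is only a sketch: resolve_log is a made-up name, and the default path assumes the /usr/local/apache prefix used earlier.

```shell
# Location of logresolve; assumes the install prefix used earlier
# and can be overridden from the environment.
LOGRESOLVE=${LOGRESOLVE:-/usr/local/apache/bin/logresolve}

# Hypothetical helper: resolve the IP addresses in an access log.
# $1: access log with IP addresses, $2: output file with host names.
resolve_log() {
    "$LOGRESOLVE" < "$1" > "$2"
}

# Usage (log file names are examples), ideally on another machine:
#   resolve_log access_log access_log.resolved
```

Running this offline keeps the DNS delay out of the request path.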
Speak Many Languages Fluently
Using automatic content negotiation sounds wonderful, but it comes
with a performance price. If you store multiple language pages (e.g.,
index.html.en (English), index.html.fr (French), index.html.ja (Japanese),
etc.) in your document tree and allow Apache to determine the right
contents for the Web browser based on content-negotiation headers,
responses will slow down because Apache must make decisions about
which file is appropriate each time it gets a request.
If you must serve multiple language contents, the better approach
is to keep different language contents in different directories
and redirect the user to the right directory on first request. For
example, as soon as a Web visitor hits the main Web server (http://yourserver/),
a PHP or Perl script would detect the preferred language from the
client-supplied Accept-Language header and redirect the user to http://yourserver/fr/
for French pages, http://yourserver/jp/ for Japanese pages, etc.
In these language-specific directories, keep only a desired language
version of the contents. This way, with automatic content negotiation
disabled (with "--disable-negotiation"), the Apache server will
serve the pages from within the correct language directory and work
more efficiently.
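The selection logic in such a redirect script is short. It is sketched here in shell only to stay consistent with the other examples; in practice it would live in the PHP or Perl script. The function name pick_lang_dir, the /en/ default, and the decision to ignore q-values and take the first supported language in the order listed are all simplifying assumptions.

```shell
# Hypothetical helper: map an Accept-Language header value to one of
# the language directories. $1: the header value.
pick_lang_dir() {
    printf '%s\n' "$1" | tr ',' '\n' | tr 'A-Z' 'a-z' |
    # Trim leading spaces; drop q-values and region subtags
    sed 's/^ *//; s/[;-].*//' |
    awk '$0 == "en" { print "/en/"; found = 1; exit }
         $0 == "fr" { print "/fr/"; found = 1; exit }
         $0 == "ja" { print "/jp/"; found = 1; exit }
         END { if (!found) print "/en/" }'
}

# Usage:
#   pick_lang_dir "fr-CA,fr;q=0.8,en;q=0.5"
```

A fuller implementation would sort the listed languages by q-value before choosing.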
There might be other directives that you need to tune to get more
juice out of your Apache server. The rule of thumb in configuring
Apache is to know your directives well. Read all directive documentation
and make sure you know what penalties a directive might have on
your performance or security.
Measure Your Efforts
Once you have tuned Apache by recompiling and reconfiguring it
as mentioned above, you can measure how it performs with the nifty
ab, Apache Benchmark, tool that comes with the Apache distribution.
By default, ab is installed in the bin directory. The ab tool allows
you to perform stress tests on your server. For example, you can
run it as follows:
# ./ab -n 1000 http://yourserver/
from the Apache bin directory to send 1000 requests to your Web server.
However, running this tool from the Apache server itself will skew
your results because the ab tool alone will consume a lot of your
system resources. So, for best results, always run ab from another
system in the same network. A sample, abridged result of the above
command is shown below:
Finished 1000 requests
Server Software:        Apache/2.0.43
Server Hostname:        localhost
Server Port:            80
Document Path:          /
Document Length:        1456 bytes
Concurrency Level:      1
Time taken for tests:   1.427968 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      1722000 bytes
HTML transferred:       1456000 bytes
Requests per second:    700.30 [#/sec] (mean)
Time per request:       1.428 [ms] (mean)
Time per request:       1.428 [ms] (mean, across all concurrent requests)
Transfer rate:          1177.20 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   2.5      0      57
Processing:     0    0   5.0      0      71
Waiting:        0    0   4.2      0      71
Total:          0    0   5.6      0      71
Percentage of the requests served within a certain time (ms)
  50%      0
  66%      0
  75%      0
  80%      0
  90%      0
  95%      1
  98%      1
  99%      3
 100%     71 (longest request)
You can run many simultaneous connections to simulate a particular
scenario. For example:
# ./ab -n 1000 -c 50 http://yourserver/
Here ab will run 50 simultaneous requests to complete 1000 requests
for the home page of the named Web server. You can run a few of these
tests to see how your Web server performs.
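When comparing several runs, it is easier to pull out one figure than to read each full report. A small sketch follows; requests_per_second is a made-up name, and it relies on the "Requests per second" line appearing in the ab output as in the sample above.

```shell
# Hypothetical helper: extract the requests-per-second figure.
# stdin: the output of an ab run.
requests_per_second() {
    awk '/^Requests per second:/ { print $4 }'
}

# Usage: step through a few concurrency levels (URL is a placeholder)
#   for c in 1 10 50 100; do
#       printf '%s concurrent: ' "$c"
#       ./ab -n 1000 -c "$c" http://yourserver/ | requests_per_second
#   done
```

Running the same loop before and after each tuning change gives a rough but consistent basis for comparison.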
Performance tuning is not a quick-and-easy job; it takes time
to get it right. And when you do, you will find out your organization
has still more load demands and more complex requirements for you
to meet. So treat it like an ongoing process and improve it on a
regular basis. You must have a good understanding of how your server
is used by Web visitors and where the bottlenecks are. I hope this
article will get you started in the right direction.
Mohammed J. Kabir (prefers to be called Kabir) is the founder
and CEO of EVOKNOW, Inc. Kabir is a strong believer in process-centric
software development. He has spent many years in software development
for startups and large US companies such as Lucent, BlueShield,
Go America, AMI News, Outdoors.net, etc. He also served as the chief
technologist for many US companies where he managed numerous software
development projects. Kabir is also an IT author. Some of his most
recent books include: Red Hat Linux Security and Optimization
(Red Hat Press), Red Hat Linux Survival Guide (Red Hat Press),
Red Hat Linux Server 7 (Hungry Minds (IDG Worldwide)), Red
Hat Linux Server 6 (Hungry Minds), Red Hat Linux Server Administrator's
Handbook (IDG Worldwide), Apache Server 2 (Hungry Minds),
Apache Server Bible (IDG Worldwide), Apache Server Administrator's
Handbook (IDG Worldwide), The SuSE Linux Server (M&T
Books), and CGI Primer Plus for Windows (Macmillan (Waite
Group)). Kabir can be contacted at: kabir@evoknow.com.