Industrial Strength Cluster Security for an Open Source Price
Neil Gorsuch
Computer security is big business. Worldwide annual revenue of
the VPN/Firewall market was $2.7 billion in 2002 (Source: Infonetics
Research, Inc.). Upfront licensing costs for proprietary industrial
strength cluster security solutions can range from thousands to
hundreds of thousands of dollars depending on cluster size. However,
use and deployment of open source solutions can reduce the total
cost of ownership. This paper describes the deployment of an industrial
strength open source firewall solution based on an easily configurable
packet-filtering compiler system for clusters.
Stateful packet-filtering firewalls can provide excellent security
from network attacks, but are difficult at best to set up and maintain.
When packet filtering is combined with packet forwarding, NAT'ing,
and pseudo-interfaces, a single machine can provide firewall protection
for a private network of machines. Thus, protected machines have
complete access to the general networks and visibility at general
network addresses, while maintaining their firewall protection.
A configurable packet-filtering compiler system for clusters can
provide all these benefits.
Introduction
Total cost of ownership (TCO) is often overlooked in designing
and deploying large-scale computing solutions. A potentially significant
percentage can be added to the TCO when security costs, both initial
and ongoing, are factored in. Industrial strength cluster security
solutions can entail upfront licensing costs that quickly multiply
when proprietary security solutions are licensed on a per-node basis
(16x, 32x, 64x, 128x...). Using and deploying open source solutions
can lower the TCO by eliminating the software licensing cost component
of the equation.
The NCSA Cluster and Software Tools Group, co-located at the University
of Illinois at Urbana-Champaign, is a pioneer in the area of cluster
design and deployment. A founding member of the Open Cluster Group (http://www.openclustergroup.org)
and the Gelato Federation (http://www.gelato.org), the group
is working in partnership with other institutions to develop software
(OSCAR and GOLD) that greatly simplifies the task of installing
and running parallel Linux clusters that are compatible with large-scale
production systems. A key component of their work is cluster security.
This article describes their work on an industrial strength open
source firewall solution based on a configurable packet-filtering
compiler system.
Security Layers
For effective cluster security, multiple security layers should
be provided. Cluster network security layers should ideally consist
of: router filtering, network stack protections, IP masquerading
and NAT'ing, packet filtering, disabling unused network services,
TCPwrappers, and configuring applications.
Packet filtering is the process of examining each network packet
as it comes into or through a device, and either allowing, dropping,
or rejecting the packet based on various factors: source and destination
addresses, port numbers, and whether the packet initiates a new
connection attempt. Packet filtering can block all incoming network
connection attempts except those explicitly allowed. This limits
the damage from configuration mistakes, such as accidentally leaving
an insecure network service accessible.
Stateful packet filtering keeps track of each network communication
sequence of packets and provides for simpler, more secure firewalls
that are easier to set up and maintain.
All cluster machines that receive packets from outside the cluster
should have their own packet filtering installed, or a common firewall
machine should filter all packets entering and leaving the cluster.
It is difficult to use packet filtering to set up a firewall on
a computer system -- comparable to writing programs in assembly
language. Systems administrators need a method to set up and maintain
good packet-filtering firewalls without having to learn the intricacies
of packet-filtering commands.
Packet Re-Writing
Linux and other operating systems can modify network packets as
they pass through a system. This rewriting, a generalized form of
NAT, enables the host aliasing and packet redirection described
below. Since this process is controlled
through the packet-filtering rules, a packet-filtering compiler
must control it.
Host Aliasing
Consider a firewall machine with multiple network interfaces that
is acting as an IP-masquerading firewall for one of the interfaces.
In some cases, it is advantageous to access one or more of the machines
on the private network through a public address. If more than one
of those machines must be reached through the same type of service,
however, a single public address cannot distinguish between them.
Each such machine must instead appear to have its own address on
the public network, with only outside connections to the desired
services passed on to the machine's private address. This allows
one firewall machine to act as a gateway
for several hidden machines, and all firewalling and packet-filtering
protection can be set up on one machine.
Packet Redirection or Forwarding
Sometimes packets must be rewritten and sent to a different destination
than intended (e.g., having all outgoing packets destined for Web
servers redirected to a proxy Web server on a local machine).
Packet-Filtering Details
Linux kernels use "chains" of packet-filtering rules
to filter packets. Each filtering rule has a set of matching conditions
and a set action if the packet is matched. Packets can be matched
by source address, destination address, source port, destination
port, TCP flags, type, MAC address, length, throttling limits, specific
byte patterns, and user id of locally generated packets. Some chains
are predefined and always exist; others can be defined or deleted
by the user. Rules can be added to any chain, and any chain can
have all its rules flushed from it. Packets can traverse more than
one chain.
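The chain model described above can be illustrated with a short Python
sketch. It is a simplified model rather than kernel code: the Packet
and Rule classes, the field names, and the default policy are all
assumptions made for illustration, and only a few of the match
conditions listed above are modeled.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass
class Packet:
    src: str
    dst: str
    proto: str
    dport: int


@dataclass
class Rule:
    action: str                    # "ACCEPT", "DROP", or "REJECT"
    src: Optional[str] = None      # CIDR network; None matches any source
    proto: Optional[str] = None    # None matches any protocol
    dport: Optional[int] = None    # None matches any destination port

    def matches(self, pkt: Packet) -> bool:
        if self.src is not None and ip_address(pkt.src) not in ip_network(self.src):
            return False
        if self.proto is not None and pkt.proto != self.proto:
            return False
        if self.dport is not None and pkt.dport != self.dport:
            return False
        return True


def traverse(chain, pkt, policy="DROP"):
    """Return the action of the first matching rule, else the chain policy."""
    for rule in chain:
        if rule.matches(pkt):
            return rule.action
    return policy


# Hypothetical chain: trust the cluster network, allow SSH, refuse telnet.
input_chain = [
    Rule("ACCEPT", src="10.0.0.0/8"),
    Rule("ACCEPT", proto="tcp", dport=22),
    Rule("REJECT", proto="tcp", dport=23),
]

print(traverse(input_chain, Packet("192.0.2.7", "10.0.0.1", "tcp", 22)))
print(traverse(input_chain, Packet("192.0.2.7", "10.0.0.1", "udp", 53)))
```

Because the first matching rule decides the outcome, rule order matters:
a telnet packet from inside 10.0.0.0/8 is accepted by the first rule
before the REJECT rule is ever consulted.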
Linux version 2.4 and later kernels support integrated connection
tracking and stateful inspection. This adds more ways to match packets,
based on whether they are new packets, packets that are part of
an existing connection, or packets logically related to an existing
connection. Besides greatly increasing security, this also vastly
simplifies filtering rulesets.
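To see why connection tracking simplifies rulesets, consider the
following Python sketch of a toy connection table. It is not how the
kernel implements tracking; the ConnectionTracker class and the
filter_packet function are hypothetical, and "related" connections
are not modeled at all.

```python
class ConnectionTracker:
    """Toy connection table keyed on the two endpoints of each flow."""

    def __init__(self):
        self._flows = set()

    @staticmethod
    def _key(proto, src, sport, dst, dport):
        # Sort the endpoints so both directions of a flow share one key.
        return (proto,) + tuple(sorted([(src, sport), (dst, dport)]))

    def state(self, proto, src, sport, dst, dport):
        key = self._key(proto, src, sport, dst, dport)
        return "ESTABLISHED" if key in self._flows else "NEW"

    def remember(self, proto, src, sport, dst, dport):
        self._flows.add(self._key(proto, src, sport, dst, dport))


def filter_packet(tracker, direction, packet, open_ports):
    """Accept known flows, all outgoing traffic, and new inbound
    connections to explicitly opened ports; drop everything else."""
    if tracker.state(*packet) == "ESTABLISHED":
        return "ACCEPT"
    dport = packet[4]
    if direction == "out" or dport in open_ports:
        tracker.remember(*packet)
        return "ACCEPT"
    return "DROP"


tracker = ConnectionTracker()
outbound = ("tcp", "10.0.0.5", 40000, "198.51.100.9", 80)
reply = ("tcp", "198.51.100.9", 80, "10.0.0.5", 40000)
probe = ("tcp", "203.0.113.4", 5555, "10.0.0.5", 80)

print(filter_packet(tracker, "in", reply, {22}))     # no flow yet
print(filter_packet(tracker, "out", outbound, {22}))
print(filter_packet(tracker, "in", reply, {22}))     # now part of a flow
print(filter_packet(tracker, "in", probe, {22}))
```

A single "accept established traffic" test replaces the per-service
return-path rules that a stateless filter would need.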
Scripts to control packet filtering are difficult to set up. Cluster
administrators would like to specify "block every incoming
connection attempt except for SSH to machines 1 and 2, and allow
all cluster machines to access the outside network." To do
this safely requires a script of dozens of lines and many commands,
with various network stack parameters written into the /proc directory
tree. Packet-filtering scripts are akin to an assembly language.
For things to run correctly, "glue" code must be added.
What systems administrators need is a method to "compile"
packet-filtering firewall scripts that requires no knowledge of
packet-filtering configuration commands. Given this need, we decided
to implement a packet-filtering compiler.
PFILTER -- A Firewall Compiler
The PFILTER (Packet FILTER) firewall compiler was developed and
implemented in the Perl language. Better interpreted languages
are available, but none are as widespread as Perl. The main PFILTER
program is installed as a Perl executable program at /usr/sbin/pfilter.
The remainder of the executable program is stored as included Perl
files in the /usr/lib/pfilter directory. Included files were chosen
instead of the more traditional Perl modules to ensure that PFILTER
functions reside in a more secure directory, and because the functions
are very specific to PFILTER and generally not useful to other programs.
PFILTER is an open source project hosted on SourceForge; see:
http://sourceforge.net/projects/pfilter/
Ruleset Files
The PFILTER executable program is designed to do as little as
possible. Instead, PFILTER uses built-in files called ruleset files
to do most of the work of compiling output scripts. The ruleset files are
editable text files that can be modified or added to by systems
administrators. Conditional text blocks and macros are heavily used.
Most of the "glue" code in the compiled output is specified
in the ruleset files, allowing for tremendous flexibility and support
for many types of Unix variants. New types of network services to
be filtered on or off are defined in ruleset files. Thus, systems
administrators can add new types of network services to be supported.
Because network services to be filtered on or off are defined
by macros, shell script fragments can be embedded in the compiled
output code, allowing for filtering services such as NFS, which
dynamically allocate port numbers. Because of the built-in constants,
variables, conditions, and macros, the main firewall configuration
file can be quite sophisticated, allowing the same configuration
file to be used throughout a cluster.
Compiled output scripts are optimized: redundant or impossible
combinations of packet-filtering sources and destinations are not
written to the output scripts. The compiled output scripts include
verbose comments explaining what each script line does. The configuration
source lines that cause generation of output script lines are included
as comments immediately above their generated output.
The PFILTER Firewall Language
The PFILTER firewall configuration is defined in the /etc/pfilter.conf
file. Comments start with either the # or % characters and can be
complete lines. Comments that start with the % character are not
copied to the compiled output script, while comments starting with
the # character are copied when appropriate. All directives and
keywords are case-insensitive.
Constants and Variables
Named constants are defined like this:
%constant% constant_name strings ...
A constant's value will be set to everything after the name but
not including any comments at the end of the line. Trying to redefine
a constant with a new value produces an error. Variables are defined
as:
%variable% variable_name strings ...
Variables can be re-defined any number of times. To substitute a constant
or variable value, simply insert it with % characters on each side.
For example, these lines:
%variable% var b c
%constant% const d e f
a %var% %const% g
will expand to:
a b c d e f g
A number of constants are defined by PFILTER during each compilation.
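The substitution rules above can be mimicked in a few lines of Python.
This is only a sketch of the behavior described, not PFILTER's Perl
implementation: the expand function is hypothetical, trailing comments
are not stripped, and undefined names are simply left in place, which
is an assumption.

```python
import re


def expand(lines):
    """Process %constant%/%variable% definitions and substitute %name%."""
    constants, variables, output = {}, {}, []

    def lookup(match):
        name = match.group(1)
        if name in constants:
            return constants[name]
        # Assumption: unknown names are left untouched.
        return variables.get(name, match.group(0))

    for line in lines:
        words = line.split()
        directive = words[0].lower() if words else ""
        if directive in ("%constant%", "%variable%"):
            name, value = words[1], " ".join(words[2:])
            if directive == "%constant%":
                if name in constants and constants[name] != value:
                    raise ValueError("constant redefined: " + name)
                constants[name] = value
            else:
                variables[name] = value   # variables may be redefined
            continue
        output.append(re.sub(r"%([\w-]+)%", lookup, line))
    return output


print(expand([
    "%variable% var b c",
    "%constant% const d e f",
    "a %var% %const% g",
]))
```

Run on the three example lines above, the sketch prints a one-element
list containing "a b c d e f g".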
Macros
PFILTER supports macros with named parameters. When a macro is
expanded, temporary variables matching the macro's parameter
names are created just for that expansion block. Macros are heavily
used when generating the compiled firewall output script. For example:
%macro% compute-node management-node
# compute node compute-node
open ssh from %domain%
open tcp 1024:65535 from %cluster%
open tcp 0:1023 from %management-node%
%endmacro
%compute-node% node17 mgmt12
would expand to:
# compute node node17
open ssh from %domain%
open tcp 1024:65535 from %cluster%
open tcp 0:1023 from mgmt12
Macro definitions and invocations can be nested.
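The idea of binding parameter names to temporary variables for one
expansion can be sketched in Python. The macro below (node-access, with
parameters node and manager) is made up for this illustration, as is
the expand_macros function; the sketch assumes parameters are referenced
in the body as %name%, leaves other names such as %domain% for later
substitution, and does not handle nested definitions or invocations.

```python
import re


def expand_macros(lines):
    """Collect %macro ... %endmacro blocks and expand their invocations."""
    macros, output, current = {}, [], None
    for line in lines:
        words = line.split()
        head = words[0].lower() if words else ""
        if current is not None:              # inside a macro definition
            if head.startswith("%endmacro"):
                current = None
            else:
                macros[current][1].append(line)
            continue
        if head.startswith("%macro"):
            current = words[1].lower()
            macros[current] = (words[2:], [])   # (parameters, body lines)
            continue
        name = head[1:-1] if len(head) > 2 and head[0] == head[-1] == "%" else ""
        if name in macros:
            params, body = macros[name]
            # Temporary bindings that exist only for this expansion.
            bindings = dict(zip(params, words[1:]))
            for body_line in body:
                output.append(re.sub(
                    r"%([\w-]+)%",
                    lambda m: bindings.get(m.group(1), m.group(0)),
                    body_line))
            continue
        output.append(line)
    return output


config = [
    "%macro node-access node manager",
    "# access rules for %node%",
    "open ssh from %domain%",
    "open tcp 0:1023 from %manager%",
    "%endmacro",
    "%node-access% node17 mgmt12",
]
for out_line in expand_macros(config):
    print(out_line)
```

Only the parameter names are replaced during expansion; %domain% passes
through unchanged so that an ordinary constant or variable substitution
can resolve it afterward.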
Conditional Blocks
Text blocks or single lines can be conditionally included in the
output based on any of the following types of conditional expressions:
%ifdef name            true if the constant/variable is defined
%ifndef name           true if the constant/variable is undefined
%if string = string    true if the strings match
%if string != string   true if the strings do not match
%if string             true if the string is non-blank
Conditionals can either surround a block of lines or only affect one
line. If the conditional expression is at the beginning of a line,
the lines following it until a line starts with %endif are included
in the output if the condition is true. If a conditional expression
appears at the end of a line, that line is included in the output
only if the condition is true. The conditional expressions and any
%endif lines
are not included in the output.
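Both conditional forms, the block form and the end-of-line form, can be
modeled with a small Python sketch. It assumes that constants and
variables have already been substituted, that the set of defined names
is passed in separately, and that blocks are not nested; the function
names are hypothetical.

```python
CONDITIONALS = ("%ifdef", "%ifndef", "%if")


def condition_true(words, defined):
    """Evaluate one conditional given as a list of words."""
    kind, args = words[0].lower(), words[1:]
    if kind == "%ifdef":
        return args[0] in defined
    if kind == "%ifndef":
        return args[0] not in defined
    if "!=" in args:
        i = args.index("!=")
        return args[:i] != args[i + 1:]
    if "=" in args:
        i = args.index("=")
        return args[:i] == args[i + 1:]
    return bool(args)                      # true if the string is non-blank


def apply_conditionals(lines, defined):
    output, skipping = [], False
    for line in lines:
        words = line.split()
        head = words[0].lower() if words else ""
        if head == "%endif":
            skipping = False
            continue
        if skipping:
            continue
        if head in CONDITIONALS:           # block form
            skipping = not condition_true(words, defined)
            continue
        for i, word in enumerate(words):   # end-of-line form
            if word.lower() in CONDITIONALS:
                if condition_true(words[i:], defined):
                    output.append(" ".join(words[:i]))
                break
        else:
            output.append(line)
    return output


lines = [
    "%ifdef webserver",
    "open http",
    "%endif",
    "open ssh",
    "open dhcp %if node1 = headnode",
    "open ping %if node1 != headnode",
]
print(apply_conditionals(lines, defined=set()))
print(apply_conditionals(lines, defined={"webserver"}))
```

As the text states, neither the conditional expressions nor the %endif
lines appear in the result.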
Protocols, Ports, and Network Services
Some directives include lists of protocols, ports, or network
services. These lists can include any of the following, separated
by spaces or tab characters: a protocol followed by one or more
port numbers, port ranges, or network service names; or a network
service name without a preceding protocol.
Network service names are defined in four ways. The first three
are:
1. A matrix of protocols, ports, and matching service names is
parsed from the /etc/services system file if it exists.
2. Some symbolic names are defined by the iptables and ipchains
commands (though their use is not advisable, since newer versions
of those commands might change or delete symbolic names).
3. Network services can be defined in either the main firewall
configuration file or in one of the firewall ruleset files as a
simple constant. For example, scattered in the ruleset files are
these lines:
%define service-x-protocols-ports tcp udp/6000:6063
%define service-ping-protocols-ports icmp/8
%define service-ssh-protocols-ports tcp/22
This defines the network service x as responding to both TCP and UDP
ports ranging from 6000 to 6063, defines the network service ping
as responding to ICMP packets of type 8, and defines the network service
ssh to respond to TCP port 22. A service name defined in this way
overrides any definitions in the /etc/services file. In the case of
the SSH ruleset entry, the /etc/services lines that define the SSH
service on both TCP and UDP port 22 are overridden. Thus, when the
configuration file says:
open ssh
it will only open TCP port 22 and not open UDP port 22. The fourth
way a network service can be defined is with a pair of PFILTER macro
definitions. This allows network services to be opened or closed with
shell script fragments or to do something that isn't a list of
TCP and/or UDP ports or ICMP types. For example, one of the ruleset
files includes these segments:
%macro service-multicast-open source destination
# Let all multicast packets through from %source%.
# The destination is always 224.0.0.0-239.255.255.255.
# This method is used because multicast packets are
# identified by their destination address.
%open_protocol_port% %source% 224.0.0.0/4 ANY ANY
%endmacro
%macro service-multicast-close source destination
# Block all multicast packets from %source%.
# The destination is always 224.0.0.0-239.255.255.255.
# This method is used because multicast packets are
# identified by their destination address.
%close_protocol_port% %source% 224.0.0.0/4 ANY ANY
%endmacro
This segment defines how to open or close multicast packets going
through the firewall. If this line is put in the /etc/pfilter.conf
configuration file:
open multicast from mydomain.com
then the compiled output script will include directives that will
accept packets from mydomain.com going to anywhere in the address
range 224.0.0.0/4.
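The equivalence between the 224.0.0.0/4 prefix and the address range
quoted in the macro comments can be checked with Python's standard
ipaddress module. This is only a verification aid; the test addresses
are arbitrary.

```python
from ipaddress import ip_address, ip_network

multicast = ip_network("224.0.0.0/4")

# First and last addresses covered by the /4 prefix.
print(multicast.network_address, multicast.broadcast_address)

# Membership tests for one address inside and one outside the range.
print(ip_address("239.1.2.3") in multicast)
print(ip_address("240.0.0.1") in multicast)
```

The first line of output shows 224.0.0.0 and 239.255.255.255, which
matches the range given in the ruleset comments.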
Network Addresses
Network addresses can be specified as simple IP addresses, IP
address ranges, DNS host names, or address ranges that include DNS
host names.
Directives to Open/Close Access
To allow or block incoming network connections, these directives
are used:
OPEN protocols-ports-services [from source(s)] [to destination(s)]
CLOSE protocols-ports-services [from source(s)] [to destination(s)]
If no source addresses are specified, incoming connections to the
specified protocols-ports and/or services are allowed (or blocked)
from any address. Absent any destination addresses, the connections
are allowed or blocked to the firewall machine. Destination addresses
can be specified for other machines if the firewall machine is receiving
and forwarding packets to the other machines.
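The defaulting behavior of OPEN and CLOSE can be made concrete with a
small parser sketch in Python. The parse_access function and the
placeholder values "ANY" and "FIREWALL" are hypothetical and are used
only to show which default applies when a clause is omitted.

```python
def parse_access(line):
    """Split an OPEN/CLOSE directive into services, sources, destinations."""
    words = line.split()
    action = words[0].upper()              # directives are case-insensitive
    if action not in ("OPEN", "CLOSE"):
        raise ValueError("not an OPEN or CLOSE directive: " + line)
    parts = {"services": [], "from": [], "to": []}
    section = "services"
    for word in words[1:]:
        if word.lower() in ("from", "to"):
            section = word.lower()
        else:
            parts[section].append(word)
    return {
        "action": action,
        "services": parts["services"],
        # No FROM clause: connections from any address are affected.
        "sources": parts["from"] or ["ANY"],
        # No TO clause: the rule applies to the firewall machine itself.
        "destinations": parts["to"] or ["FIREWALL"],
    }


print(parse_access("open ssh from mydomain.com"))
print(parse_access("CLOSE tcp 6000:6063 to node1 node2"))
```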
Directives for Interface Attributes
The following directives specify attributes of network interfaces.
Each line consists of one or more of the attribute keywords below,
followed by one or more network interface names:
FILTERED or UNTRUSTED -- All network connection attempts coming
from these interfaces are packet-filtered to determine whether the
connections should be allowed.
UNFILTERED or TRUSTED -- All network connection attempts coming
from these interfaces are allowed, without any packet filtering.
PRIVATE or PROTECTED -- Interfaces thus marked are presumed
to be on a private network that is protected by the firewall machine.
IP masquerading and NAT are applied to all outgoing connections
from these networks.
PUBLIC or UNPROTECTED -- Interfaces so marked are not protected
behind the firewall machine.
Directives to Set Up Host Aliasing
To set up host aliasing, where a machine on a protected network
is to be partially accessible on a public network, use this type
of directive:
ALIAS pseudo real [ports-protocols-services] [from source[s]]
For example, suppose the SSH server on a machine named "private1"
on a protected network must be accessible from machines in mydomain.com
(which has a 16-bit mask) and must appear to be at the public address
"public1." The following line could be used:
ALIAS public1 private1 ssh from mydomain.com/16
Ports/protocols and/or services to be passed through can also be
listed on a separate line from the one that defines the ALIAS mapping.
The line above could also be written as:
ALIAS public1 private1
OPEN ssh from mydomain.com/16
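For readers curious about what host aliasing looks like at the
packet-filtering level, the Python sketch below builds one iptables
destination-NAT command of the general kind such a mapping requires.
The article does not show PFILTER's compiled output, so this is an
illustration rather than a reproduction of it; the alias_rule function
and all addresses are assumptions, and the public address is presumed
to be configured on the firewall as a pseudo-interface.

```python
def alias_rule(public_ip, private_ip, proto, port, source=None):
    """Build one iptables DNAT command for an ALIAS-style mapping."""
    parts = ["iptables", "-t", "nat", "-A", "PREROUTING"]
    if source is not None:
        parts += ["-s", source]            # restrict who may connect
    parts += [
        "-d", public_ip,                   # address the outside world sees
        "-p", proto, "--dport", str(port),
        "-j", "DNAT", "--to-destination", private_ip,
    ]
    return " ".join(parts)


# Hypothetical addresses standing in for public1, private1, and the
# allowed source network.
print(alias_rule("198.51.100.10", "10.0.0.11", "tcp", 22,
                 source="192.0.2.0/24"))
```

A complete hand-written setup would also need forwarding rules and
return-path handling, which is exactly the "glue" that a firewall
compiler is meant to generate.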
Directives for Packet Forwarding
To set up packet forwarding/rewriting, the following line could
be included:
FORWARD protocols-ports-services [FROM source[s]] \
[TO destination[s]] ONTO forwarded-destination-address \
[forwarded-protocols-ports-services]
A list of protocols/ports and/or services, possibly matched by where
they are coming from or going to, will be translated to go to the
forwarded-destination-address, possibly also to a different group
of protocols/ports/services.
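The Web proxy case mentioned earlier is a common use of this kind of
rewriting. As a rough illustration of the underlying mechanism, the
Python sketch below builds an iptables REDIRECT command that sends Web
traffic arriving on an internal interface to a proxy port on the
firewall itself. It is not PFILTER output; the redirect_rule function,
the interface name, and the proxy port are assumptions.

```python
def redirect_rule(in_interface, proto, dport, local_port):
    """Build an iptables command that redirects matching packets
    to a port on the local machine."""
    return " ".join([
        "iptables", "-t", "nat", "-A", "PREROUTING",
        "-i", in_interface,
        "-p", proto, "--dport", str(dport),
        "-j", "REDIRECT", "--to-ports", str(local_port),
    ])


# Hypothetical values: eth1 faces the private network, and a proxy
# listens on local port 3128.
print(redirect_rule("eth1", "tcp", 80, 3128))
```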
Typical Cluster Firewall Example
As an example, consider a small cluster with one head node and
several compute nodes, with the head node and all the compute nodes
exposed on the general network. Suppose that the requirements for
outside network access to the cluster are that every cluster node
be pingable and allow SSH access, with the head node also being
a Web server. Further suppose that any access to the outside network
from within the cluster is allowed, and that any access between
cluster nodes is allowed. The following PFILTER configuration file
on the head node and on every compute node would accomplish this:
# define the main cluster server and the compute nodes
# (assume they are defined in all hosts' /etc/hosts files)
%define server headnode.cluster19.domain.org
%define nodes node1.cluster19.domain.org \
node2.cluster19.domain.org \
node3.cluster19.domain.org
# We don't trust anyone anywhere on any interface by default
untrusted interfaces all
# Be nice and reject, rather than drop, unwanted packets
reject
# let anyone ping or ssh into the cluster
open ping ssh
# the server gets http opened up
open tcp http https %if %hostname% = %server%
# the server needs to be listed as a dhcp server for the nodes
# because opening up that service requires opening up some
# broadcast stuff as well, so simply listing the nodes as
# trusted is not sufficient
open dhcp to %nodes% %if %hostname% = %server%
# the server and every compute node trust each other
trusted %server% %nodes%
Conclusion
The first method of network security to be installed should be
packet filtering. This will block access from outside the cluster
to all but needed services, while allowing the cluster to access
the outside freely. With tools such as PFILTER, packet-filtering
firewalls with ipchains/iptables rulesets can be generated in a
straightforward manner.
Neil Gorsuch has worked as a software consultant in the process
control industry, co-founded a computer peripherals company where
he designed hardware, firmware, and host software, and worked for
Motorola designing cell phone firmware. He currently works at the
NCSA in the cluster development group writing clustering software
and cluster security systems. He is also a member of the OSCAR core
development team.