Industrial Strength Cluster Security for an Open Source Price

Neil Gorsuch

Computer security is big business. Worldwide annual revenue of the VPN/Firewall market was $2.7 billion in 2002 (Source: Infonetics Research, Inc.). Upfront licensing costs for proprietary industrial strength cluster security solutions can range from thousands to hundreds of thousands of dollars depending on cluster size. However, use and deployment of open source solutions can reduce the total cost of ownership. This paper describes the deployment of an industrial strength open source firewall solution based on an easily configurable packet-filtering compiler system for clusters.

Stateful packet-filtering firewalls can provide excellent security from network attacks, but are difficult at best to set up and maintain. When packet filtering is combined with packet forwarding, NAT'ing, and pseudo-interfaces, a single machine can provide firewall protection for a private network of machines. Thus, protected machines have complete access to the general networks and visibility at general network addresses, while maintaining their firewall protection. A configurable packet-filtering compiler system for clusters can provide all these benefits.

Introduction

Total cost of ownership (TCO) is often overlooked in designing and deploying large-scale computing solutions. A potentially significant percentage can be added to the TCO when security costs, both initial and ongoing, are factored in. Industrial strength cluster security solutions can entail upfront licensing costs that quickly multiply when proprietary security solutions are licensed on a per-node basis (16x, 32x, 64x, 128x...). Using and deploying open source solutions can lower the TCO by eliminating the software licensing cost component of the equation.

The NCSA Cluster and Software Tools Group, located at the University of Illinois at Urbana-Champaign, is a pioneer in the area of cluster design and deployment. A founding member of the Open Cluster Group (http://www.openclustergroup.org) and the Gelato Federation (http://www.gelato.org), the group is working in partnership with other institutions to develop software (OSCAR and GOLD) that greatly simplifies the task of installing and running parallel Linux clusters that are compatible with large-scale production systems. A key component of the group's work is cluster security. This article describes its work on an industrial strength open source firewall solution based on a configurable packet-filtering compiler system.

Security Layers

For effective cluster security, multiple security layers should be provided. Cluster network security layers should ideally consist of: router filtering, network stack protections, IP masquerading and NAT'ing, packet filtering, disabling unused network services, TCPwrappers, and configuring applications.

Packet filtering is the process of examining each network packet as it comes into or through a device, and either allowing, dropping, or rejecting the packet based on various factors: source and destination addresses, port numbers, and whether the packet initiates a new connection attempt. Packet filtering can block all incoming network connection attempts except those explicitly allowed. This prevents computer configuration mistakes such as allowing an insecure network service to be accessed by mistake.
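
The decision process described above can be sketched in a few lines of Python. This is only an illustration of first-match filtering, not how any kernel implements it; the Packet and Rule types, the field names, and the addresses are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src: str     # source address
    dst: str     # destination address
    dport: int   # destination port
    new: bool    # True if the packet initiates a new connection


@dataclass(frozen=True)
class Rule:
    action: str                  # "allow", "drop", or "reject"
    dst: Optional[str] = None    # None matches any destination
    dport: Optional[int] = None  # None matches any port
    new_only: bool = False       # match only connection-initiating packets

    def matches(self, pkt: Packet) -> bool:
        if self.dst is not None and pkt.dst != self.dst:
            return False
        if self.dport is not None and pkt.dport != self.dport:
            return False
        if self.new_only and not pkt.new:
            return False
        return True


def filter_packet(rules, pkt, default="allow"):
    """Return the action of the first rule that matches, else the default."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.action
    return default


# Hypothetical policy: SSH to one host is allowed, and every other
# incoming connection attempt is rejected.
RULES = [
    Rule("allow", dst="10.0.0.1", dport=22, new_only=True),
    Rule("reject", new_only=True),
]

print(filter_packet(RULES, Packet("192.0.2.7", "10.0.0.1", 22, True)))  # allow
print(filter_packet(RULES, Packet("192.0.2.7", "10.0.0.1", 23, True)))  # reject
```

Because the last rule rejects every connection attempt that was not explicitly allowed, a service started by mistake on port 23 stays unreachable from outside.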

Stateful packet filtering keeps track of each network communication sequence of packets and provides for simpler, more secure firewalls that are easier to set up and maintain.

All cluster machines that receive packets from outside the cluster should have their own packet filtering installed, or a common firewall machine should filter all packets entering and leaving the cluster.

Setting up a firewall directly with packet-filtering commands is difficult -- comparable to writing programs in assembly language. Systems administrators need a method to set up and maintain good packet-filtering firewalls without having to learn the intricacies of packet-filtering commands.

Packet Re-Writing

Linux and other operating systems can modify network packets as they pass through a system. This rewriting is a more general form of NAT, and it makes possible the host aliasing and packet redirection described in this section. Since the process is controlled through the packet-filtering rules, a packet-filtering compiler must control it.

Host Aliasing

Consider a firewall machine with multiple network interfaces that is acting as an IP-masquerading firewall for one of the interfaces. In some cases, it is advantageous to access one or more of the machines on the private network through a public address. If more than one of those machines must be reached through the same type of service, the firewall's single public address is not enough: each such machine must appear as if it were itself on the public network, with only outside connections to the desired services passed on to the machine's private address. This allows one firewall machine to act as a gateway for several hidden machines, and all firewalling and packet-filtering protection can be set up on one machine.

Packet Redirection or Forwarding

Sometimes packets must be rewritten and sent to a different destination than intended (e.g., having all outgoing packets destined for Web servers redirected to a proxy Web server on a local machine).
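
Host aliasing and redirection both come down to replacing a packet's destination according to a lookup table. The following Python sketch shows the idea for host aliasing only. The Packet type, the ALIASES table, and every address in it are hypothetical; a real firewall performs this rewriting inside the kernel.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    dst: str     # destination address
    dport: int   # destination port


# Hypothetical alias table: (public address, port) -> private address.
# Two hidden machines both offer SSH, each behind its own public address.
ALIASES = {
    ("192.0.2.10", 22): "10.0.0.1",
    ("192.0.2.11", 22): "10.0.0.2",
}


def rewrite_destination(pkt, aliases):
    """Return a copy of pkt aimed at the private host, or None if no alias applies."""
    private = aliases.get((pkt.dst, pkt.dport))
    if private is None:
        return None  # not an aliased service; nothing is passed on
    return replace(pkt, dst=private)


print(rewrite_destination(Packet("192.0.2.10", 22), ALIASES))
print(rewrite_destination(Packet("192.0.2.10", 80), ALIASES))
```

Only the listed services are passed on, so the hidden machines expose nothing else through their public aliases.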

Packet-Filtering Details

Linux kernels use "chains" of packet-filtering rules to filter packets. Each filtering rule has a set of matching conditions and an action to take if the packet matches. Packets can be matched by source address, destination address, source port, destination port, TCP flags, type, MAC address, length, throttling limits, specific byte patterns, and user id of locally generated packets. Some chains are predefined and always exist; others can be defined or deleted by the user. Rules can be added to any chain, and any chain can have all its rules flushed from it. Packets can traverse more than one chain.

Linux version 2.4 and later kernels support integrated connection tracking and stateful inspection. This adds more ways to match packets, based on whether they are new packets, packets that are part of an existing connection, or packets logically related to an existing connection. Besides greatly increasing security, this also vastly simplifies filtering rulesets.
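
A toy model helps show why connection tracking shortens rulesets. In the Python sketch below, one generic rule accepts anything belonging to a known connection, so only new connections need service-specific rules. The class and function names are invented for this example, and "related" packets are left out for brevity; the real tracking logic lives in the kernel and is far more involved.

```python
class ConnectionTracker:
    """Toy connection table keyed by (src, sport, dst, dport)."""

    def __init__(self):
        self._established = set()

    def classify(self, src, sport, dst, dport, new):
        key = (src, sport, dst, dport)
        reverse = (dst, dport, src, sport)  # reply direction of the same connection
        if key in self._established or reverse in self._established:
            return "ESTABLISHED"
        if new:
            return "NEW"
        return "INVALID"  # belongs to no known connection and starts none

    def remember(self, src, sport, dst, dport):
        self._established.add((src, sport, dst, dport))


def stateful_filter(tracker, allowed_new_ports, src, sport, dst, dport, new):
    """Allow known connections; allow new ones only to explicitly opened ports."""
    state = tracker.classify(src, sport, dst, dport, new)
    if state == "ESTABLISHED":
        return "allow"
    if state == "NEW" and dport in allowed_new_ports:
        tracker.remember(src, sport, dst, dport)
        return "allow"
    return "drop"


tracker = ConnectionTracker()
print(stateful_filter(tracker, {22}, "client", 4000, "node1", 22, True))   # allow
print(stateful_filter(tracker, {22}, "node1", 22, "client", 4000, False))  # allow
print(stateful_filter(tracker, {22}, "client", 4001, "node1", 23, True))   # drop
```

The reply from node1 is accepted without any rule that mentions port 4000, which is the kind of bookkeeping a stateless ruleset has to spell out by hand.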

Scripts to control packet filtering are difficult to set up. Cluster administrators would like to specify "block every incoming connection attempt except for SSH to machines 1 and 2, and allow all cluster machines to access the outside network." To do this safely requires a script of dozens of lines and many commands, with various network stack parameters written into the /proc directory tree. Packet-filtering scripts are akin to an assembly language. For things to run correctly, "glue" code must be added.

What systems administrators need is a method to "compile" packet-filtering firewall scripts that requires no knowledge of packet-filtering configuration commands. Given this need, we decided to implement a packet-filtering compiler.

PFILTER -- A Firewall Compiler

The PFILTER (Packet FILTER) firewall compiler was developed and implemented in the Perl language. Better interpreted languages are available, but none are as widespread as Perl. The main PFILTER program is installed as a Perl executable program at /usr/sbin/pfilter. The remainder of the executable program is stored as included Perl files in the /usr/lib/pfilter directory. Included files were chosen instead of the more traditional Perl modules to ensure that PFILTER functions reside in a more secure directory, and because the functions are very specific to PFILTER and generally not useful to other programs. PFILTER is an open source project hosted on SourceForge; see:

http://sourceforge.net/projects/pfilter/

Ruleset Files

The PFILTER executable program is designed to do as little as possible. Instead, PFILTER uses built-in files called ruleset files to do most of the work of compiling output scripts. The ruleset files are editable text files that can be modified or added to by systems administrators. Conditional text blocks and macros are heavily used.

Most of the "glue" code in the compiled output is specified in the ruleset files, allowing for tremendous flexibility and support for many types of Unix variants. New types of network services to be filtered on or off are defined in ruleset files. Thus, systems administrators can add new types of network services to be supported.

Because network services to be filtered on or off are defined by macros, shell script fragments can be embedded in the compiled output code. This makes it possible to filter services such as NFS that allocate port numbers dynamically. Because of the built-in constants, variables, conditions, and macros, the main firewall configuration file can be quite sophisticated, allowing the same configuration file to be used throughout a cluster.

Compiled output scripts are optimized: redundant or impossible combinations of packet-filtering sources and destinations are not written to the output scripts. The compiled output scripts include verbose comments explaining what each script line does. The configuration source lines that cause generation of output script lines are included as comments immediately above their generated output.

The PFILTER Firewall Language

The PFILTER firewall configuration is defined in the /etc/pfilter.conf file. Comments start with either the # or % characters and can be complete lines. Comments that start with the % character are not copied to the compiled output script, while comments starting with the # character are copied when appropriate. All directives and keywords are case-insensitive.

Constants and Variables

Named constants are defined like this:

%constant%	constant_name    strings ...

A constant's value will be set to everything after the name but not including any comments at the end of the line. Trying to redefine a constant with a new value produces an error. Variables are defined as:

%variable%	variable_name    strings ...

Variables can be re-defined any number of times. To substitute a constant or variable value, simply insert it with % characters on each side. For example, these lines:

%variable%    var        b    c
%constant%    const      d    e    f
a    %var%    %const%    g

will expand to:

a    b    c    d    e    f    g

A number of constants are defined by PFILTER during each compilation.
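
The substitution rules above can be modeled in a few lines of Python. This is a sketch of the described behavior only, not PFILTER's Perl implementation. The Symbols class is invented for the example, and two details are assumptions: names may contain letters, digits, underscores, and hyphens, and unknown names are left untouched.

```python
import re

# Assumed name syntax: letters, digits, underscore, and hyphen between % signs.
_TOKEN = re.compile(r"%([A-Za-z0-9_-]+)%")


class Symbols:
    """Constants and variables as described in the text."""

    def __init__(self):
        self.values = {}
        self.constants = set()

    def define_variable(self, name, value):
        if name in self.constants:
            raise ValueError(f"{name} is already a constant")
        self.values[name] = value  # variables may be redefined freely

    def define_constant(self, name, value):
        # Redefining a constant with a new value is an error.
        if name in self.constants and self.values[name] != value:
            raise ValueError(f"constant {name} redefined with a new value")
        self.values[name] = value
        self.constants.add(name)

    def expand(self, line):
        # Replace each %name% with its value; unknown names stay as written.
        return _TOKEN.sub(lambda m: self.values.get(m.group(1), m.group(0)), line)


symbols = Symbols()
symbols.define_variable("var", "b c")
symbols.define_constant("const", "d e f")
print(symbols.expand("a %var% %const% g"))  # a b c d e f g
```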

Macros

PFILTER supports macros with named parameters. When a macro is expanded, temporary variables matching the macro's parameter names are created just for that expansion block. Macros are heavily used when generating the compiled firewall output script. For example:

%macro%      compute-node     management-node
# compute node compute-node
open    ssh    from    %domain%
open    tcp    1024:65535    from    %cluster%
open    tcp    0:1023    from   %management-node%
%endmacro

%compute-node%    node17        mgmt12

would expand to:

# compute node node17
open    ssh    from    %domain%
open    tcp    1024:65535    from    %cluster%
open    tcp    0:1023    from    mgmt12

Macro definitions and invocations can be nested.
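
The way parameters become temporary variables during an expansion can be sketched in Python as follows. The Macro class and the parameter names "node" and "manager" are hypothetical and chosen only for this illustration; this is not PFILTER's actual implementation.

```python
import re

_TOKEN = re.compile(r"%([A-Za-z0-9_-]+)%")


class Macro:
    """A named block of lines with named parameters."""

    def __init__(self, name, params, body):
        self.name = name
        self.params = list(params)
        self.body = list(body)

    def expand(self, args):
        if len(args) != len(self.params):
            raise ValueError(f"{self.name} expects {len(self.params)} arguments")
        # Temporary bindings that exist only for this one expansion.
        bindings = dict(zip(self.params, args))
        return [
            _TOKEN.sub(lambda m: bindings.get(m.group(1), m.group(0)), line)
            for line in self.body
        ]


compute = Macro(
    "compute",
    ["node", "manager"],
    [
        "# compute node %node%",
        "open ssh from %domain%",
        "open tcp 0:1023 from %manager%",
    ],
)

for line in compute.expand(["node17", "mgmt12"]):
    print(line)
```

Names that are not parameters, such as %domain% here, are left in place so that the ordinary constant and variable substitution can resolve them afterward.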

Conditional Blocks

Text blocks or single lines can be conditionally included in the output based on any of the following types of conditional expressions:

%ifdef   name              is true if constant/variable is defined
%ifndef  name              is true if constant/variable is undefined
%if      string = string   is true if strings match
%if      string != string  is true if strings do not match
%if      string            is true if the string is non-blank

Conditionals can either surround a block of lines or affect only one line. If the conditional expression is at the beginning of a line, the lines following it, up to a line that starts with %endif, are included in the output if the condition is true. If the conditional expression is at the end of a line, that line is included in the output if the condition is true. The conditional expressions and any %endif lines are not included in the output.
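
Both forms of conditional can be modeled with a short Python sketch. It assumes that constants and variables have already been substituted, so the expressions compare plain strings, and it assumes the negated keyword is spelled %ifndef. The function names are invented for the example and do not come from PFILTER.

```python
import re

_BLOCK_START = re.compile(r"^%if(?:def|ndef)?\b", re.IGNORECASE)
_TRAILING = re.compile(r"^(.*?\S)\s+(%if(?:def|ndef)?\b.*)$", re.IGNORECASE)


def evaluate(condition, defined):
    """Evaluate one conditional expression; `defined` holds the defined names."""
    parts = condition.split(None, 1)
    keyword = parts[0].lower()
    rest = parts[1].strip() if len(parts) > 1 else ""
    if keyword == "%ifdef":
        return rest in defined
    if keyword == "%ifndef":
        return rest not in defined
    if keyword == "%if":
        if "!=" in rest:
            left, right = rest.split("!=", 1)
            return left.strip() != right.strip()
        if "=" in rest:
            left, right = rest.split("=", 1)
            return left.strip() == right.strip()
        return rest != ""  # true if the string is non-blank
    raise ValueError(f"not a conditional: {condition!r}")


def apply_conditionals(lines, defined):
    """Return the lines that survive block and end-of-line conditionals."""
    output = []
    stack = []  # truth values of the enclosing conditional blocks
    for line in lines:
        stripped = line.strip()
        if stripped.lower().startswith("%endif"):
            stack.pop()
            continue
        if _BLOCK_START.match(stripped):
            stack.append(evaluate(stripped, defined))
            continue
        if not all(stack):
            continue  # inside a block whose condition is false
        trailing = _TRAILING.match(stripped)
        if trailing:
            if evaluate(trailing.group(2), defined):
                output.append(trailing.group(1))
            continue
        output.append(stripped)
    return output


config = [
    "%ifdef cluster",
    "open ssh",
    "%endif",
    "open http    %if head = head",
    "open dhcp    %if node1 = head",
]
print(apply_conditionals(config, {"cluster"}))  # ['open ssh', 'open http']
```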

Protocols, Ports, and Network Services

Some directives include lists of protocols, ports, or network services. These lists can include any of the following, separated by spaces or tab characters: a protocol followed by one or more port numbers, port ranges, or network service names; or a network service name without a preceding protocol.

Network service names are defined in four ways. The first three are:

1. A matrix of protocols, ports, and matching service names is parsed from the /etc/services system file if it exists.

2. Some symbolic names are defined by the iptables and ipchains commands (though their use is not advisable, since newer versions of those commands might change or delete symbolic names).

3. Network services can be defined in either the main firewall configuration file or in one of the firewall ruleset files as a simple constant. For example, scattered in the ruleset files are these lines:

%define service-x-protocols-ports tcp udp/6000:6063
%define service-ping-protocols-ports icmp/8
%define service-ssh-protocols-ports tcp/22

This defines the network service x as responding to both TCP and UDP ports ranging from 6000 to 6063, defines the network service ping as responding to ICMP packets of type 8, and defines the network service ssh as responding to TCP port 22. A service name defined in this way overrides any definitions in the /etc/services file. In the case of the SSH ruleset entry, the /etc/services lines that define the SSH service as both TCP and UDP port 22 are overridden. Thus, when the configuration file says:

open ssh

it will only open TCP port 22 and not open UDP port 22.

The fourth way a network service can be defined is with a pair of PFILTER macro definitions. This allows network services to be opened or closed with shell script fragments, or to do something that cannot be expressed as a list of TCP and/or UDP ports or ICMP types. For example, one of the ruleset files includes these segments:

%macro service-multicast-open source destination
# Let all multicast packets through from %source%.
# The destination is always 224.0.0.0-239.255.255.255.
# This method is used because multicast packets are
# identified by their destination address.
  %open_protocol_port% %source% 224.0.0.0/4 ANY ANY
%endmacro

%macro service-multicast-close source destination
# Block all multicast packets from %source%.
# The destination is always 224.0.0.0-239.255.255.255.
# This method is used because multicast packets are
# identified by their destination address.
 %close_protocol_port% %source% 224.0.0.0/4 ANY ANY
%endmacro

This segment defines how to allow or block multicast packets going through the firewall. If this line is put in the /etc/pfilter.conf configuration file:

open    multicast    from    mydomain.com

then the compiled output script will include directives that will accept packets from mydomain.com going to anywhere in the address range 224.0.0.0/4.
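
For the simpler constant-style service definitions shown earlier (for example, "tcp udp/6000:6063"), the following Python sketch shows one way to read such a string and to apply the rule that a ruleset definition overrides /etc/services. The exact grammar is an assumption inferred from the three examples in this article: protocol names before the slash, a port number, port range, or ICMP type after it. The function names and the sample data are hypothetical.

```python
def parse_protocols_ports(definition):
    """Split 'tcp udp/6000:6063' into [('tcp', '6000:6063'), ('udp', '6000:6063')]."""
    protocols, slash, ports = definition.partition("/")
    if not slash or not protocols.strip() or not ports.strip():
        raise ValueError(f"cannot parse service definition: {definition!r}")
    return [(protocol, ports.strip()) for protocol in protocols.split()]


def resolve_service(name, ruleset_definitions, etc_services):
    """Prefer a ruleset definition; fall back to entries read from /etc/services."""
    if name in ruleset_definitions:
        return parse_protocols_ports(ruleset_definitions[name])
    return etc_services.get(name, [])


# Sample data mirroring the definitions quoted in the text.
RULESET = {
    "x": "tcp udp/6000:6063",
    "ping": "icmp/8",
    "ssh": "tcp/22",
}
# What /etc/services typically lists for ssh: both TCP and UDP port 22.
ETC_SERVICES = {"ssh": [("tcp", "22"), ("udp", "22")]}

print(resolve_service("ssh", RULESET, ETC_SERVICES))  # [('tcp', '22')]
print(resolve_service("x", RULESET, ETC_SERVICES))
```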

Network Addresses

Network addresses can be specified as simple IP addresses, IP address ranges, DNS host names, or address ranges that include DNS host names.
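
Address ranges written with a prefix length, such as the 224.0.0.0/4 multicast block used earlier, can be checked with Python's standard ipaddress module. The sketch below is independent of PFILTER. The helper network_for is a hypothetical name, and the host address passed to it stands in for whatever a DNS lookup would return.

```python
import ipaddress

# 224.0.0.0/4 covers exactly the range 224.0.0.0 through 239.255.255.255.
multicast = ipaddress.ip_network("224.0.0.0/4")
print(multicast.network_address, multicast.broadcast_address)

# Membership test for a single address.
print(ipaddress.ip_address("239.1.2.3") in multicast)   # True
print(ipaddress.ip_address("192.0.2.1") in multicast)   # False


def network_for(address, prefix_length):
    """Return the network containing `address` under the given prefix length."""
    # strict=False lets the address carry host bits, as a looked-up host would.
    return ipaddress.ip_network(f"{address}/{prefix_length}", strict=False)


print(network_for("192.0.2.57", 16))  # 192.0.0.0/16
```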

Directives to Open/Close Access

To allow or block incoming network connections, these directives are used:

OPEN protocols-ports-services [from source(s)] [to destination(s)]
CLOSE protocols-ports-services [from source(s)] [to destination(s)]

If no source addresses are specified, incoming connections to the specified protocols-ports and/or services are allowed (or blocked) from any address. Absent any destination addresses, the connections are allowed or blocked to the firewall machine. Destination addresses can be specified for other machines if the firewall machine is receiving and forwarding packets to the other machines.
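
To make the directive layout concrete, here is a small Python sketch that splits an OPEN or CLOSE line into its three parts. It is an illustration only: the function name is invented, tokens are assumed to be separated by whitespace, and the words "from" and "to" are assumed never to appear as service names.

```python
def parse_access_directive(line):
    """Split an OPEN/CLOSE line into action, services, sources, and destinations."""
    tokens = line.split()
    if not tokens or tokens[0].lower() not in ("open", "close"):
        raise ValueError(f"not an OPEN or CLOSE directive: {line!r}")
    parsed = {"action": tokens[0].lower(), "services": [], "from": [], "to": []}
    section = "services"
    for token in tokens[1:]:
        lowered = token.lower()  # keywords are case-insensitive
        if lowered in ("from", "to"):
            section = lowered
        else:
            parsed[section].append(token)
    if not parsed["services"]:
        raise ValueError("at least one protocol, port, or service is required")
    return parsed


print(parse_access_directive("open ssh from mydomain.com/16"))
print(parse_access_directive("CLOSE tcp 0:1023 FROM 192.0.2.0/24 TO node1 node2"))
```

An empty "from" list corresponds to connections from any address, and an empty "to" list to connections addressed to the firewall machine itself, as described above.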

Directives for Interface Attributes

The following directives specify attributes of network interfaces. Each takes one or more of the attribute keywords below, followed by one or more network interface names:

FILTERED or UNTRUSTED -- All network connection attempts coming from these interfaces are packet-filtered to determine whether the connections should be allowed.

UNFILTERED or TRUSTED -- All network connection attempts coming from these interfaces are allowed, without any packet filtering.

PRIVATE or PROTECTED -- Interfaces thus marked are presumed to be on a private network that is protected by the firewall machine. IP masquerading and NAT are applied to all outgoing connections from these networks.

PUBLIC or UNPROTECTED -- Interfaces so marked are not protected behind the firewall machine.

Directives to Set Up Host Aliasing

To set up host aliasing, where a machine on a protected network is to be partially accessible on a public network, use this type of directive:

ALIAS  pseudo   real  [ports-protocols-services] [from source(s)]

For example, suppose the SSH server on a machine named "private1" on a protected network must be accessible from machines in mydomain.com (a network with a 16-bit mask), and must appear to be at the public address "public1". The following line could be used:

ALIAS    public1    private1    ssh from mydomain.com/16

The ports/protocols and/or services to be passed through can also be listed elsewhere than on the line that defines the ALIAS mapping. The line above could also be written as:

ALIAS    public1    private1
OPEN     ssh     from    mydomain.com/16

Directives for Packet Forwarding

To set up packet forwarding/rewriting, the following line could be included:

FORWARD protocols-ports-services [FROM source(s)] \
 [TO destination(s)] ONTO forwarded-destination-address \
 [forwarded-protocols-ports-services]

A list of protocols/ports and/or services, possibly matched by where they are coming from or going to, will be translated to go to the forwarded-destination-address, possibly also to a different group of protocols/ports/services.

Typical Cluster Firewall Example

As an example, consider a small cluster with one head node and several compute nodes, with the head node and all the compute nodes exposed on the general network. Suppose that the requirements for outside network access to the cluster are that every cluster node must respond to ping and allow ssh access, with the head node also acting as a Web server. Further suppose that any access to the outside network from within the cluster is allowed, and that any access between cluster nodes is allowed. The following PFILTER configuration file on the head node and on every compute node would accomplish this:

# define the main cluster server and the compute nodes
# (assume they are defined in all hosts' /etc/hosts files)

%define server headnode.cluster19.domain.org

%define nodes     node1.cluster19.domain.org \
        node2.cluster19.domain.org \
        node3.cluster19.domain.org

# We don't trust anyone anywhere on any interface by default

untrusted  interfaces  all

# Be nice and reject, rather than drop, unwanted packets

reject

# let anyone ping or ssh into the cluster

open    ping    ssh

# the server gets http opened up

open    tcp     http https        %if %hostname% = %server%

# the server needs to be listed as a dhcp server for the nodes
# because opening up that service requires opening up some
# broadcast stuff as well, so simply listing the nodes as
# trusted is not sufficient

open    dhcp    to    %nodes%    %if %hostname% = %server%

# the server and every compute node trust each other

trusted %server% %nodes%

Conclusion

The first method of network security to be installed should be packet filtering. This will block access from outside the cluster to all but needed services, while allowing the cluster to access the outside freely. With tools such as PFILTER, packet-filtering firewalls with ipchains/iptables rulesets can be generated in a straightforward manner.

Neil Gorsuch has worked as a software consultant in the process control industry, co-founded a computer peripherals company where he designed hardware, firmware, and host software, and worked for Motorola designing cell phone firmware. He currently works at the NCSA in the cluster development group writing clustering software and cluster security systems. He is also a member of the OSCAR core development team.