Deploying Modules
Dale Southard
Under most current Unix/X11 systems, the user experience is primarily
controlled by four groups of settings:
- The user's choice of shell interpreter (csh, sh, bash, ...).
- The setting of environment variables like $PATH, $MANPATH,
$INFOPATH, $LD_LIBRARY_PATH, and $LM_LICENSEFILE.
- Shell aliases and macros.
- The X11 resources available through the X11 file search path
or through xrdb.
Typically, these settings are statically configured by "rc" files,
located in system directories like /etc or in the user's home
directory, that are sourced when the user logs in or launches a shell.
For simple environments, this system works well in both standalone
systems and moderate-sized networks.
Unfortunately, in larger, more complex environments that must
serve a broader spectrum of users, a single statically configured
environment can present problems for both users and systems administrators.
Specifically, the following problems affect most large statically
configured environments to varying degrees:
1. Applications are not "version controlled" -- Users are provided
a single version of most common apps. If the app is upgraded, the
newer version is both deployed (made available for use) and
committed (made the default version for all users) at the
same time. This means that updates generally require extensive testing
prior to deployment to ensure that they do not adversely affect
any existing users.
2. Central points of failure -- All application maintainers require
modification privileges on the network "bin" directories and rc
files in order to make their apps available to users. The larger
number of people working with the same directories and files increases
both the opportunity for making mistakes and the effects of those
mistakes on the user community. For example, a single error in a
centralized rc file can interrupt things for all users, even those
whose work-flow was otherwise independent of the application maintained
by the person who made the error.
3. Namespace issues -- Software packages often contain applications
that conflict with the names of applications in other packages.
This can become confusing for users, since resolving the conflict
often means renaming an application in ways that do not match the
documentation. Worse yet, it is easy for application maintainers
to overwrite each other's executables or shell variables when deploying
new packages.
4. Shell dependence -- On well-configured systems, users are free
to choose their shell interpreter based on personal preference.
Unfortunately, many application maintainers, and several commercial
applications, only support their application under a single shell
or family of shells.
The modules package addresses these problems by providing
users with a dynamically reconfigurable environment that is independent
of their choice of shell. Rather than installing all applications
in a common space managed by a single rc file, the use of modules
encourages installation of each version of each application in its
own directory tree (see the sidebar "Installing Applications in
Custom Locations"). It encapsulates the environment changes for
each version of each application in a separate modulefile that the
user can load, unload, or query using the module command.
The modules package itself is open source and available via the
Internet. It was originally described in John L. Furlani's paper
(LISA Conference, 1991) as a set of functions for csh and sh shells,
but the modules package has become a standalone application with
an embedded Tcl/TclX scripting language and now supports most common
shell interpreters, including csh, tcsh, sh, ksh, bash, and zsh, as
well as Perl. Additional features include enhanced logging, modulefile tracing,
and support of hierarchical modulefile organization. Recent releases
also include more advanced configuration options and the ability
to do apropos-style searches of the installed modulefiles.
This article will focus on the basics of using and configuring
modules from the viewpoint of users, systems administrators, and
application maintainers. The examples used in this article are from
a system running modules 2.2.2.4 under IRIX, but were also tested
against release 3.1.6 under Debian Linux. In the examples, the module
files were installed in a directory tree located at /depot/modulefiles,
but sites could easily use other locations.
Modules for Users
The core of the modules system is the module command interpreter.
During shell initialization, a shell alias or macro that calls the
module command is defined. When this macro is called, the module
interpreter runs the specified modules sub-command, which in turn
performs actions based on one or more modulefiles.
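The module command has to be a shell alias or function rather than an ordinary program because a child process cannot change the environment of its parent shell. The sketch below illustrates the general pattern for Bourne-style shells: a helper prints shell commands, and the wrapper evaluates them in the current shell. The helper fake_modulecmd, the DEMO_HOME variable, and the /depot/demo path are hypothetical stand-ins written for this illustration; the real initialization scripts call the modules interpreter instead.

```shell
# Hypothetical stand-in for the modules interpreter. It only PRINTS shell
# code; by itself it cannot change the caller's environment.
fake_modulecmd() {
    # $1 = shell family, $2 = sub-command, $3 = modulefile name
    case "$2" in
        load)   echo "DEMO_HOME=/depot/$3; export DEMO_HOME" ;;
        unload) echo "unset DEMO_HOME" ;;
    esac
}

# The wrapper evaluates the printed code in the current shell, which is
# how the changes reach the user's environment.
module() {
    eval "$(fake_modulecmd sh "$@")"
}

module load demo
echo "after load:   ${DEMO_HOME:-unset}"
module unload demo
echo "after unload: ${DEMO_HOME:-unset}"
```

Running the sketch shows DEMO_HOME set to /depot/demo after the load and unset again after the unload, even though the helper never touches the environment directly.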
From a user's perspective, the most important functions of modules
are provided by the following sub-commands:
- The module list sub-command shows which modules are
currently loaded in the user's shell.
- The module load and module unload sub-commands
are used to make a software package available for use, or to remove
a package and its settings from the user's environment.
- The module swap sub-command is equivalent to a module
unload followed by a module load. It is typically used
to switch between different versions of the same package.
- The module avail sub-command lists all the modulefiles
available for the user to load/unload/swap. This provides a mechanism
for users to "discover" newly upgraded/installed software packages
without resorting to email/Web/paper memos or lists.
The following transcript illustrates use of the avail, list, and
swap modules commands in a simplified modules setup:
myhost% module avail
--------- /depot/modulefiles ---------
MIPSpro purify
MIPSpro/7.2.1.3 purify/2002.05.00
MIPSpro/7.3 totalview
MIPSpro/7.3.1.1 totalview/4.1.0-1
MIPSpro/7.3.1.2 totalview/4.1.0-6
MIPSpro/7.3.1.3 totalview/5.0.0-1
modules totalview/5.0.0-4
modules/2.2.2.4
myhost% module list
Currently Loaded Modulefiles:
1) modules 2) MIPSpro/7.3
myhost% cc -version
MIPSpro Compilers: Version 7.30
myhost% module swap MIPSpro MIPSpro/7.3.1.2
Switching 'MIPSpro/7.3' to 'MIPSpro/7.3.1.2'...ok.
myhost% cc -version
MIPSpro Compilers: Version 7.3.1.2m
The modules package also provides mechanisms for users to incorporate
their own collections of modulefiles, to set the default modules
loaded at login, and to display the changes a modulefile will
make to their environment. The newest release also allows searching
through a whatis-style database for related modulefiles. These
advanced features are beyond the scope of this introductory article.
Modules Initialization
Installing modules for use on a system requires initializing modules
during startup of the user's shell and providing one or more modulefiles
for loading/unloading/swapping. Initialization of modules is done
by adding a stanza like the following to the appropriate cshrc or
profile file:
#
# setup csh for modules and load/unload some modulefiles
#
if (-r /depot/modules/modules/init/csh) then
    source /depot/modules/modules/init/csh
    module load modules
    module load default
    module unload totalview
endif
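Users of Bourne-family shells need an equivalent stanza in their profile. The following is a sketch only: it assumes the sh initialization script is installed beside the csh one, at /depot/modules/modules/init/sh, which should be checked against the local installation.

```shell
#
# Set up sh for modules and load/unload some modulefiles.
# ASSUMPTION: the sh init script lives beside the csh one shown above.
#
if [ -r /depot/modules/modules/init/sh ]; then
    . /depot/modules/modules/init/sh
    module load modules
    module load default
    module unload totalview
fi
```

As in the csh version, the readability test keeps the shell startup from failing on hosts where the modules tree is not mounted.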
When deploying modules, it is important to consider where modules
will be initialized. Two obvious choices are the centralized rc file
for each shell (e.g., /etc/profile for sh), or the user's home directory
rc files (e.g., ~/.cshrc for csh).
Using the centralized rc files is somewhat easier to implement.
It also provides a nice hierarchy for
customizing the modulefiles that users are given by default:
1. A network-wide "default" modulefile contains the modules sub-commands
to load a standard set of modulefiles.
2. Each system can modify this default set of modulefiles via
the addition of modules sub-commands to the same rc files where
modules is initialized (e.g., in the above example the "totalview"
application is unloaded after the default modulefile is loaded).
3. Individual users can add module sub-commands to their startup
files to further customize the starting set of modulefiles in their
account.
Unfortunately, using the centralized rc files for modules initialization
comes with one huge disadvantage -- the centralized rc files are
not sourced when a non-login shell is executed. That means that
subshells launched from within applications like vi or emacs will
not initialize modules and thus will not allow users to utilize
modules commands. Even more confusing for advanced users, launching
remote applications via rsh/ssh will also fail to initialize modules.
Finally, there are some module sub-commands like initadd
and initrm, which will not work correctly unless the modules
initialization is present in the user's home directory rc files.
None of these problems arises if the modules system is initialized
from the rc files in each user's home directory.
Note that it is possible (though not recommended) to initialize
modules in both the centralized and home directory rc files. In
fact, I've used such a setup for several years without problems.
Modules Modulefiles
The modules system uses one modulefile for each version of each
software package that will be managed by modules. The modulefiles
themselves contain the commands necessary for making an application
available to users, including modifying the $PATH or other environment
variables, setting aliases, and making resources available to X11.
In version 2 and later, these modulefiles are written in the Tcl
scripting language and begin with the string "#%Module". The modules
system has provided several extensions to Tcl that are specific
to managing user environments:
- setenv -- Sets an environment variable in the shell.
- prepend-path -- Adds a value at the beginning of a colon-separated
list.
- append-path -- Adds a value at the end of a colon-separated
list.
- set-alias -- Creates an alias using sh-style args.
- x-resource -- Merges a resource into the X11 resource db.
- module -- Allows a modulefile to execute other modules
sub-commands.
- uname -- Provides access to host info on the target system.
The first four commands perform the inverse of their normal function
when called by a module unload sub-command. Most modulefiles
contain fewer than a dozen lines of code. For example, the modulefile
below would suffice for controlling an installation of XEmacs:
#%Module
#
# xemacs 21.4.13 modulefile
#
########## standard defs ##########
set sys [uname sysname]
set os [uname release]
set mach [uname machine]
##### program specific stuff ######
set root "/depot/emacs/xemacs-21.4.13"
prepend-path PATH $root/bin
prepend-path MANPATH $root/man
prepend-path INFOPATH $root/lib/xemacs/info
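To make the load and unload behavior described above concrete, the sketch below mimics in plain sh what a prepend-path line does to a colon-separated list when a modulefile is loaded, and the inverse that is applied when it is unloaded. The helper names path_prepend and path_remove are hypothetical and written for this illustration; they are not part of modules, and the real implementation may differ in details such as duplicate handling.

```shell
# Hypothetical helpers (not part of modules) that mimic the effect of
# prepend-path on load and its inverse on unload.

# path_prepend LIST DIR -> prints DIR:LIST (or just DIR if LIST is empty)
path_prepend() {
    if [ -z "$1" ]; then
        printf '%s\n' "$2"
    else
        printf '%s\n' "$2:$1"
    fi
}

# path_remove LIST DIR -> prints LIST with every entry equal to DIR removed
path_remove() {
    result=
    old_ifs=$IFS
    IFS=:
    for entry in $1; do          # unquoted on purpose: split on colons
        if [ "$entry" != "$2" ]; then
            result="${result:+$result:}$entry"
        fi
    done
    IFS=$old_ifs
    printf '%s\n' "$result"
}

root=/depot/emacs/xemacs-21.4.13
before=/usr/bin:/bin
loaded=$(path_prepend "$before" "$root/bin")
unloaded=$(path_remove "$loaded" "$root/bin")
echo "loaded:   $loaded"      # /depot/emacs/xemacs-21.4.13/bin:/usr/bin:/bin
echo "unloaded: $unloaded"    # /usr/bin:/bin
```

Because the unload step is derived from the same modulefile line, the application maintainer writes each path once and gets both directions.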
The modulefiles are usually gathered into directory trees that are
grouped by application. Within these directories, the modulefiles
are generally named using the version of the application. For example:
/depot/modulefiles/totalview:
4.1.0-1
4.1.0-6
5.0.0-1
5.0.0-4
/depot/modulefiles/MIPSpro:
7.2.1.3
7.3
7.3.1.1
7.3.1.2
7.3.1.3
With a structure like this, modulefiles can be referred to as "application/version".
For example, the command module load totalview/5.0.0-1 would
load the modulefile at /depot/modulefiles/totalview/5.0.0-1.
If the modulefile is not fully qualified (e.g., module load
totalview), modules will load the lexicographically highest
modulefile name (in our example, /depot/modulefiles/totalview/5.0.0-4).
If desired, a ".version" file can be used to override the lexicographical
sort. For example, if the following .version file were in the /depot/modulefiles/totalview
directory listed above, the command module load totalview
would default to the totalview/4.1.0-6 modulefile rather than the
lexicographically higher totalview/5.0.0-4 modulefile:
#%Module
set ModulesVersion 4.1.0-6
Note that the ModulesVersion variable names the modulefile to use as
the default within the directory tree where the .version file was
found; it has nothing to do with the version of modules that
the user is running.
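The selection rule just described can be sketched in sh as follows. The function name default_modulefile is hypothetical, and the sketch assumes a .version file containing a line of the form "set ModulesVersion <value>" and a directory that holds only version-named modulefiles; the actual lookup inside modules handles more cases.

```shell
# default_modulefile DIR -> prints the version that an unqualified
# "module load <application>" would pick under the rules described above.
# Hypothetical helper for illustration; the real lookup lives inside modules.
default_modulefile() {
    dir=$1
    if [ -r "$dir/.version" ]; then
        # Take the value from a line of the form: set ModulesVersion <value>
        version=$(awk '$1 == "set" && $2 == "ModulesVersion" {
                           gsub(/"/, "", $3); print $3; exit
                       }' "$dir/.version")
        if [ -n "$version" ]; then
            printf '%s\n' "$version"
            return 0
        fi
    fi
    # No usable .version file: fall back to the highest name in C-locale
    # order. A plain "ls" skips dot files, so .version is never a candidate.
    ls "$dir" | LC_ALL=C sort | tail -n 1
}

# Rebuild the totalview directory from the example in a scratch location.
demo=$(mktemp -d)
mkdir "$demo/totalview"
for v in 4.1.0-1 4.1.0-6 5.0.0-1 5.0.0-4; do
    : > "$demo/totalview/$v"
done
default_modulefile "$demo/totalview"      # prints 5.0.0-4
printf '#%%Module\nset ModulesVersion 4.1.0-6\n' > "$demo/totalview/.version"
default_modulefile "$demo/totalview"      # prints 4.1.0-6
rm -r "$demo"
```

The fallback sort is purely textual, which is one reason a .version file is useful when version strings do not sort in release order.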
Modules in Practice
Once configured, the modules system provides a mechanism to address
the issues mentioned in the introduction to this article:
1. Applications are "version controlled". Users can switch between
installed versions on the fly. In practice, this vastly simplifies
the process of deploying new software for sys admins. New packages
are simply installed and a new modulefile is created. "Friendly
users" can test the cutting edge versions at their leisure while
those in the trenches can continue to access older versions as long
as needed. No one is ever tied to a "default" version.
2. Application maintainers no longer need modification privileges
on common directories and rc files in order to maintain software.
Instead, they are given two directories: a directory in which to
install their software releases (e.g., /depot/xemacs), and a directory
in which to install their modulefiles (e.g., /depot/modulefiles/xemacs).
Since these directories are not shared with other maintainers, the
impact of their activities is limited to the applications they are
maintaining. Other software packages are unaffected by any errors
that may occur.
3. Namespace issues are under the control of the user. Because
users can control which modules are loaded and in what order, name
conflicts are either eliminated entirely or can be unambiguously
controlled.
4. Modules are shell-independent. A single modulefile will work
for csh, tcsh, sh, ksh, bash, zsh, and Perl users.
Beyond solving these problems, the modules package also provides
a more powerful environment for users and developers. Here are some
examples of setups I've used:
- Providing the environment for the ubiquitous /usr/local tree
as a modulefile allows developers to module unload it prior
to compiling software that shouldn't be linked to non-system libraries.
- Providing a GNU modulefile that prepends the common GNU
commands to the head of the user's search path. This provides
an easy route for GNU/Linux users to experience a familiar environment
even when they are working on a non-GNU system like Solaris or
AIX.
- Providing "meta modulefiles" that use multiple module
commands to load customized environments for different tasks like
development, graphic design, quantum chemistry, or CAD/CAM work.
- Providing multiple, highly customized versions of the same
package to better meet the needs of specific researchers or groups.
This is a common need in quantum chemistry and other disciplines
where researchers need specialized versions of common software
to handle the particulars of their project.
Additionally, the modules system forms the underpinnings of the
OSCAR Linux clustering effort's "env-switcher" application. The
combination of modules plus env-switcher allows users to access
all the power of the underlying modules system for on-the-fly environment
changes, as well as providing a mechanism for user-controlled environment
changes that are persistent across logins and ssh/rsh remote program
invocations.
From my experience, the single largest benefit of modules is the
time the system saves. Because modulefiles allow deploying
applications without committing the entire user base to using
them, tasks that previously required extensive planning, testing,
and administrative overhead to prevent versioning conflicts can
instead be solved immediately by simply installing the software
in parallel under modules control. This is appreciated by the users
as well as the systems administrators and application maintainers.
References
http://modules.sourceforge.net
http://hpcf.nersc.gov/software/os/modules.html
http://env-switcher.sourceforge.net/
Dale Southard is currently a sys admin with the Advanced Simulation
and Computing VIEWS project at Lawrence Livermore National Laboratory
in Livermore, California. He can be reached by email at: dsouth@llnl.gov.
This work was performed under the auspices of the U.S. Department
of Energy by the University of California, Lawrence Livermore National
Laboratory under contract No. W-7405-Eng-48.