The PBS Accounting Toolkit
Rodney Mach
The Portable Batch System (OpenPBS) is a workload management package
used in many high-performance computing environments. OpenPBS reports
a plethora of information in an accounting log file that can be
used by systems administrators for capacity planning, resource reporting,
and performance tuning. Unfortunately, extracting information from
the accounting logs for those purposes has historically been a difficult
task. The example shown in Figure 1 is a typical accounting log
entry for an executed job generated by OpenPBS.
This example provides quite a bit of information. The typical
approach to extracting information from the accounting log is to
first decipher the format and then spend time writing Perl code to
get the information you need. With that approach, however, you spend
more time retrieving your data than working with it. The PBS
Accounting Toolkit solves this problem. It converts the PBS accounting
logs to XML, so administrators can use standard XML technologies for
querying and parsing, and it includes software that generates
high-quality PDF usage reports from the XML accounting data.
Installing the Software
Before installation, make sure you have Java 1.4 or higher installed.
You can type "java -version" at the Unix prompt to confirm that the
reported version number is 1.4.x or later. You can install the latest
Java 1.4 RPM from http://java.sun.com.
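The version check can also be scripted. The following is a minimal sketch rather than part of the toolkit: the function names are made up for illustration, and it assumes that "java -version" prints a line such as java version "1.4.2_04" on stderr.

```shell
#!/bin/sh
# Hypothetical helpers; not shipped with the PBS Accounting Toolkit.

# Succeed if a Java version string such as "1.4.2_04" is 1.4 or newer.
java_version_ok() {
    major=${1%%.*}          # text before the first dot, e.g. "1"
    rest=${1#*.}            # text after the first dot, e.g. "4.2_04"
    minor=${rest%%[!0-9]*}  # leading digits of the remainder, e.g. "4"
    [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "${minor:-0}" -ge 4 ]; }
}

# Print the quoted version from the first line of "java -version" output.
# Assumption: the output goes to stderr and quotes the version number.
installed_java_version() {
    java -version 2>&1 | sed -n '1s/.*"\(.*\)".*/\1/p'
}
```

A script could then run java_version_ok "$(installed_java_version)" and stop with an error message if the check fails.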
After confirming your Java installation, go to:
http://pbsaccounting.sourceforge.net
and obtain the latest version of the Accounting Toolkit. There are
two RPMs to install -- one is the XML conversion software, the other
is the Darkslide reporting package that I will discuss later.
XML Conversion
After installing the RPMs, you can convert the first accounting
file to XML. For example, to convert accounting data found in the
file /usr/spool/PBS/server_priv/accounting/20030523 to XML, type:
% cd /usr/spool/PBS/server_priv/accounting/
% mkdir xml
% pbstoxml 20030523 xml/20030523.xml
Figure 2 shows the resulting 20030523.xml file.
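The same conversion can be applied to a whole directory of accounting logs. This is only a sketch: the function name is hypothetical, it assumes the accounting files are named with eight-digit dates as in the example above, and it skips files that already have an XML counterpart rather than relying on how pbstoxml treats an existing output file.

```shell
#!/bin/sh
# Hypothetical helper: convert every YYYYMMDD accounting log in a directory,
# writing <name>.xml into its xml/ subdirectory.
convert_accounting_logs() {
    acct_dir=$1
    mkdir -p "$acct_dir/xml" || return 1
    for log in "$acct_dir"/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]; do
        [ -f "$log" ] || continue      # the glob matched nothing
        out="$acct_dir/xml/$(basename "$log").xml"
        [ -e "$out" ] && continue      # already converted; skip it
        # Same argument order as in the text: input log, then output file.
        pbstoxml "$log" "$out" || echo "conversion failed: $log" >&2
    done
}
```

For the directory used in this article, the call would be convert_accounting_logs /usr/spool/PBS/server_priv/accounting. Because existing output is skipped, a log that was converted while PBS was still writing to it must have its XML file removed before it is converted again.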
Converting the accounting data to XML opens up a powerful set
of XML tools for querying, parsing, converting, and storing the
data.
One example of leveraging an existing tool is using XPath to find
out how many nodes user rmach used. To do this, use the xpath command
supplied with the Perl XML::XPath module to run an XPath query against
the XML data. (You can think of XPath as something like SQL for XML;
see the References section for a good tutorial on XPath.)
% xpath "sum(//pbs_jobfile/execution_record/resource_list/nodes)" < 20030523.xml
Query didn't return a nodeset. Value: 1
In this case, the XPath query reports that one node was used. Note
that the query sums the nodes value of every execution record in the
file, so the result is specific to user rmach only when the file
contains no other users' jobs. Writing a Perl script would have given
the same answer, assuming it was coded correctly, but using XPath is
much faster and less prone to error because it requires no custom code.
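To total the nodes over several days, the same query can be run against each converted file and the results added up. The sketch below uses a hypothetical function name and assumes that the numeric result is the last field of the xpath command's output, as in the transcript above. Anything else is treated as an error.

```shell
#!/bin/sh
# Hypothetical helper: sum the nodes value across several converted files
# by running the XPath query from the text on each file in turn.
total_nodes() {
    query='sum(//pbs_jobfile/execution_record/resource_list/nodes)'
    total=0
    for xml in "$@"; do
        # Keep only the last field of the last output line, e.g. the "1"
        # in "Query didn't return a nodeset. Value: 1".
        n=$(xpath "$query" < "$xml" 2>&1 | awk 'END { print $NF }')
        case $n in
            ''|*[!0-9]*)
                echo "unexpected result for $xml: $n" >&2
                return 1 ;;
        esac
        total=$((total + n))
    done
    echo "$total"
}
```

For example, total_nodes xml/200305*.xml would print a single total for all files converted for May 2003.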
Darkslide Report Generation
The XML by itself isn't useful if you don't do something with
it, which is where the Darkslide Report Generator comes in. Darkslide
produces high-quality usage graphs and tables with which you can
easily visualize your accounting data (see Figure 3).
A high-level view of the Darkslide architecture is shown in Figure
4. XML accounting data is stored inside Xindice, a native XML database.
The database is accessed via XML:DB. The reporting engine issues
XPath commands against the database to gather data to produce the
PDF report output.
Darkslide Installation
The basic steps to getting up and running with the Darkslide Report
Generator are:
1. Install and configure Xindice.
2. Convert your OpenPBS accounting data to XML.
3. Load the converted XML accounting data into Xindice.
When this is complete, you will be able to generate the report.
Xindice Installation
Darkslide uses Xindice, a native XML database, to store the accounting
data. Here is the procedure (also documented in the Xindice admin
guide) for setting up the database in preparation for storing the
accounting data.
Download the package xml-xindice-1.0.tar.gz from
http://xml.apache.org/xindice and install it:
% gunzip xml-xindice-1.0.tar.gz
% tar xf xml-xindice-1.0.tar
% mv xml-xindice-1.0 /usr/local/
% nohup /usr/local/xml-xindice-1.0/start &
If everything went well, you should get a message saying "Server Running".
Set important environment variables that Xindice requires:
% export PATH=$PATH:/usr/local/xml-xindice-1.0/bin
% export XINDICE_HOME=/usr/local/xml-xindice-1.0
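Because later commands depend on these two variables, it can help to verify them before continuing. The function below is an illustrative sketch with a made-up name. It checks only that XINDICE_HOME points at an existing directory and that its bin subdirectory appears on PATH.

```shell
#!/bin/sh
# Hypothetical sanity check for the environment variables set above.
check_xindice_env() {
    home=${XINDICE_HOME:-}
    [ -n "$home" ] || { echo "XINDICE_HOME is not set" >&2; return 1; }
    [ -d "$home" ] || { echo "no such directory: $home" >&2; return 1; }
    # Surround PATH with colons so the first and last entries match too.
    case ":$PATH:" in
        *":$home/bin:"*) ;;
        *) echo "$home/bin is not on PATH" >&2; return 1 ;;
    esac
}
```

Running check_xindice_env in the shell that will issue the xindiceadmin commands reports the first problem it finds, or returns silently when both settings look right.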
Next, you can configure Xindice.
Configuring Xindice
At this point, you must create a "collection" in which to store
your XML accounting data. A collection is a storage location where
XML files will be stored. By convention, I just use the name of
the cluster that the accounting data is for. Remember what you use
here for the collection name, because later you will need to edit
the Darkslide configuration file to reflect the name of the collection
you chose. Here is an example of creating a collection for the cluster
named "examplecluster"; replace "examplecluster" with the name for
your cluster:
% xindiceadmin add_collection -c /db/ -n examplecluster
Created : /db//examplecluster
Configuring Darkslide
You will also need to edit the Darkslide configuration file QueryEngineConfig.xml
in /usr/local/darkslide-1.0/etc/. The most important setting is the
<collection> tag, which must contain the same collection name
you created in the "Configuring Xindice" step. In this case, you
would change it to "examplecluster". You may also want to edit the
<title> tag, which sets the title of the report.
If you have any problems, try changing the <debugging> tag
from SEVERE to FINEST to get copious amounts of debugging information.
The configuration file after editing for our purposes is shown in
Figure 5.
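A mismatch between the collection name in this file and the one created in Xindice is easy to overlook, so a quick check may be worthwhile. The sketch below uses a hypothetical function name and assumes that the <collection> tag and its value sit on a single line of the file. It does not parse XML in general.

```shell
#!/bin/sh
# Hypothetical helper: print the text inside the first
# <collection>...</collection> pair found in the given file.
# Assumption: the opening tag, value, and closing tag share one line.
config_collection() {
    sed -n 's:.*<collection>\(.*\)</collection>.*:\1:p' "$1" | head -n 1
}
```

Comparing the output of config_collection /usr/local/darkslide-1.0/etc/QueryEngineConfig.xml with the name given to xindiceadmin shows whether the two agree.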
Loading XML Accounting Data
You can use the bulkloader command included with the Accounting
Toolkit to load all the XML files into the Xindice collection you
created. For example, if all your XML data was converted with pbstoxml
into the directory /usr/spool/PBS/server_priv/accounting/xml/, you
would type this command:
% bulkloader --directory=/usr/spool/PBS/server_priv/accounting/xml/
Document /usr/spool/PBS/server_priv/accounting/xml/20030501 inserted
Document /usr/spool/PBS/server_priv/accounting/xml/20030502 inserted
Document /usr/spool/PBS/server_priv/accounting/xml/20030503 inserted
Document /usr/spool/PBS/server_priv/accounting/xml/20030504 inserted
Document /usr/spool/PBS/server_priv/accounting/xml/20030505 inserted
To verify the documents were loaded properly, use the following Xindice
command to query the files in the collection named "examplecluster":
% xindiceadmin ld -c /db/examplecluster
The filenames that were just loaded into the collection should be
listed as output.
Generating the Report
It's finally time to produce some reports. To generate a report
covering 04-10-2003 through 04-29-2003 and save it in a file
called /tmp/test.pdf, type the following command:
/usr/local/darkslide-1.00/bin/reportgenerator \
--startdate=04-10-2003 --enddate=04-29-2003 --filename=/tmp/test.pdf
You can now use your favorite PDF viewer to view the report.
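The accounting files are named in YYYYMMDD form, while the report options above take dates as MM-DD-YYYY. When a report is driven from file names, a small conversion helper avoids retyping dates. This is a sketch with a made-up function name.

```shell
#!/bin/sh
# Hypothetical helper: convert an accounting file name (YYYYMMDD) to the
# MM-DD-YYYY form used by the --startdate and --enddate options above.
to_report_date() {
    case $1 in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "expected YYYYMMDD, got: $1" >&2; return 1 ;;
    esac
    yyyy=${1%????}   # first four characters: the year
    mmdd=${1#????}   # last four characters: month and day
    echo "${mmdd%??}-${mmdd#??}-$yyyy"
}
```

For example, --startdate="$(to_report_date 20030410)" passes 04-10-2003 to the report generator.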
Daily Updates
To load the daily XML accounting data you converted with pbstoxml
into Xindice, simply create a daily cron job. The cron job should
load the XML document into Xindice using the Xindice command shown here:
xindice add_document -c /db/examplecluster -f \
/usr/spool/PBS/server_priv/accounting/xml/filename -n filename
where filename is the name of the accounting file, such as 20030523,
and examplecluster is the collection created earlier.
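The conversion and the load can be combined in one function that a cron script calls. The following is a sketch under stated assumptions, not a script shipped with the toolkit: the function name and the ACCT_DIR variable are hypothetical, the XML file is given the .xml suffix used in the conversion example earlier, and the date to load is passed in by the caller.

```shell
#!/bin/sh
# Hypothetical helper for a daily cron job.
#   $1 = accounting file name (YYYYMMDD)
#   $2 = Xindice collection, e.g. /db/examplecluster
# ACCT_DIR (an assumed variable) overrides the accounting directory.
load_accounting_day() {
    day=$1
    collection=$2
    acct_dir=${ACCT_DIR:-/usr/spool/PBS/server_priv/accounting}
    mkdir -p "$acct_dir/xml" || return 1
    # Convert the day's log, then add the result under the key "$day".
    pbstoxml "$acct_dir/$day" "$acct_dir/xml/$day.xml" || return 1
    xindice add_document -c "$collection" \
        -f "$acct_dir/xml/$day.xml" -n "$day"
}
```

A cron script would normally pass the previous day's date, since the current day's log is still being written. With GNU date, that value can be obtained with date -d yesterday +%Y%m%d. Other date implementations need a different invocation.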
Conclusion
Although XML technologies have a bit of a learning curve at first,
the time invested in learning them pays dividends
in the long run. Leveraging tools provided in the PBS Accounting
Toolkit gives you the power to visualize accounting information
quickly and easily, harnessing the power of XML tools and technologies
to give you the decision-making information you need.
References
PBS Accounting Toolkit -- http://pbsaccounting.sourceforge.net
OpenPBS -- http://www.openpbs.org
Xindice -- http://xml.apache.org/xindice
XML tutorials -- http://www.w3schools.com
Rodney Mach is President of Fathom5 Consulting (http://www.fathom5consulting.com),
a technology firm specializing in providing fast affordable custom
software solutions. He can be reached at: rmach@fathom5consulting.com.