Where Did That Core File Come From?
Chris Hare
Whenever UNIX encounters a fatal error, it dumps an
image of the halted
process into a file named core. Many different types
of errors
can cause a core dump. The most common are memory violations,
illegal
instructions, bus errors, and user-generated quit signals.
Core files
can be nasty disk consumers, especially since many core
files are
essentially worthless -- the product of quit signals
from over-zealous
users (who think the system is hung if their 100-megabyte
file doesn't
sort instantly).
However, some core files are extremely important. To
a working developer,
the core file holds a vast wealth of debugging information:
by examining
the core file, he can find out exactly what his program
was doing
when it was halted.
For the most part, however, when we system administrators find these
annoying core files, we have no idea what caused them and thus no
reliable basis for deciding whether to delete them.
In 1988, a program called psc appeared on USENET. This
tool
looks inside the core file and extracts information
about the core
dump, including important clues to the origin of the
core file. This
article explains psc and the related file structures.
For each process, UNIX maintains a per-process user area, which
contains the user environment. This environment defines the process
and is saved as part of a core dump. The user area also includes the
registers as they were at the time of the fault. The actual size of
the user area is implementation-dependent, but is defined in the
system include file /usr/include/sys/param.h. The remainder of the
core file represents the actual contents of the user's memory when
the image was written to disk.
The system header file /usr/include/sys/user.h describes
the
per-process user area, and /usr/include/sys/reg.h gives
the
location of the register values.
The program is fairly straightforward, amounting to only 81 lines
(Listing 1).
By default, psc opens a file named "core" if it exists in the
current directory. Alternatively, the user may give a path on the
command line to specify which file is to be examined. psc opens the
file and loads the per-process user structure with the data from the
file's user area.
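The open-and-read step can be sketched in a few lines of C. This is
not Listing 1: demo_user is a hypothetical stand-in for the real
struct user from /usr/include/sys/user.h, whose layout differs from
system to system, and pick_core_path and load_user_area are names
invented for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct user. The real structure comes from
 * <sys/user.h> and is much larger; these fields are assumptions. */
struct demo_user {
    unsigned short ruid;   /* real user ID */
    unsigned short euid;   /* effective user ID */
    char comm[14];         /* command name */
};

/* Use the first command-line argument if one was given, else "core". */
const char *pick_core_path(int argc, char **argv)
{
    return (argc > 1 && argv[1] != NULL) ? argv[1] : "core";
}

/* Read the leading user area of the core file into *u.
 * Returns 0 on success, -1 if the file cannot be opened or is too short. */
int load_user_area(const char *path, struct demo_user *u)
{
    FILE *fp;
    size_t got;

    assert(path != NULL && u != NULL);
    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    memset(u, 0, sizeof *u);
    got = fread(u, sizeof *u, 1, fp);
    fclose(fp);
    return (got == 1) ? 0 : -1;
}
```

A caller would pass argc and argv from main to pick_core_path, then
hand the resulting path to load_user_area before printing any fields.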
Figure 1 shows a typical psc report. The effective and
real
user ID numbers tell which user was running the process
when the process
died. Note that if the effective user ID is different
from the real
user ID, a core image will not be generated when the
fault occurs.
The "process times" section reports the parent
and child process
times which had accumulated prior to the dump. The user
time indicates
the amount of time which was spent operating in user
mode (as opposed
to kernel or system mode).
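Times in the user area are typically kept as clock ticks rather than
seconds, so a report has to scale them by the system clock rate. The
sketch below shows one way to do that conversion. The function names
are invented here, and the clock rate is passed in as a parameter
because its real value is system-specific.

```c
#include <assert.h>
#include <stdio.h>

/* Convert a tick count to hundredths of a second.
 * hz is the clock rate in ticks per second (system-specific). */
long ticks_to_centisecs(long ticks, long hz)
{
    assert(hz > 0 && ticks >= 0);
    /* Split whole seconds from the remainder to limit overflow. */
    return (ticks / hz) * 100 + (ticks % hz) * 100 / hz;
}

/* Print one line of a "process times" style report. */
void print_time(const char *label, long ticks, long hz)
{
    long cs = ticks_to_centisecs(ticks, hz);

    printf("%-12s %ld.%02ld s\n", label, cs / 100, cs % 100);
}
```

With a clock rate of 100 ticks per second, 150 ticks comes out as
1.50 seconds; at 60 ticks per second, 90 ticks gives the same result.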
The "process misc" section gives the tty major
and
minor numbers and the address of the process structure
for the process
which created this core dump. On its own, the process
structure address
may not be useful, but armed with this information and
a debugger,
you can peruse the entire process slot entry. The controlling
tty
major and minor numbers tell who was executing the program,
and from
what terminal.
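The major and minor numbers are normally packed into a single device
number. The sketch below assumes a 16-bit device number with the
major number in the high byte and the minor number in the low byte, a
layout common on older systems. Real systems supply their own macros
for this, so demo_major and demo_minor are illustrative names only.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed layout: 16-bit device number, major in the high byte,
 * minor in the low byte. Check the local headers for the real macros. */
unsigned demo_major(unsigned dev) { return (dev >> 8) & 0xffu; }
unsigned demo_minor(unsigned dev) { return dev & 0xffu; }

/* Print the controlling tty in "process misc" style. */
void print_tty(unsigned dev)
{
    assert(dev <= 0xffffu);
    printf("tty major %u, minor %u\n", demo_major(dev), demo_minor(dev));
}
```

Under this assumption, a device number of 0x0503 decodes to major 5,
minor 3, which can then be matched against the entries in /dev.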
The IPC section reports on the active interprocess communication
locks.
A value of "proc" in this section indicates
that the process
was locked into RAM, "text" indicates that
the text portion
was locked, and "data" indicates that the
data portion was
locked. Unless the offending program specifically asked
the kernel
to lock its text or data area, this section will report
"unlocked."
The FILE I/O section describes the I/O parameters which were active
when the program crashed. The base I/O address points to the I/O
control structure. "Offset" is the file offset at the time. "Bytes
remaining" reports how much data was left to transfer when the
process aborted.
This section also reports what umask was being applied
to files which
were created by this program. The ulimit value indicates
the maximum
file size (in blocks) which could be created by this
application.
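To make the umask and ulimit figures concrete, the sketch below shows
how a umask trims the permissions requested for a new file and how a
ulimit given in blocks translates to bytes. The 512-byte block size is
an assumption, and the function names are invented for this example.

```c
#include <assert.h>
#include <stdio.h>

#define DEMO_BLOCK_BYTES 512L  /* assumed size of one ulimit block */

/* Convert a ulimit expressed in blocks to bytes. */
long ulimit_to_bytes(long blocks)
{
    assert(blocks >= 0);
    return blocks * DEMO_BLOCK_BYTES;
}

/* Permission bits a new file ends up with: the requested mode with
 * every bit set in the umask cleared. */
unsigned apply_umask(unsigned requested, unsigned mask)
{
    return requested & ~mask & 0777u;
}

/* Print the two values in report style; umask is shown in octal. */
void print_file_limits(unsigned mask, long blocks)
{
    printf("umask  %03o\n", mask);
    printf("ulimit %ld blocks (%ld bytes)\n", blocks, ulimit_to_bytes(blocks));
}
```

For instance, a umask of 022 turns a requested mode of 0666 into
0644, and a ulimit of 2048 blocks corresponds to 1,048,576 bytes
under the 512-byte assumption.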
The "accounting" section reports which process created this core
file, how much memory was used by the process, whether the process
was created via a fork or an exec system call, and when the process
started.
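The fork-or-exec distinction is usually recorded as a flag bit in the
accounting data. The sketch below assumes a single bit, modeled on the
AFORK flag that many systems define in their accounting header,
meaning the process forked but never called exec. DEMO_AFORK and the
function names are illustrative; check the local headers for the real
definitions.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed flag bit: set when the process was created by fork and
 * has not since called exec. */
#define DEMO_AFORK 0x01u

/* Returns 1 if the flag word says "forked, no exec", else 0. */
int created_by_fork(unsigned acflag)
{
    return (acflag & DEMO_AFORK) != 0;
}

/* Print an "accounting" style summary. The unit of mem is whatever
 * the local system records; this sketch does not interpret it. */
void print_accounting(const char *comm, unsigned acflag, long mem)
{
    assert(comm != NULL);
    printf("command: %s\n", comm);
    printf("memory:  %ld\n", mem);
    printf("created: %s\n", created_by_fork(acflag) ? "fork" : "exec");
}
```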
Conclusion
Though psc is a simple program, it is an important system
administrator's
tool. The information psc reports can prove invaluable
in identifying
the cause of a core dump and in determining whether
a particular core
file needs to be preserved.
About the Author
Chris Hare is Ottawa Technical Services Manager for
Choreo
Systems, Inc. He has worked in the UNIX environment since 1986 and
in 1988 became one of the first SCO authorized instructors in Canada.
He teaches UNIX introductory, system administration,
and programming
classes. His current focus is on networking, Perl, and
X.