Tales
from the Abyss: UNIX File Recovery
Liam Widdowson and John Ferlito
It is every systems administrator's nightmare -- an important
file has been accidentally deleted, falling into the deep abyss
where bits and bytes go to die. Typically, this situation presents
an inconvenience rather than a tragedy as the files can be recovered
from regular backup media.
However, if a systems administrator or developer has significantly
modified the file in question since the time of last backup then
a large amount of data may be lost. There are also other situations
that systems administrators may face, such as incorrectly configured
backup scripts, backup hardware failure, or plain old bad luck.
Nothing can replace a proper backup strategy, but this article will
outline how whole or partial files can be restored directly from
the UNIX file system.
The UNIX File System
Files in UNIX file systems are logical containers of data. Each
file has an inode (index-node) structure associated with it that
contains meta-data such as the physical disk blocks the file is
stored on, the file owner, permissions, size, etc. [1]. When files
are removed, the inodes are not erased from disk but are marked
as free. The actual data contained in the files is still on disk
and can potentially be retrieved before being over-written with
new data.
A Tale from the Abyss
Some time ago, a graduate in my team was tasked with developing
a Perl CGI-based provisioning and support interface to a product
we were developing. Earlier on in the project I showed her how she
could use UNIX pipes and redirection to debug her Perl script from
a shell rather than exclusively through the Web server. Unfortunately,
one day she accidentally typed the following:
$ ./nph-www.pl > nph-www.pl
That command resulted in the Perl code being turned into a zero-byte
file. Unfortunately, she had not checked the file into CVS for a week,
which meant around 600 lines of code had been lost. When she told
me what had occurred, I explained that it wasn't a problem because
we could retrieve the file from last night's backup and she would
only have to re-create today's changes.
I phoned our sysadmin and asked him to restore the file in question.
However, I did not get the response I hoped for -- the sysadmin
said he had never heard of the development server in question and
so it was not being backed up.
The graduate sighed and said that it was impossible to restore
the deleted file from the UNIX file system, so she would spend the
weekend re-coding the script. I told her that it was indeed possible,
and that I'd have her file restored in a few hours. Note that
this type of bad luck does not only occur to inexperienced administrators.
A number of years ago, I accidentally typed crontab -d instead
of crontab -e on a Linux system. How many readers back up
/var/spool/cron/crontabs?
UNIX File Recovery
As soon as any files have been accidentally erased, all I/O activities
on the relevant partition should cease. The partition should be
unmounted, or the system should be placed into single-user mode
immediately after the incident. If this is not possible (e.g., the
root file system cannot be unmounted), then all logged-in users
should cease work, and new logins should be disallowed (e.g., with
an /etc/nologin file). It is also possible on some UNIX variants
to remount the partition read-only. For example:
# mount -o ro,remount -n /home
The key is to prevent other processes from overwriting the disk blocks
or inodes previously used by the erased file. This is most pertinent
when the partition is almost full, because it is more likely that
the deleted inodes will be reused. Alternatively, operations can be
performed directly on the partition while mounted read/write but,
as mentioned previously, this reduces the chance of recovery.
The following sections outline how files can be restored from
UNIX filesystems -- the first section describes how a file with
known content can be restored from almost any UNIX variant's
file system, while the second section is specific to restoring files
from the Linux ext2 filesystem.
Known Content File Recovery Process
Once the system is in a safe state, a raw copy of the partition
containing the erased file should be made with the dd(1)
command. Consider that the /etc/passwd file has been accidentally
removed. The following example illustrates the procedure required
to create a copy of the root partition and place it in a file on
the /export partition:
# df -k / /export
Filesystem kbytes used avail capacity Mounted on
/dev/dsk/c0t3d0s0 123231 18413 92495 17% /
/dev/dsk/c1t0d0s0 17498618 14327901 3170717 82% /export
# dd if=/dev/dsk/c0t3d0s0 of=/export/recover.dsk
263088+0 records in
263088+0 records out
# ls -l
-rw-r--r-- 1 root other 134701056 Jul 1 16:54 recover.dsk
The target partition must have sufficient space to hold the entire
slice (in this case ~128 MB). If sufficient space does not exist on
the system in question, then alternative measures, such as NFS mounting
a remote partition or performing operations directly on the partition
device file, can be considered.
Once a copy of the file system has been made, the system can be
brought back into multi-user mode, and normal system operation can
continue while the recovery progresses. If a copy of the partition
cannot be made, leave the system in a safe state and simply substitute
the device name (e.g., /dev/dsk/c0t3d0s0) for the filename
(i.e., /export/recover.dsk) as in examples below.
The procedure that follows is more art than science. Consider
the example of the erased passwd file. The left-hand side of the
following command prints each "line" of the disk with
cat(1) (the -n option prints the line number). The
output is piped to fgrep(1), which performs a "fast"
regular expression search for the "root" entry in the
passwd file:
# cat -n recover.dsk | fgrep "root:x:0:1"
200600 root:x:0:1:Super-User:/:/sbin/sh
202098 root:x:0:1:Super-User:/:/sbin/sh
332802 Ë1 root:x:0:1:Super-User:/:/sbin/sh
fgrep may not match the supplied pattern. This probably means
that the file contents have been overwritten and that recovery is
not possible. Otherwise, the pattern can be further generalized in
the hope of increasing the possibility of a match.
In the above example, there are three places on the disk where
versions of /etc/passwd file have been stored. The GNU version
of grep(1) provides the -A and -B options that
allow a number of lines before/after a particular match to be displayed.
This assists in retrieving the entire contents of the erased file.
If GNU grep(1) is not installed on your system, the C program
(seekcat) in Listing 1 is provided to print out the entire
file from a particular byte or line offset. The seekcat program
takes two arguments: the first is either a -b or -l
flag followed by an integer that specifies a byte or line offset
where the listing should commence. The second argument is a -f
flag followed by the filename, which in this case is the raw disk
image. For example:
# fgrep -A 10 "root:x:0:1" recover.dsk > passwd
# cat passwd
root:x:0:1:Super-User:/:/sbin/sh
daemon:x:1:1::/:
bin:x:2:2::/usr/bin:
...
The above functionality can be emulated by providing the starting
line number to seekcat. The line number argument is derived
from the numbers in the "cat -n" output described earlier
in this section. Essentially, it instructs seekcat to begin
printing the file from where a particular match was found. In the
following example, ten lines of the raw disk image will be printed
from the first match (line 200,600) onwards. If this did not yield
suitable output, then the next match (line 202,098) would be specified:
# seekcat -l 200600 recover.dsk | head -10 > passwd
# cat passwd
root:x:0:1:Super-User:/:/sbin/sh
daemon:x:1:1::/:
bin:x:2:2::/usr/bin:
...
Unfortunately, parts of a file sometimes may be scattered in a non-contiguous
manner over the entire partition. In that case, the above procedure
should be repeated with suitable match patterns to obtain the entire
contents of the file. In most situations (including the erased Perl
script), the entire file tends to be stored in a contiguous fashion
that facilitates easy restoration. The most difficult part is determining
which version stored on disk is the latest file. Unfortunately, this
can only be determined by visual inspection of the content. Fortunately,
CVS or RCS headers that include version numbers, dates, and other
well-known contents can aid this process.
Linux File Recovery Process
The following recovery process covers the case where files have
been deleted using the rm(1) command on a Linux system. In
most cases, this method will result in a perfectly recovered file
and does not rely on trial and error. The inherent advantage of
this method is that binary files or files with unknown content can
be recovered with ease.
The tool used during this process is known as the file system
debugger, debugfs(8), which is used to examine and change
the state of a file system. The examples below are specific to Linux
but should in theory be possible on any UNIX-based file system given
a tool with the functionality provided by debugfs(8). If
such a tool is not available then the known content method of recovery
should be used.
At this point, it is worth mentioning that debugfs is a powerful
but extremely dangerous tool. It provides raw access to the file
system so care must be taken throughout its use.
debugfs(8) provides a shell-like interface that has three
commands that are of interest -- lsdel, cat, and
dump. debugfs(8) can be run on the required partition
as follows:
# debugfs /dev/hda6
debugfs 1.19, 13-Jul-2000 for EXT2 FS 0.5b, 95/08/09
debugfs:
At the prompt, lsdel will give a list of all the deleted inodes
on the filesystem. This will take some time as all the inodes on the
partition need to be scanned:
debugfs: lsdel
1844 deleted inodes found.
Inode Owner Mode Size Blocks Time deleted
749300 1000 100664 27018 2/ 7 Tue May 9 19:08:17 2000
749301 1000 100444 1671 1/ 1 Tue May 9 19:08:17 2000
...... .... ...... .... .. ..........................
944887 1037 100600 597 1/ 1 Sat May 26 18:00:00 2001
717281 1000 100400 1 1/ 1 Sat May 26 18:08:13 2001
32605 1000 100644 15 1/ 1 Sat May 26 18:09:06 2001
Because the list is typically large, it may be useful to redirect
the output to a file that may later be examined with an editor or
pager as follows:
# echo lsdel | debugfs /dev/hda6 > /tmp/lsdel-output
From the information given in the lsdel listing, it should
be possible to narrow down which deleted inodes are relevant. The
most useful fields in the output are -- Owner, Size, Mode, and
Time deleted. If no further disk operations occurred after deleting
the file(s) then the required inodes should be those at the end of
the list. The owner field is the uid of the owner of the file, which
can be found in the third field of /etc/passwd. In this example,
the file was owned by the user johnf, uid 1000, and contained the
string "important_data".
From the listing above, inode 32605 is the last entry and thus
a good candidate for initial investigation. The contents of the
inode can be listed using the following command:
debugfs: cat <32605>
important_data
The deleted file has been found. The dump command can now be
used to write the file out to disk:
debugfs: dump -p <32605> /tmp/recovered_file
The -p ensures that the same owner, group, and permissions
are retained for the file.
This approach is sufficient if a single file has been deleted
but can be tedious if a large number of files must be restored.
Fortunately, Tom Pycke has written a utility called recover [2]
that automates file recovery. The latest version of recover can
be retrieved from:
http://recover.sourceforge.net/linux/recover/download.php3
The installation of recover is fairly straightforward:
# tar zxf recover-1.3.tar.gz
# cd recover-1.3
# make
# make install
Recover is installed into the /usr hierarchy by default. Please
read the README file for instructions on installing into a different
directory hierarchy.
Recover asks a number of simple questions such as:
- Who is the owner of the files?
- When were the files deleted?
- What is the approximate size of the files?
Using this information, it executes debugfs to recover
the inodes that match the given criteria and places them in a user-specified
directory.
Unfortunately, it is not possible to recover filenames in addition
to content and results in a directory full of files named dumpinode-number.
If a directory such as /etc was actually removed, this could
amount to hundreds of files. At this stage, simple UNIX utilities
can be used to sort through the recovered files and the two most
useful tools are strings(1) and file(1). strings(1)
displays sequences of ASCII characters for a given input file and
is useful for extracting text from an otherwise undecipherable binary
file. file(1) is able to determine the type of a file (i.e.,
whether it is a JPEG image or a postscript file). It does this by
performing a set of magic number tests to uniquely determine the
file type.
For example, the output of file(1) in a directory filled
with recovered files is as follows:
# file *
dump34578: directory
dump98568: PGP armored text signed message
dump896545: gzip compressed data, deflated, last modified: Sun
Jan 28 03:31:21 2001, os: Unix
dump78890: ASCII text dump67245: 'diff' output text
dump76345: JPEG file dump9723: MPEG 1.0 layer 3 audio stream
data, 128 kBit/s
dump8976: ASCII C program text
dump57654: Bourne shell script text executable
dump3463: troff or preprocessor input text
dump56789: ELF 32-bit LSB executable, Intel 80386, version 1,
dynamically linked (uses shared libs), stripped
dump9876: ELF 32-bit LSB shared object, Intel 80386,
version 1, stripped
At this point a simple shell script to sort files of different types
by adding file extensions would be useful. For example:
# for i in 'file * | grep "ASCII C program text" | awk -F: '{print $1}'`;
do mv $i $i.c; done
Once sorted, it is time to begin the process of trying to establish
the identity of each file. For text, C, image, and audio files this
is a fairly straightforward process as each file can be visually inspected
with the appropriate tools and the original file name can be guessed.
However, binary files such as executables, libraries, and database
files are more difficult to inspect.
In this situation, the strings(1) utility can be used to
print out any ASCII text strings in a binary file. For example:
# strings dump45678.bin
/lib/ld-linux.so.2
__gmon_start__
libstdc++-libc6.2-2.so.3
_DYNAMIC
__rtti_user
....
/build/buildd/groff-1.16/src/include/stringclass.h
groff.cc
GROFF_COMMAND_PREFIX
troff
....
From the above output, a guess can be made that the file is the groff(1)
executable. Executing the file with a --help argument confirms
this. Libraries are a little more difficult because they cannot be
executed, however, the objdump(1) command can provide suitable
assistance. For example:
# objdump -p dump34756.lib | grep SONAME
SONAME libmenu.so.4
The aforementioned method is only useful if the file has been deleted
through use of a utility such as rm(1) or the unlink(2)
function. If the file has been overwritten, then the inodes that originally
contained the data have probably been overwritten. It is still possible
that some fragments of the data are held elsewhere on disk. For example,
many editors make temporary copies of files that later get deleted,
or perhaps not all the inodes in the file have been overwritten. If
the inodes are lost, then the known content recovery method already
described will be more applicable.
Conclusion
No file recovery method can take the place of thorough, regular,
and reliable system backups. The processes described above are designed
to provide last-ditch assistance to systems administrators perched
on the edge of an abyss.
References
1. Vahalia, U. 1996, UNIX Internals -- The New Frontiers.
Prentice-Hall, Upper Saddle River, NJ
2. Pycke, T., 2001, Recover. http://recover.sourceforge.net/linux/recover/
Liam Widdowson is a consultant at Hewlett-Packard. He can be
contacted at: lbw@telstra.com.
John Ferlito is a senior engineer at Bulletproof Networks.
He can be contacted at: johnf@bulletproof.net.au.
The views expressed in this article do not necessarily represent
those of the authors' employer.
|