PROBLEM SOLVER
Dealing with disaster
by Bob Toxen
Sooner or later every UNIX system will sustain file system damage. It may
be caused by a power failure, a hardware or software failure or an operator
error. The chances of these occurrences may be reduced by using proper
techniques. Preparing for the inevitable reduces the impact of the damage.
When damage does occur, moreover, the proper response may minimize data loss
and prevent further damage. File system damage is like a cancer: unless it is
stopped, it will grow and destroy more and more files.
PREVENTION
A kilobyte of prevention is worth a gigabyte of cure. If power lines are
unreliable or noisy, or if your equipment or data is particularly sensitive,
then investing in an uninterruptible power supply is well worth the
dollar-per-watt cost. It also pays to keep equipment properly maintained.
Be very careful with the files needed for booting. Other system files, too,
should be handled with care. Removing /dev/console or
accidentally entering:
# chmod 666 / usr/file
instead of:
# chmod 666 /usr/file
can be disastrous. The former will instantly render the root file system
unusable and unbootable, since it takes execute (directory search) permission
away from the entire file system -- except for references relative to the
current directory that do not go through the root directory.
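The damage done by that stray space is easy to test for. The sketch below assumes a POSIX shell and an ls -ld that prints the usual mode string (such as drwxr-xr-x); dir_searchable_by_all is a hypothetical helper name, not a standard command. It only reports whether a directory still grants search permission to everyone, and changes nothing.

```sh
# Report whether a directory grants search (execute) permission to
# "other" users, judging by the mode string printed by ls -ld.
# dir_searchable_by_all is a hypothetical helper, not a standard tool.
dir_searchable_by_all() {
    # First field of ls -ld, e.g. "drwxr-xr-x" or "drw-rw-rw-"
    mode=$(ls -ld "$1" | awk '{print $1}')
    # The tenth character is the "other" execute bit;
    # x or t means the directory can be searched.
    other_x=$(printf '%s' "$mode" | cut -c10)
    case "$other_x" in
        x|t) return 0 ;;
        *)   return 1 ;;
    esac
}
```

A possible use is `dir_searchable_by_all / || echo "root directory is not searchable"`, run right after any chmod typed as root.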
Make sure the people who run the system know how to reset it, know how to turn
it off, and understand the tradeoffs of connecting other equipment to
the same electrical circuit (which can cause electrical noise or an overload).
Ignorance is not bliss but an accident waiting to happen.
MINIMIZING THE IMPACT
The best and simplest way to minimize the impact of a crash is to perform
frequent file system backups. Backups should usually be done every day or two
and at least weekly (unless the data is static). Users should be encouraged to
copy their valuable data to a tape (or floppy), another system, a
different disk or, if necessary, the same disk. Backup tapes (or floppies
or disks) should be read periodically to make sure they are readable. Some
people have learned this lesson the hard way. Alternate between at least two
tapes (or sets of floppies) in case the system crashes in the middle of a
backup, destroying both the disk and tape. Store some backups off-site to
guard against fire, earthquake and sabotage.
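Reading a backup back can be made part of writing it. The following is a minimal sketch, assuming tar is the backup program and that the archive is a regular file or a tape device of your choosing; backup_and_verify is a hypothetical helper name. Listing the archive is only a readability check: it does not compare the saved data against the originals.

```sh
# Write a backup of a directory, then list the archive to confirm that
# it can be read back. backup_and_verify is a hypothetical helper; the
# archive argument stands in for your tape device or backup file and
# must not live inside the directory being saved.
backup_and_verify() {
    src_dir=$1
    archive=$2
    # Create the archive with paths relative to the source directory
    ( cd "$src_dir" && tar cf - . ) > "$archive" || return 1
    # Listing walks the archive from start to end; tar exits non-zero
    # if the archive cannot be read.
    tar tf "$archive" > /dev/null || return 1
}
```

To alternate between two tapes, run it against a different tape (or set of floppies) on alternate backups.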
Make provisions for easy recovery in the event the system will not boot. One
method is to have some way of booting off a different disk. Another common
method is to provide a way to back up the disks with standalone utilities that
can be booted in place of the default UNIX kernel. You could also provide
a way to overwrite the disk with a bootable UNIX system and essential
standalone utilities that should be bootable from tape (or floppy). There
are 10 files needed for UNIX to boot, including:
/unix (name may vary)
/dev/console
/dev/md0a (name may vary)
/dev/swap
/etc/init
/etc/inittab (System III & System V)
/etc/rc (System III)
/bin/sh
/bin/csh (Some configurations)
/bin/su (System V)
Also, all directories leading to these files must be readable and executable
by all. Some versions and implementations will need different files. One way
to find these files is to reboot the system (after properly shutting down) and
issue the command:
# ls -lut / /bin /dev /etc
as soon as you get a single user prompt. This will list the files in the
specified directories with the time they were last accessed (read, written or
executed as a program) -- sorted by access time. Those files with an access
time after the time the system was shut down are probably those needed for
booting. For systems with several disks, these critical files should be
duplicated on a second disk, and the capability of booting from that disk
should be provided. In most implementations, one can boot off any disk or
tape.
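Once you know which files your system needs, it is cheap to confirm from time to time that they are all still present. The sketch below assumes a POSIX shell; check_boot_files is a hypothetical helper, and the list you pass it should be the one found on your own system rather than the sample list above. It checks existence only, not permissions.

```sh
# Report any file in the argument list that does not exist.
# Returns success only if every file is present.
# check_boot_files is a hypothetical helper name.
check_boot_files() {
    missing=0
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}
```

For example, `check_boot_files /unix /dev/console /etc/init /bin/sh` checks four of the files listed above. The same check can be run against the duplicate copies on a second disk by passing their paths instead.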
SHUTTING DOWN THE SYSTEM
1. Make sure everyone is logged off, including those on dialups and nets.
2. Make sure that printers, tape drives and other peripherals are inactive.
3. Make sure UUCP and similar networking programs are inactive.
4. Make sure various daemons such as mailers, news and networks are inactive.
5. Take the system down to single user mode.
6. Do a ps and kill any process besides process 1, your
   shell and ps. Do another ps to verify that
   they all went away. A kill -9 may be needed. Don't worry
   about gettys that do not go away. This is a harmless problem
   caused by a defective tty driver. International Technical
   Seminars offers an excellent class on how to write drivers correctly.
7. Issue a sync command.
8. Turn off the system or press the reset button.
People who have the reboot (or facsimile) program may use it
in place of steps 5 through 8.
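The ps-and-kill step can be made less error-prone by having the shell pick out the candidate processes. This is a sketch only, assuming a POSIX shell and input lines of the form "PID COMMAND"; list_stray_pids is a hypothetical helper name. It prints process IDs and kills nothing, so the list can be reviewed first.

```sh
# Print the PIDs that are candidates for kill: everything except
# process 1 and the PIDs given as arguments (for example, your shell).
# Reads "PID COMMAND" lines on standard input.
# list_stray_pids is a hypothetical helper name.
list_stray_pids() {
    keep=" 1 $* "
    while read -r pid cmd; do
        [ -n "$pid" ] || continue        # skip blank lines
        case "$keep" in
            *" $pid "*) ;;               # process 1 or an excluded PID
            *) echo "$pid" ;;
        esac
    done
}
```

On a system whose ps accepts the POSIX -o option, `ps -e -o pid= -o comm= | list_stray_pids $$` lists everything but process 1 and the current shell. The ps process itself may appear in the list, but it will already have exited.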
WHAT FILE SYSTEM DAMAGE MEANS
In addition to the data in everyone's files, UNIX must keep track of the names
of files, their permissions, ownership, time of last modification, links,
directories, unused disk portions (free blocks and free inode numbers), counts
of files and counts of blocks of data that will fit in each file system -- as
well as an assortment of other concerns. When changes are made to such
things, they are not immediately written out to disk but instead are kept in
memory. If anyone wants to read any of this changed data, UNIX knows to use
the copy in memory rather than the old copy on disk. Likewise, any new
changes will affect the copy in memory.
Some portions such as the superblock, which keeps track of free blocks, free
inodes and such, and the /tmp directory, which contains the
temporary files used by editors and compilers, change often. Other areas,
such as the /bin directory, do not change often but are read
often (every time you execute a program). By keeping this rapidly changing or
frequently read data in memory rather than having to read it continually from
disk and write it back, UNIX runs much faster than it would otherwise.
This buffer area in memory has a fixed size. If there isn't room to
fit in some new data that someone wants to read from or write to a disk, then
a portion of the buffer will be written to disk -- if a change has been made.
That portion of the buffer then will be available for new data.
There is usually some changed data sitting in the memory buffer waiting to be
written to disk. UNIX is in no hurry to write this data to disk. Why should
it? If anyone wants it, they can get the memory copy of it. The only problem
is that if the system crashes, the disk will contain some old data.
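The behavior just described can be imitated with a toy model. In the sketch below, two temporary directories stand in for memory and disk, and each file in them is one "block". Every name (MEM, DISK, blk_write and so on) is hypothetical and chosen for illustration; this is not how the kernel is written, only a way to watch stale data appear after a crash.

```sh
# Toy model of a write-back buffer cache. MEM and DISK are temporary
# directories standing in for memory and disk. All names are hypothetical.
cache_init() { MEM=$(mktemp -d) && DISK=$(mktemp -d); }

# A write changes only the in-memory copy.
blk_write() { printf '%s\n' "$2" > "$MEM/$1"; }

# A read prefers the in-memory copy and falls back to the disk copy.
blk_read() {
    if [ -f "$MEM/$1" ]; then
        cat "$MEM/$1"
    elif [ -f "$DISK/$1" ]; then
        cat "$DISK/$1"
    fi
}

# sync: copy every buffered block out to disk.
cache_sync() {
    for b in "$MEM"/*; do
        [ -f "$b" ] && cp "$b" "$DISK/"
    done
    return 0
}

# crash: memory contents vanish; disk keeps whatever was last synced.
cache_crash() {
    [ -d "$MEM" ] && rm -f "$MEM"/*
    return 0
}
```

Write a block, call cache_sync, write the block again and then call cache_crash: a later blk_read returns the older, synced value, which is the "old data" problem in miniature.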
If some of this old data records whether a particular block of data is free or
belongs to a file, lists where the data for a particular file is kept on disk,
or lists the files in a directory, then UNIX will be confused when it is
rebooted.
Suppose you just created a file with vi. Imagine that the block on the disk
that records the place where this file's data is kept (its inode) is written
to disk and that the actual data blocks are also written to disk. If the
system then crashes before the data block of the directory in which the file
was created is written, the inode will be marked as allocated to a file
(rather than being unused), but the directory on disk will have no entry for it.
If you then reboot, you
will not be able to access that file because its name is not in the directory.
Also, if you create another file, it may use the same blocks that were used by
your first file, since the free list on disk may still show them as available,
destroying the first file's data. This is why, when
rebooting after a crash, fsck must be invoked
immediately -- before the file system is changed further.
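The "allocated but not in any directory" case is the kind of inconsistency a checker looks for. The sketch below is a toy version of that one check, assuming a POSIX shell and awk. Both input file formats and the name find_orphans are hypothetical; a real checker reads the inodes and directories from the disk itself.

```sh
# Toy consistency check: given a list of allocated inode numbers and a
# list of "inode name" directory entries, print the allocated inodes
# that no directory entry refers to.
# find_orphans and both file formats are hypothetical.
find_orphans() {
    allocated=$1      # one inode number per line
    dir_entries=$2    # lines of the form "inode name"
    awk 'FILENAME == ARGV[1] { named[$1] = 1; next }
         !($1 in named)      { print $1 }' "$dir_entries" "$allocated"
}
```

An inode printed by find_orphans corresponds to the file in the example above: its data is on the disk, but no name leads to it.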
RECOVERING FROM A CRASH
First, log the crash in the system logbook. Include any error messages and
any other significant items that will help determine the cause of the crash --
thus minimizing the impact of future crashes of a similar nature. For
example, if the system crashed with the error messages:
panic: IO err in swap
displayed several times, one would suspect that either the disk used for
swapping or its related controller, device driver or the like was having
problems. Similarly, the message:
panic: parity
appearing more often than, say, once a month probably indicates memory
hardware problems. In most implementations, the message will also tell which
section of memory the parity error has occurred in. After several such
panics, a field service engineer may be able to see a pattern and determine
which section of memory should be replaced.
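If the logbook is also kept as a text file, spotting such patterns can be partly automated. This is a minimal sketch, assuming each panic message was copied onto its own line, possibly preceded by a date; count_panics and that logbook format are assumptions, not a standard facility.

```sh
# Count how often each panic message appears in a plain-text logbook,
# most frequent first. count_panics and the logbook format (one message
# per line, optionally preceded by a date) are assumptions.
count_panics() {
    # Keep only the text from "panic:" to the end of each matching line
    sed -n 's/.*\(panic: .*\)$/\1/p' "$1" | sort | uniq -c | sort -rn
}
```

A message that climbs to the top of this list month after month is the one to show the field service engineer.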
After logging a crash, reboot the system by pressing the reset button or
performing whatever routine you normally use to start up your system. It
should come up in single user mode. The very first thing to do at this point
is to run fsck (the file system consistency checker).
It will check each of your file systems to make sure they are not corrupt.
This is usually done with the command:
# /etc/fsck
Some systems are configured to start up fsck automatically.
The fsck command will read the file
/etc/checklist to get a list of file systems to check.
The /etc/checklist file is a text file that may be edited
with vi. It names the disk device that holds
each file system, one name per line. The first line should have the name of
the root file system. The name on every other line should be
the same name used in a mount command, except that there
should be a small letter "r" after /dev/, which selects the raw (character)
device for that disk.
For example, if your root file system is on /dev/md0a and you
issue the mount commands:
# /etc/mount /dev/md0c /usr
# /etc/mount /dev/md1a /mnt
# /etc/mount /dev/md1c /image
# /etc/mount /dev/md1b /tmp
then your /etc/checklist file should look like:
/dev/md0a
/dev/rmd0c
/dev/rmd1a
/dev/rmd1c
/dev/rmd1b
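The naming rule can be applied mechanically. The sketch below assumes a POSIX shell and sed; make_checklist is a hypothetical helper that only prints the list, leaving it to you to review the output and install it as /etc/checklist.

```sh
# Print checklist-style output: the root device unchanged on the first
# line, then the raw ("r") name of every other device.
# make_checklist is a hypothetical helper; it writes nothing to /etc.
make_checklist() {
    root_dev=$1
    shift
    echo "$root_dev"
    for dev in "$@"; do
        # /dev/md0c becomes /dev/rmd0c
        echo "$dev" | sed 's|^/dev/|/dev/r|'
    done
}
```

With the devices from the example, `make_checklist /dev/md0a /dev/md0c /dev/md1a /dev/md1c /dev/md1b` prints the five lines shown above.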
If /etc/checklist is not configured, list the file systems to
be checked on the command line, like so:
# /etc/fsck /dev/md0a /dev/rmd0c /dev/rmd1[abc]
The fsck command will then read through each file system
(from disk) and check for inconsistencies, such as blocks that are both on
the free list and in a file or files that don't appear in any directory. Each
time fsck finds something wrong, it will indicate what the
problem is and ask whether it should be fixed. Almost always you will want to
type the letter "y" (for yes) followed by RETURN. One case
where you might want to type "n" (for no) is when
fsck asks you whether it should delete a file and gives only
its inode number rather than its name. You will want to find out the file's
name before it is removed so you can recover it from backup tape. To find out
the name of inode number 387 on md0c, give the commands:
# /etc/mount /dev/md0c /usr
# find /usr -inum 387 -print
# /etc/umount /dev/md0c
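The find step can be wrapped up for repeated use. This sketch assumes a find that supports -inum, as above, and -xdev; name_of_inode is a hypothetical helper name. The file system must already be mounted on the directory you pass in.

```sh
# Print the path names under a mounted directory that refer to a given
# inode number. -xdev keeps find on that one file system, since inode
# numbers are unique only within a single file system.
# name_of_inode is a hypothetical helper name.
name_of_inode() {
    find "$1" -xdev -inum "$2" -print
}
```

For the example above, the call would be `name_of_inode /usr 387`. A file with several links prints once per name.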
Another time to say no is when fsck asks for permission to
remove /unix or another equally important system file.
Recovering from this or other more complex problems is beyond the scope of
this article.
Bob Toxen is a member of the technical staff at Silicon Graphics, Inc.
He has gained a reputation as a leading uucp expert and is
responsible for ports of System V for the Zilog 8000 and System III for the
Motorola 68000.
Copyright © 1984, 2007, 2014, 2020 Robert M. Toxen. All rights reserved.