I have had a few questions about some of my previous blog
entries, so I thought I would write up some general responses.
First, here are the bugids for some of the bugs I mentioned
in my previous posts:
6215065 Booting off single disk from mirrored root pair causes panic reset
This is the UFS logging bug that you can hit when you don’t have
metadb quorum.
6236382 boot should enter single-user when there is no mddb quorum
This one is pretty self-explanatory. The system will boot all
the way to multi-user, but root remains read-only. To recover
from this, delete the metadb on the dead disk and reboot so
that you have metadb quorum.
6250010 Cannot boot root mirror
This is the bug where the V20z and V40z don’t use the altbootpath
to fail over when the primary boot disk is dead.
There was a question about what to do if you get into the infinite
panic-reboot cycle. This would happen if you were hitting bug 6215065.
To recover from this, you need to boot off of some other media
so that you can clean up. You could boot off of a Solaris netinstall image
or the Solaris install CD-ROM, for example. Once you do that,
you can mount the disk that is still ok and change the /etc/vfstab
entry so that the root filesystem is mounted without logging. Since
logging was originally in use, you want to make sure the log is rolled
and that UFS stops using the log. Here are some example commands
to do this:
# mount -o nologging /dev/dsk/c1t0d0s0 /a
# [edit /a/etc/vfstab; change the last field for / to “nologging”]
# umount /a
# reboot
By mounting this way UFS should roll the existing log and then mark
the filesystem so that logging won’t be used for the current mount on /a.
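If you would rather not hand-edit the file from the install image, the vfstab change can be scripted. This is only a sketch: set_root_nologging is a made-up helper name, it assumes the standard seven-field vfstab layout where the mount point is the third field and the mount options are the seventh, and it assumes an awk that accepts this syntax (on Solaris you may need nawk). It prints the modified file instead of editing in place so you can inspect the result first.

```shell
# Hypothetical helper: print a copy of a vfstab-format file with the
# mount options (7th field) of the root entry set to "nologging".
set_root_nologging() {
    awk '
        BEGIN { OFS = "\t" }
        # Skip comment lines; match the entry whose mount point (3rd field) is /
        $1 !~ /^#/ && $3 == "/" { $7 = "nologging" }
        { print }
    ' "$1"
}

# Assumed usage, with the surviving disk mounted on /a as shown above:
#   set_root_nologging /a/etc/vfstab > /tmp/vfstab.new
#   cp /tmp/vfstab.new /a/etc/vfstab
```

Only the root line is rewritten (with tab separators); every other line, including comments, is printed unchanged.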
This kind of recovery procedure, where you boot off of the install image,
should only be used when one side of the root mirror is already dead.
If you were to use this approach for other kinds of recovery, you
would leave the mirror in an inconsistent state, since only one side
was actually modified.
Here is a general procedure for accessing the root mirror when
you boot off of the install image. An example where you might use
this procedure is if you forgot the root password and wanted to
mount the mirror so you could clear the field in /etc/shadow.
First, you need to boot off the install image and get a copy of
the md.conf file from the root filesystem.
Mount one of the underlying root mirror disks read-only to get a copy.
# mount -o ro /dev/dsk/c0t0d0s0 /a
# cp /a/kernel/drv/md.conf /kernel/drv/md.conf
# umount /a
Now update the SVM driver to load the configuration.
# update_drv -f md
[ignore any warning messages printed by update_drv]
# metainit -r
If you have mirrors you should run metasync
to get them synced.
Your SVM metadevices should now be accessible and you should
be able to mount them and perform whatever recovery you need.
# mount /dev/md/dsk/d10 /a
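The steps above can be collected into one small script. This is a sketch, not a supported tool: the names run and mount_root_mirror are made up for illustration, the disk slice and mirror name are whatever applies to your system (c0t0d0s0 and d10 are just the examples used above), and running metasync on the named mirror is my reading of the "run metasync" step. By default it only prints the commands; set DRY_RUN=0 to actually execute them from the install image.

```shell
# Hypothetical wrapper: print each command, or run it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: $1 is an underlying root slice, $2 is the mirror name.
mount_root_mirror() {
    disk=$1
    mirror=$2
    run mount -o ro "/dev/dsk/$disk" /a          # borrow md.conf from the real root
    run cp /a/kernel/drv/md.conf /kernel/drv/md.conf
    run umount /a
    run update_drv -f md                         # warnings here can be ignored
    run metainit -r
    run metasync "$mirror"                       # resync the mirror
    run mount "/dev/md/dsk/$mirror" /a
}

# Dry run with the example names from this post:
mount_root_mirror c0t0d0s0 d10
```

Printing the commands first is a cheap way to double check the device names before anything touches the disks.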
We are also getting this simple procedure into our docs so that
it is easier to find.
Finally, there was another comment about using a USB memory disk
to hold a copy of the metadb so that quorum would be maintained even
if one of the disks in the root mirror died.
This is something that has come up internally in the past, but nobody on our
team had actually tried it, so I went out this weekend and bought a
USB memory disk to see if I could make it work.
It turns out this worked fine for me, but there are a lot of variables
so your mileage may vary.
Here is what worked for me. I got a 128MB memory disk since this
was the cheapest one they sold and you don’t need much space for a
copy of the metadb (only about 5MB is required for one copy).
First I used the “rmformat -l” command to figure out the name of the disk.
The disk came formatted with a pcfs filesystem already on it and
a single fdisk partition of type DOS-BIG. I used the fdisk(1M)
command to delete that partition and create a single Solaris fdisk
partition for the whole memory disk.
After that I just put a metadb on the disk.
# metadb -a /dev/dsk/c6t0d0s2
Once I did all of this, I could reboot with one of the disks removed
from my root mirror and I still had metadb quorum since I had a 3rd
copy of the metadb available.
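To see why the third copy matters, here is the quorum arithmetic as a small sketch. It assumes the documented SVM rule that the system needs a strict majority (more than half) of the metadb replicas to boot to multi-user, and it assumes one replica per device as in my setup. The function names are made up for illustration, and the arithmetic syntax needs a POSIX-style shell such as ksh.

```shell
# Hypothetical helper: succeeds when strictly more than half of the
# replicas are available. $1 = total replicas, $2 = available replicas.
has_boot_quorum() {
    total=$1
    available=$2
    [ $((available * 2)) -gt "$total" ]
}

# Hypothetical helper: print a one-line verdict for a configuration.
quorum_report() {
    if has_boot_quorum "$1" "$2"; then
        echo "$2 of $1 replicas: quorum"
    else
        echo "$2 of $1 replicas: no quorum"
    fi
}

quorum_report 2 1   # two-disk mirror, one disk dead
quorum_report 3 2   # two disks plus the USB replica, one disk dead
```

With only two replicas, losing one disk leaves exactly half, which is not a majority. With the USB replica added, losing one disk still leaves two of three.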
There are several caveats here. First, I was running this on
a current nightly build of Solaris. I haven’t tried it yet on the
shipping Solaris 10 bits but I think this will probably work. Going
back to the earlier S9 bits I would be less certain since a lot of
work went into the USB code for S10. The main thing here is that
the system has to see the USB disk at the time that the SVM driver is reading
the metadbs. This happens fairly early in the boot sequence. If
we don’t see the USB disk, then that metadb replica will be marked
in error and it won’t help maintain quorum.
The second thing to watch out for is that SVM keeps track of mirror
resync regions in some of the copies of the metadbs. This is used
for the optimized resync feature that SVM supports. Currently there
is no good way to see which metadbs SVM is using for this and there
is no way to control which metadbs will be used. You wouldn’t want
these writes going to the USB memory disk since it will probably
wear out faster and might be slower too. We need to improve this
in order for the USB memory disk to really be a production quality
solution.
Another issue to watch out for is whether your hardware supports the
memory disk. I tried this on an older x86 box and it couldn’t see
the USB disk. I am pretty sure the older system did not support USB 2.0,
which is what the disk I bought uses. This worked fine when I tried it
on a V65 and on a Dell 650, both of which are newer systems.
I need to do more work in this area before we could really recommend
this approach, but it might be a useful solution today for people
who need to get around the metadb quorum problem on a two disk
configuration. You would need to test thoroughly to make sure
the configuration works in your environment.
We are also working internally on some other solutions to this same problem.
Hopefully I can write more about that soon.