HP-UX_11.31_Boot_Mirrored_SAS_Disk_Replacement

3. HP-UX 11.31

3.2 Boot Disks

3.2.1 Mirrored

3.2.1.1 on SAS Controllers

Overview

SAS controllers use a different addressing scheme than parallel SCSI such as U320 or SCSI-2.

Where parallel SCSI relied on the SCSI ID, and therefore on the disk slot number, SAS uses the unique address of the disk itself to identify it as part of a LUN (logical unit) or as an HP-UX special device file such as /dev/dsk/c0t0d0. You can take a disk out of one slot (also called a bay) and put it in a different slot or bay. The controller finds it and presents the disk to the O/S with the same special device file. This complicates the procedure for replacing a failed disk.

When a disk fails and a new disk is put into the same bay that the failed disk came out of, the SAS controller recognizes it as a different disk by its SAS address. The O/S driver assigns the next available target for the hardware path (visible with ioscan), and a new special device file is created when insf -e is executed.

If the customer used a legacy device file (e.g. /dev/dsk/c2t2d0) in the LVM configuration, you have to make sure that the new disk gets the old legacy device file again.

This can be done with sasmgr(1M) and its "replace_tgt" option. Although this returns an error saying that the device file is still busy, a subsequent "io_redirect_dsf" on the corresponding persistent device file completes the move and restores the old legacy/persistent name pair.

If the customer used a persistent device file (e.g. /dev/disk/disk5) in the LVM configuration, you have to make sure that the new disk uses the same persistent device file again. If you don't care about the newly created legacy DSF and only use the persistent one, io_redirect_dsf(1M) alone is sufficient.

Both methods, sasmgr replace_tgt and io_redirect_dsf, are very simple as long as no I/Os are pending and no driver has the special device file open for reading or writing. This is rarely the case, however: LVM keeps trying to access the special device file while it waits for the failed disk to return.

You must therefore first stop all access to the special device file by executing "pvchange -a N" to deactivate that physical volume.

Once the new disk is inserted, create the EFI partitions and vgcfgrestore the LVM information.

As an alternative, you can unmirror the volume group and vgreduce the bad disk from the volume group. Following is the procedure using pvchange and vgcfgrestore to replace a mirrored boot disk in vg00.

1. Check which disk has failed and which device files are in use

The SAS controller still sees the disk and has no RAID volumes configured. We have a JBOD disk that, when replaced with a new one, will certainly get new legacy and persistent device files.

                       Failed disk                      New disk
Type                   SAS-JBOD                         SAS-JBOD
HW Path                0/2/1/0.0.0.1.0                  T.B.D.
LunPath                0/2/1/0.0x5000c5000820ee41.0x0   T.B.D.
Legacy Devicefile      /dev/dsk/c0t1d0                  T.B.D.
Persistent Devicefile  /dev/disk/disk3                  T.B.D.
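
Before the disk is pulled, LVM access to it has to be stopped, as described in the overview. A minimal sketch of that action follows. It assumes that vg00 sits on partition 2 of the failed disk (/dev/disk/disk3_p2, inferred from the device files used in this example), and detach_cmd is a hypothetical helper that only prints the command so it can be reviewed before it is typed into a root shell.

```shell
# Hypothetical helper: prints the detach command, runs nothing.
# Assumption: the LVM physical volume is partition 2 of the failed disk.
detach_cmd() {
    pv=${1:-/dev/disk/disk3_p2}
    echo "pvchange -a N $pv"
}

detach_cmd
```

Detaching with -a N stops LVM from retrying I/O on the failed disk, which is what later allows the device file to be redirected.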

3. Turn on the locator LED of disk bay 2 to ensure the correct disk is removed.
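
A sketch of how the locator LED might be switched, assuming the set_attr attribute names lun and locate_led as recalled from sasmgr(1M); verify them against the man page on the target system. The controller device /dev/sasd0 and the legacy DSFs are the ones used in this example, and led_cmd is a hypothetical helper that only prints the command.

```shell
# Hypothetical helper: prints the sasmgr call for a bay's locator LED.
# Assumption: the disk is named by its raw legacy DSF on controller /dev/sasd0.
led_cmd() {
    state=$1    # on | off
    lun=$2      # raw legacy DSF, e.g. /dev/rdsk/c0t1d0
    echo "sasmgr set_attr -D /dev/sasd0 -q lun=$lun -q locate_led=$state"
}

led_cmd on  /dev/rdsk/c0t1d0    # before pulling the failed disk
led_cmd off /dev/rdsk/c0t2d0    # after the swap, using the new legacy DSF
```

The second call corresponds to step 6, where the failed disk's DSF no longer points at real hardware and the new legacy DSF has to be used.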

4. Replace the Disk

At this point the disk in bay 2 is pulled out of the server and a new disk is inserted in the same bay. The server should not be rebooted or taken down between the time the disk fails and the time the new disk is inserted.

In the listings that follow, the failed disk appears in state NO_HW, its mirror is still operating (CLAIMED and untouched), and the new replacement disk has been assigned the new target ID "2" by the SAS controller.

5. Check for newly created device files

In this case the customer uses persistent device files, so we don't care about the new legacy device file, but instead check which new persistent device file the system has created for the disk:

# ioscan -fnH 0/2

Class I H/W Path Driver S/W State H/W Type Description

==============================================================================

ba 2 0/2 lba CLAIMED BUS_NEXUS Local PCI-X Bus Adapter (122e)

escsi_ctlr 0 0/2/1/0 sasd CLAIMED INTERFACE HP PCI/PCI-X SAS MPT Adapter

/dev/sasd0

ext_bus 0 0/2/1/0.0.0 sasd_vbus CLAIMED INTERFACE SAS Device Interface

target 1 0/2/1/0.0.0.0 tgt CLAIMED DEVICE

disk 1 0/2/1/0.0.0.0.0 sdisk CLAIMED DEVICE HP DG 146ABAB4

/dev/dsk/c0t0d0 /dev/rdsk/c0t0d0

/dev/dsk/c0t0d0s1 /dev/rdsk/c0t0d0s1

/dev/dsk/c0t0d0s2 /dev/rdsk/c0t0d0s2

/dev/dsk/c0t0d0s3 /dev/rdsk/c0t0d0s3

target 0 0/2/1/0.0.0.1 tgt NO_HW DEVICE

disk 0 0/2/1/0.0.0.1.0 sdisk NO_HW DEVICE HP DG 146ABAB4

/dev/dsk/c0t1d0 /dev/rdsk/c0t1d0

/dev/dsk/c0t1d0s1 /dev/rdsk/c0t1d0s1

/dev/dsk/c0t1d0s2 /dev/rdsk/c0t1d0s2

/dev/dsk/c0t1d0s3 /dev/rdsk/c0t1d0s3

target 1 0/2/1/0.0.2.0 tgt CLAIMED DEVICE

disk 1 0/2/1/0.0.2.0.0 sdisk CLAIMED DEVICE HP DG 146ABAB4

/dev/dsk/c0t2d0 /dev/rdsk/c0t2d0

# sasmgr get_info -D /dev/sasd0 -q raid

Thu Dec 14 14:59:28 2006

---------- PHYSICAL DRIVES ----------

LUN dsf SAS Address Enclosure Bay Size(MB)

/dev/rdsk/c0t0d0 0x5000c50008210fa5 1 1 140014

/dev/rdsk/c0t2d0 0x5000cca000101799 1 2 140014

# ioscan -m dsf

Persistent DSF Legacy DSF(s)

========================================

/dev/rdisk/disk2 /dev/rdsk/c0t0d0

/dev/rdisk/disk2_p1 /dev/rdsk/c0t0d0s1

/dev/rdisk/disk2_p2 /dev/rdsk/c0t0d0s2

/dev/rdisk/disk2_p3 /dev/rdsk/c0t0d0s3

/dev/rdisk/disk3 /dev/rdsk/c0t1d0

/dev/rdisk/disk3_p1 /dev/rdsk/c0t1d0s1

/dev/rdisk/disk3_p2 /dev/rdsk/c0t1d0s2

/dev/rdisk/disk3_p3 /dev/rdsk/c0t1d0s3

/dev/rdisk/disk5 /dev/rdsk/c0t2d0 <- new disk !

# ioscan -m lun

Class I Lun H/W Path Driver S/W State H/W Type Health Description

===============================================================================

disk 2 64000/0xfa00/0x0 esdisk CLAIMED DEVICE online HP

0/2/1/0.0x5000c50008210fa5.0x0

/dev/disk/disk2 /dev/rdisk/disk2

/dev/disk/disk2_p1 /dev/rdisk/disk2_p1

/dev/disk/disk2_p2 /dev/rdisk/disk2_p2

/dev/disk/disk2_p3 /dev/rdisk/disk2_p3

Collect your data now:

                       Failed disk                      New disk
Type                   SAS-JBOD                         SAS-JBOD
HW Path                0/2/1/0.0.0.1.0                  0/2/1/0.0.0.2.0
LunPath                0/2/1/0.0x5000c5000820ee41.0x0   0/2/1/0.0x5000cca000101799.0x0
Legacy Devicefile      /dev/dsk/c0t1d0                  /dev/dsk/c0t2d0
Persistent Devicefile  /dev/disk/disk3                  /dev/disk/disk5

6. Turn off the locator LED (now using the new legacy device file)

7. Restore the IA64 Partitioning Scheme of the new boot disk

Note: The tools that retain the old device files after a disk replacement only allow replacing a disk with one that has the identical number of device files. You therefore have to make sure now that the new disk has the same partitioning scheme as the failed one.

If you try to move disk5 to the old name disk3 at this point, you get an error message:

Example:

# io_redirect_dsf -d /dev/rdisk/disk3 -n /dev/rdisk/disk5

Number of old DSFs=8.

Number of new DSFs=2.

The number of old and new DSFs must be the same.
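
The counts in this message match the device files listed earlier: disk3 has block and raw DSFs for the whole disk and for three partitions (8 in total), while the unpartitioned disk5 only has its two whole-disk DSFs. The following sketch counts them in the same way before the redirect is attempted. The function count_dsfs and its optional second argument (an alternative root directory instead of /dev) are hypothetical and exist only for illustration.

```shell
# count_dsfs NAME [DEVROOT]: count block and raw persistent DSFs of a disk,
# including its partition DSFs (NAME_p1, NAME_p2, ...). DEVROOT defaults to /dev.
count_dsfs() {
    name=$1
    root=${2:-/dev}
    n=0
    for f in "$root"/disk/"$name" "$root"/disk/"$name"_p* \
             "$root"/rdisk/"$name" "$root"/rdisk/"$name"_p*; do
        # An unmatched pattern stays literal and fails the existence test.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

old=$(count_dsfs disk3)
new=$(count_dsfs disk5)
if [ "$old" -ne "$new" ]; then
    echo "disk3 has $old DSFs, disk5 has $new: partition disk5 first"
fi
```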

Be aware that at this point you are still working with the newly created device files.

- Create a partition description file:
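
A sketch of a partition description file and the matching idisk(1M) call. The file name /tmp/idf and the partition sizes are assumptions (a common layout for HP-UX Integrity boot disks, with the 500 MB EFI partition mentioned later in this procedure); the sizes should be taken from the surviving mirror disk. The idisk command is only printed here because it rewrites the partition table of the disk it is given.

```shell
# Assumed path of the description file; override with IDF=... if needed.
IDF=${IDF:-/tmp/idf}

# First line: number of partitions, then one line per partition.
cat > "$IDF" <<EOF
3
EFI 500MB
HPUX 100%
HPSP 400MB
EOF

# Printed for review only. The target is the NEW disk's raw persistent DSF.
echo "idisk -wf $IDF /dev/rdisk/disk5"
```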

- Create the new device files for the new partitions (disk5_p1, _p2, _p3)

# insf -e -C disk

# ioscan -m lun

Class I Lun H/W Path Driver S/W State H/W Type Health Description

======================================================================

disk 2 64000/0xfa00/0x0 esdisk CLAIMED DEVICE online HP

0/2/1/0.0x5000c50008210fa5.0x0

/dev/disk/disk2 /dev/rdisk/disk2

/dev/disk/disk2_p1 /dev/rdisk/disk2_p1

/dev/disk/disk2_p2 /dev/rdisk/disk2_p2

/dev/disk/disk2_p3 /dev/rdisk/disk2_p3

disk 3 64000/0xfa00/0x1 esdisk NO_HW DEVICE online HP

0/2/1/0.0x5000c5000820ee41.0x0

/dev/disk/disk3 /dev/rdisk/disk3

/dev/disk/disk3_p1 /dev/rdisk/disk3_p1

/dev/disk/disk3_p2 /dev/rdisk/disk3_p2

/dev/disk/disk3_p3 /dev/rdisk/disk3_p3

disk 5 64000/0xfa00/0x2 esdisk CLAIMED DEVICE online HP

0/2/1/0.0x5000cca000101799.0x0

/dev/disk/disk5 /dev/rdisk/disk5

/dev/disk/disk5_p1 /dev/rdisk/disk5_p1

/dev/disk/disk5_p2 /dev/rdisk/disk5_p2

/dev/disk/disk5_p3 /dev/rdisk/disk5_p3

# ioscan -fnH 0/2

Class I H/W Path Driver S/W State H/W Type Description

==============================================================================

ba 2 0/2 lba CLAIMED BUS_NEXUS Local PCI-X

escsi_ctlr 0 0/2/1/0 sasd CLAIMED INTERFACE HP PCI/PCI-X SAS MPT Adapter

/dev/sasd0

ext_bus 0 0/2/1/0.0.0 sasd_vbus CLAIMED INTERFACE SAS Device Interface

target 1 0/2/1/0.0.0.0 tgt CLAIMED DEVICE

disk 1 0/2/1/0.0.0.0.0 sdisk CLAIMED DEVICE HP DG 146ABAB4

/dev/dsk/c0t0d0 /dev/rdsk/c0t0d0

/dev/dsk/c0t0d0s1 /dev/rdsk/c0t0d0s1

/dev/dsk/c0t0d0s2 /dev/rdsk/c0t0d0s2

/dev/dsk/c0t0d0s3 /dev/rdsk/c0t0d0s3

target 0 0/2/1/0.0.0.1 tgt NO_HW DEVICE

disk 0 0/2/1/0.0.0.1.0 sdisk NO_HW DEVICE HP DG 146ABAB4

/dev/dsk/c0t1d0 /dev/rdsk/c0t1d0

/dev/dsk/c0t1d0s1 /dev/rdsk/c0t1d0s1

/dev/dsk/c0t1d0s2 /dev/rdsk/c0t1d0s2

/dev/dsk/c0t1d0s3 /dev/rdsk/c0t1d0s3

target 1 0/2/1/0.0.2.0 tgt CLAIMED DEVICE

disk 1 0/2/1/0.0.2.0.0 sdisk CLAIMED DEVICE HP DG 146ABAB4

/dev/dsk/c0t2d0 /dev/rdsk/c0t2d0

/dev/dsk/c0t2d0s1 /dev/rdsk/c0t2d0s1

/dev/dsk/c0t2d0s2 /dev/rdsk/c0t2d0s2

/dev/dsk/c0t2d0s3 /dev/rdsk/c0t2d0s3

# ioscan -m dsf

Persistent DSF Legacy DSF(s)

========================================

/dev/rdisk/disk2 /dev/rdsk/c0t0d0

/dev/rdisk/disk2_p1 /dev/rdsk/c0t0d0s1

/dev/rdisk/disk2_p2 /dev/rdsk/c0t0d0s2

/dev/rdisk/disk2_p3 /dev/rdsk/c0t0d0s3

/dev/rdisk/disk3 /dev/rdsk/c0t1d0 <- failed disk !

/dev/rdisk/disk3_p1 /dev/rdsk/c0t1d0s1

/dev/rdisk/disk3_p2 /dev/rdsk/c0t1d0s2

/dev/rdisk/disk3_p3 /dev/rdsk/c0t1d0s3

/dev/rdisk/disk5 /dev/rdsk/c0t2d0 <- new disk !

/dev/rdisk/disk5_p1 /dev/rdsk/c0t2d0s1

/dev/rdisk/disk5_p2 /dev/rdsk/c0t2d0s2

/dev/rdisk/disk5_p3 /dev/rdsk/c0t2d0s3

8. Redirect the IO from the new Device to the old Devicefile

- for the legacy devicefile:
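
A sketch of the legacy redirection, assuming the option names old_dev and new_tgt_hwpath as recalled from sasmgr(1M); check the man page on the target system before relying on them. The values are the ones collected in step 5, and replace_tgt_cmd is a hypothetical helper that only prints the command line.

```shell
# Hypothetical helper: prints the sasmgr replace_tgt call, runs nothing.
replace_tgt_cmd() {
    old_dev=$1       # legacy DSF of the failed disk
    new_hwpath=$2    # hardware path of the replacement disk
    echo "sasmgr replace_tgt -D /dev/sasd0 -q old_dev=$old_dev -q new_tgt_hwpath=$new_hwpath"
}

replace_tgt_cmd /dev/dsk/c0t1d0 0/2/1/0.0.0.2.0
```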

sasmgr replace_tgt ends with an error at this point. Because there is still an active mapping from "c0t2d0" to the persistent DSF "disk5", the system complains that only one half of the pair was renamed. It has, however, already deleted the new legacy DSFs.

Now redirect the persistent device file as well:

- for the persistent device file:
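
A sketch of the persistent redirection, using the same io_redirect_dsf syntax as the failed attempt shown in step 7. Now that disk5 carries the same partitions as disk3, the DSF counts match. redirect_cmd is a hypothetical helper that only prints the command.

```shell
# Hypothetical helper: prints the io_redirect_dsf call, runs nothing.
redirect_cmd() {
    old_dsf=$1    # DSF name to keep (failed disk)
    new_dsf=$2    # DSF currently bound to the replacement disk
    echo "io_redirect_dsf -d $old_dsf -n $new_dsf"
}

redirect_cmd /dev/rdisk/disk3 /dev/rdisk/disk5
```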

- Verify it:

# ioscan -m lun

Class I Lun H/W Path Driver S/W State H/W Type Health Description

======================================================================

disk 2 64000/0xfa00/0x0 esdisk CLAIMED DEVICE online HP

0/2/1/0.0x5000c50008210fa5.0x0

/dev/disk/disk2 /dev/rdisk/disk2

/dev/disk/disk2_p1 /dev/rdisk/disk2_p1

/dev/disk/disk2_p2 /dev/rdisk/disk2_p2

/dev/disk/disk2_p3 /dev/rdisk/disk2_p3

disk 5 64000/0xfa00/0x2 esdisk CLAIMED DEVICE online HP

0/2/1/0.0x5000cca000101799.0x0

/dev/disk/disk3 /dev/rdisk/disk3

/dev/disk/disk3_p1 /dev/rdisk/disk3_p1

/dev/disk/disk3_p2 /dev/rdisk/disk3_p2

/dev/disk/disk3_p3 /dev/rdisk/disk3_p3

# ioscan -fnH 0/2

Class I H/W Path Driver S/W State H/W Type Description

==============================================================================

ba 2 0/2 lba CLAIMED BUS_NEXUS Local PCI-X

escsi_ctlr 0 0/2/1/0 sasd CLAIMED INTERFACE HP PCI/PCI-X SAS MPT Adapter

/dev/sasd0

ext_bus 0 0/2/1/0.0.0 sasd_vbus CLAIMED INTERFACE SAS Device Interface

target 1 0/2/1/0.0.0.0 tgt CLAIMED DEVICE

disk 1 0/2/1/0.0.0.0.0 sdisk CLAIMED DEVICE HP DG146ABAB4

/dev/dsk/c0t0d0 /dev/rdsk/c0t0d0

/dev/dsk/c0t0d0s1 /dev/rdsk/c0t0d0s1

/dev/dsk/c0t0d0s2 /dev/rdsk/c0t0d0s2

/dev/dsk/c0t0d0s3 /dev/rdsk/c0t0d0s3

target 0 0/2/1/0.0.0.1 tgt CLAIMED DEVICE

Bingo! The old DSFs "c0t1d0" and "disk3" are operational again.

9. Initialize the EFI FAT Partition and fill boot areas:

If the new disk does not contain a valid EFI filesystem, efi_fsinit(1M) is run automatically by the subsequent mkboot(1M) command. But if you use, for example, an old HP-UX 11.22 boot disk as the mirror disk, mkboot does not run efi_fsinit automatically. As a result, only 100 MB of the 500 MB EFI partition (s1) can be used.

- Use mkboot(1M) to format the EFI partition (s1) and populate it with the EFI files below /usr/lib/efi/, and to format the LIF volume (part of s2) and populate it with the LIF files (ISL, ...).

NOTE: Specify the -lq option if you prefer that your system boots without interruption in case of a disk failure.
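
A sketch of the mkboot(1M) calls for this step, assuming the replacement disk already answers to its old name /dev/rdisk/disk3 after the redirect in step 8. The helper boot_area_cmds is hypothetical and only prints the commands. The second command writes an AUTO file whose -lq option lets the system boot without LVM quorum, which is what the NOTE above refers to.

```shell
# Hypothetical helper: prints the mkboot calls, runs nothing.
boot_area_cmds() {
    disk=${1:-/dev/rdisk/disk3}
    cat <<EOF
mkboot -e -l $disk
mkboot -a "boot vmunix -lq" $disk
EOF
}

boot_area_cmds
```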

10. Restore LVM Configuration

Now that the new disk is partitioned and equipped with boot headers, you can restore the LVM data to the OS partition "p2".

- Restore LVM access to the disk.

Reattach the disk by reactivating the volume group as follows:

NOTE: The vgchange command with the -a y option can be run on a volume group that is deactivated or already activated. It attaches all paths for all disks in the volume group and automatically resumes recovering any disks in the volume group that had been offline or that were replaced. Therefore, run vgchange only after all work has been completed on all disks and paths in the volume group, and it is necessary to attach them all.
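
A sketch of the two LVM commands this step describes, assuming volume group /dev/vg00 and the raw DSF of the restored HP-UX partition, /dev/rdisk/disk3_p2. The helper lvm_restore_cmds is hypothetical and only prints the commands.

```shell
# Hypothetical helper: prints the restore and reattach commands, runs nothing.
lvm_restore_cmds() {
    vg=${1:-/dev/vg00}
    pv=${2:-/dev/rdisk/disk3_p2}
    echo "vgcfgrestore -n $vg $pv"    # write the saved LVM headers to the new disk
    echo "vgchange -a y $vg"          # reattach all paths and resume recovery
}

lvm_restore_cmds
```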

Initialize/check boot information on the disk.

- Check if content of LABEL file (i.e. root, boot, swap and dump device definition) has been

- Primary Path

- HA Alternate Path

- Alternate Path
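
A sketch of commands that could be used for these checks, assuming volume group /dev/vg00. The helper boot_check_cmds is hypothetical and only prints the commands. Which disk belongs on which boot path is site-specific, so no path is set here.

```shell
# Hypothetical helper: prints the check commands, runs nothing.
boot_check_cmds() {
    vg=${1:-/dev/vg00}
    echo "lvlnboot -R $vg"    # refresh boot, root, swap and dump information
    echo "lvlnboot -v $vg"    # display it for checking
    echo "setboot"            # display the current boot paths
}

boot_check_cmds
```

If a path has to be changed, setboot takes -p for the primary path, -h for the HA alternate path and -a for the alternate path.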

_____________________________________________________________________

Sources:

HP-UX System Administrator's Guide: Logical Volume Management, HP-UX 11i Version 3

When Good Disks Go Bad: Dealing with Disk Failures under LVM

SAS Physical Disk Replacement Procedure With LVM Mirroring, by Jay Duffield

Software Recovery Handbook: Itanium Architecture: How to mirror the Boot Disk (ECU copy)

HP 8 Internal SAS Controller Support Guide

adapted for HW Recovery Workshop 2008 by Roland Luechtenberg
